Surfly (surfly.com)
Incident Report for Surfly Public Cloud
Postmortem

Event details: On 27/05/2021, users experienced degraded service on app.surfly.com.

Between 09:59 CEST and 13:02 CEST on 27/05/2021, users in the EU region experienced unexpected session disconnects, and users globally were intermittently unable to start sessions for short periods. app.surfly.com was also slow during these periods.

Root Cause Analysis: With the start of business hours in the European region overlapping with ongoing business hours in the Asia region, usage of app.surfly.com increased rapidly. The load was higher than expected.

  • When the first alerts were received (09:59 CEST), the SSL termination endpoint was unable to carry the load, causing the load balancer to fail over to the backup server.
  • The failover recurred repeatedly, disconnecting some active sessions for users in the European region.
  • We started investigating and took various measures, but the issue kept recurring.
  • We discovered that the issue was caused by an internal limit on the maximum number of HTTPS connections allowed to the servers.

Resolution: 

  • We have increased the connection limit of our HAProxy server to a value better suited to the size of the server, which greatly increases the load it can carry.
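
A connection limit of this kind can be watched before it is reached. The following is a minimal sketch, not Surfly's actual tooling: it assumes the text output of HAProxy's "show info" runtime command (which reports Maxconn and CurrConns fields) has already been collected, and computes how close the proxy is to its limit. The sample values and the 0.8 alert threshold are made up for illustration.

```python
# Hypothetical sample of HAProxy "show info" output; the numbers are
# illustrative only and are not taken from the incident.
SAMPLE_INFO = """Name: HAProxy
Maxconn: 2000
CurrConns: 1900
"""


def parse_show_info(text: str) -> dict:
    """Parse 'Key: value' lines into a dictionary of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def connection_utilization(info_text: str) -> float:
    """Return current connections as a fraction of the configured limit."""
    fields = parse_show_info(info_text)
    max_conn = int(fields["Maxconn"])
    curr_conns = int(fields["CurrConns"])
    if max_conn <= 0:
        raise ValueError("Maxconn must be positive")
    return curr_conns / max_conn


def near_limit(info_text: str, threshold: float = 0.8) -> bool:
    """True when utilization meets or exceeds the (assumed) alert threshold."""
    return connection_utilization(info_text) >= threshold


if __name__ == "__main__":
    print(f"utilization: {connection_utilization(SAMPLE_INFO):.2f}")
    print(f"near limit: {near_limit(SAMPLE_INFO)}")
```

A check like this, run periodically, would flag rising connection counts before the limit itself starts rejecting or queueing connections.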
Posted Jun 08, 2021 - 18:34 CEST

Resolved
This incident has been resolved.
Posted May 27, 2021 - 13:26 CEST
Monitoring
A fix has been implemented and we are monitoring the results.
Posted May 27, 2021 - 13:02 CEST
Update
We are continuing to work on a fix for this issue. We expect a resolution soon.
Posted May 27, 2021 - 11:43 CEST
Update
We are continuing to work on a fix for this issue.
Posted May 27, 2021 - 10:59 CEST
Update
We are continuing to work on a fix for this issue.
Posted May 27, 2021 - 10:41 CEST
Update
We are continuing to work on a fix for this issue.
Posted May 27, 2021 - 10:31 CEST
Update
We are continuing to work on a fix for this issue.
Posted May 27, 2021 - 10:29 CEST
Update
We are continuing to work on a fix for this issue.
Posted May 27, 2021 - 10:26 CEST
Identified
Outage event for Surfly (app.surfly.com): some active sessions may get disconnected, and session start is also affected.
Posted May 27, 2021 - 09:59 CEST
This incident affected: General Service Availability.