On September 28, 2021, at 15:00 CEST, evalink talos users experienced login issues and were unable to access the alarm management platform. The disruption was caused by partial outages at Auth0, the underlying authentication and authorization platform provider, which affected the EU and US regions from 15:00 to 18:00 CEST.
At 16:30 CEST on September 28, 2021, Sitasys provided a workaround so that users could log in again. At 17:45 CEST on September 28, 2021, the service was fully restored and all users were able to successfully access the evalink talos platform.
After further investigation and to prevent this from happening in the future, Sitasys is actively working on an alternative emergency login solution that does not depend on the Auth0 service. We’ll update all users as soon as this solution becomes available.
Sitasys became aware of a service disruption in Auth0 at 15:00 CEST on September 28, 2021, which left multiple users unable to access the evalink talos environment. During the service outage, evalink talos users experienced sudden logouts and could not log back in to their accounts.
At approximately 16:30 CEST, Sitasys implemented a temporary workaround to allow users to log in by bypassing the non-functional subsystem in Auth0. Despite the workaround, evalink talos users still experienced unexpected logouts and degraded performance caused by very high latency in the Auth0 platform.
After closely monitoring the situation, Sitasys safely disabled the workaround once the Auth0 service was fully restored at approximately 18:00 CEST on September 28, 2021.
Auth0 will provide a Root Cause Analysis (RCA) within 14 days.
How did the service disruption affect users? The Auth0 outage directly affected the login and the partner login of evalink talos and therefore prevented users from accessing the evalink User Interface (UI).
Were other Sitasys systems affected during or as a consequence of the outage? No issues were found on any other Sitasys systems.
What happened to alarms and signals that were transmitted during the outage? There was no impact on alarms and signals, which were still received and stored by evalink talos. Automated workflows were carried out normally.
Were alarm panel connections affected? No, our virtual receivers were operating normally. Connections remained established and were properly monitored at all times. However, outage messages could not be viewed or processed manually in the UI. Actions triggered in automated workflows, including but not limited to e-mail, SMS, Slack, phone calls, and alarm escalation, were working normally.
Could users view signals and alarms during the outage? Only if previously configured: alarm escalation, site sharing, and the API continued to work at all times.
What happened exactly with Auth0? Auth0 will provide a Root Cause Analysis (RCA) within 14 days.
To maintain the high performance level that our customers expect from Sitasys and to prevent this issue from recurring, our focus is on continuous learning and improvement. Sitasys is fully committed to minimizing downtime when incidents do occur. We also continually assess and improve our tools, processes, and architecture to provide you with the best service possible.