Approximately 35 minutes after the end of our maintenance window today at 10am UTC, we noticed crashes in our control plane software. These were memory-related crashes that occurred once the number of routes exceeded 4 times our baseline.
After investigating, we concluded that the number of routes had increased fourfold because of a bug in the scaling software added in this release cycle.
We decided to revert the control plane and increase memory limits by 30%, and we finished the entire upgrade at 1:55pm UTC. This extended the maintenance window by another 4 hours.
During this extended maintenance, network requests to VoltStack APIs and to workloads running on VoltStack in our network POPs occasionally returned 503 errors. No other services were affected during this maintenance window.
Thank you for your understanding and cooperation. We remain available to provide any additional information you may find useful.