How can things be down this long? What is wrong with the hosting?
Those are the same questions I have for our providers. The TL;DR is that they are working to restore services but expect to be down for several hours yet.
I’ve personally worked with almost every BizBudding customer. Trust me when I say I’m losing (OK, lost) my mind right now. So I thought I’d try to explain the issue and share some background on our hosting environment.
I would also recommend that, for the time being, you not send any email campaigns that link back to your website.
Here are a few technical details that describe our hosting environment and what has gone wrong.
Three types of data centers exist today: hardware-only, where you rent hardware and run your own software; colocation, where you put your own servers in someone else’s data center; and virtual private data centers, which run in a cloud environment.
We use virtual private data centers that run in three physical data centers throughout the northeast. Our provider creates and manages our cloud environment, replacing drives when they fail and increasing capacity when we need it. We run our servers as virtual machines within the virtual data center. Virtualization is a common practice in hosting environments and data centers. It allows a virtual machine to be moved between different hardware clusters in real time, which adds hardware-level redundancy.
We run our “web server software” on the servers. We use a proprietary system to manage the web servers. We use Cloudflare to secure the front end of the servers and provide advanced caching.
We limit access to the servers and require encrypted access keys for any laptop or computer that needs file access to the servers.
Our provider is a cloud infrastructure company and a leader in its field. They provide us with the virtual data center services we use to run our hosting environment, and they provide the same cloud services to hundreds of large businesses, ranging from Fortune 500 companies to hospitals. We run in the same northeast data centers as those companies, which are also down right now.
I personally know the people at our cloud infrastructure company. Their former CEO was my mentor when I was starting my technology career many, many years ago. I’ve been to family events with several people on their leadership team. I can text and call them directly, and I have done so today.
Earlier today they determined that a network peering problem was caused by virtualized route reflectors that had crashed in all their northeast data centers.
I’ve been on the phone with them several times today and tonight, trying to be supportive and trying to understand their recovery timeline so that I can communicate it to you.
I just received another phone call from them while writing this email to you.
Here’s what I just learned:
Our cloud service provider has detected unusual activity in a portion of their cloud management infrastructure and thinks the issue with the virtualized route reflectors may be related. Their entire northeast cloud environment is currently offline, affecting ALL of their cloud customers.
In response to this incident, they have brought in an independent cyber forensics firm to supplement their efforts to investigate the incident and ensure that all services can be safely restored.
They are working through the night to restore service. And I’m staying up to monitor the events. We will provide you with ongoing updates as new information regarding service restoration becomes available.
I’ll send out another email as soon as I know more.