Hi, we run a SaaS service based on WordPress Multisite and Woocommerce and unfortunately we are hosted by SiteGround. We have about 500 subsites and we run on a Cloud dedicated with this setup:
8 CPUs
34 GB RAM
100 GB SSD
We optimized all the content and scripts that can be optimized and, during the day (when users use the service) we run at 25/30% server capacity with about 10.000 visits per day.
Everything works fine until their auto-backup tool starts (Schedule you can’t avoid, even on managed hosting). So we scheduled it at night (from 1:30 UTC) and it takes about 2 hours to be done. Site starts giving 500 Errors when backup reaches database and server goes down, causing all the connected services to fail (like cron, etc). Server is overloaded for a lot of time, for about 7/8 hours. We can re-start the service usually at 8/9 in the morning. This is clearly not acceptable.From our WPMUdev dashboard we have 78% uptime lol.
After submitting various tickets they told us to reduce DB size because that was the issue. Server got overloaded when trying to backup mySQL server. So we run optimization via WP Cli and reduced DB size from 9GB to 1.6GB, which is, honestly, an adequate size for a wpmu with 500 woocommerce subsites. They keep saying that our mySQL is too huge and that’s te issue, but we have a very powerful server setup, is it possible to really be a 1.6GB mySQL to cause all these issues?
Here’s their final answer:
Indeed during backup generation the website starts displaying database errors and 500 ISE errors. The reason for that is due to the fact that during backup generation or should I say during database dump, the database is locking its tables:
| 581 | up9wqa9qzzs5k | localhost:46468 | dbkjz5p97rxey7 | Query | 95 | Waiting for table metadata lock | INSERT INTO hxl_406_woocommerce_sessions (<code>session_key</code>, <code>session_value</code>, <code>session_expiry</code>) VALUES ( | 0 | 0 |
This is an example query which is being stuck and piling up in the MySQL. Now, since it takes a lot of time for the backup to dump your database, for the entirety of the time the tables remain locked and the processes are unable to execute properly which is why the 500 ISE errors are happening. Practically, the entire issue here is caused by the fact that the database is too large and it takes way too much time for the backup creation tool to dump it.
I’m writing here because we optimized everything with your services and they are perfect, so, have you got any advice for us? Can the db be further optimized or we should look for another hosting provider? I’m not a system administrator, but it is difficult to understand how a 1.6GB mySQL server can crash a 34GB/8CPU server at night with no traffic.