5 Best Practices for Database Scaling
By Diana Nolting | February 13, 2013
“Database scaling ranks at the top of the list of most common problems software architects are challenged with,” Solutions Architect Jake Robinson explains.
“Relational databases were never designed to scale any distributed model like cloud,” he continues. “Because apps are now expected to be available to anyone in the world, we now need to rethink how we do database design.”
1. Identify the Problem
“The first step to scaling your relational database is to identify where the bottleneck is and what is causing the database to slow or stop working,” explains Robinson. “And the most common answer is storage.”
Robinson continues, “Storage is the slowest piece in a relational database. One bad SQL query can cause a cascading delay on getting the information from the disk. This is especially true when the query isn’t written very efficiently.”
“Frequently the database might not be the issue at all,” Robinson continues. “The application code, an inefficient query or stored procedure is often to blame.”
He adds that “throwing resources at it” might not be the right solution if your problem is related to these issues. Rather than worrying about scaling, you should focus on making your code more efficient.
2. Increase Memory
If your bottleneck is due to storage your first course of action is to give it additional memory. By increasing the amount of information in memory, you aren’t accessing the disk as often so you increase your speed.
“Some have found success by adding not only memory to the database server itself, but also by dedicating an entire server to nothing but cached database information in memory,” describes Robinson. “It can be an effective solution.”
3. Vertical Scaling Tools
One popular tool to help increase the amount of cached database information is Memcached.
Memcached is a free, open-source, high-performance distributed memory object caching system. According to its website, it is “generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.”
Robinson explains this process as vertical scaling your database. “Adding memory, and really adding resources of any kind, is all vertical scaling.”
Vertical scaling allows your system to more effectively use your resources and technology.
“Adding a system like Memcached is more of a hybrid model, because it’s adding memory but not doing it in the same server,” Robinson explains. “The query goes out looking for the key value that it’s trying to get. If it’s not in the Memcached server, it continues on to the SQL server. Because it’s not accessing the actual SQL server directly, it increases speed and efficiency most of the time.”
4. No One Wants a “Shardy” Database
Often companies will try to find an overly complex solution and scale their relational database horizontally through “sharding”.
“Effectively, ‘sharding’ is the act of splitting your database in half,” Robinson says. “It would split your relational database into groups, say ‘A-M’ and ‘N-Z’. Now your app has two databases to deal with,” Robinson explains. “It increases your complexity and your risk. Some people are doing it, but I don’t recommend it because there are better solutions.”
5. The New Ways
According to its website, Cassandra supports “replicating across multiple datacenters… providing lower latency for your users.” Mongo DB’s website states that it is a “scalable, high-performance, open source NoSQL database written in C++.” These NoSQL systems are all highly scalable, distributed geographically and are fault-tolerant. “However,” Robinson cautions, “As with any scalable solution you will increase complexity in some way.”
There are other options that are also an “in-between” option for those who aren’t ready to go no-SQL with their apps. VMware vFabric SQLFire is one such option. On its website it is described as “a data management system with memory speed, horizontal scalability and a familiar SQL interface.” Robinson explains that SQLFire allows for standard SQL queries but can also be distributed geographically.
“SQLFire is a happy medium. You can still use SQL, but it’s designed to be distributed,” explains Robinson. “SQLFire is considered to be the NewSQL.”
Relational databases are quickly becoming legacy for applications that need greater scale because they just aren’t designed for what today’s apps need. There is certainly a need to scale relational databases, but they will never scale to the magnitude some of the new database technologies will.
“You can do data migration from an old relational database into a new NoSQL database, but the biggest hurdle you will have is changing the application to be able to talk to the NoSQL database,,” explains Robinson. “It’s not a relational database anymore; it’s key-value stores.”
These best practices will be useful in helping to keep your current relational databases usable with older applications. Eventually those older applications may be re-architected or redesigned in order to implement new technologies effectively.