OpenMind 2007: Monty on the Future (and Past) of Databases
After a break, Tommi Mikkonen of the Tampere University of Technology introduces Monty Widenius (who, as most readers of this blog will know, is one of the founders of MySQL AB).
Monty takes the stage wearing a suit - a nice suit - something I don't recall having seen before.
He starts with an overview of the near past of DBMSs, talking about the state of databases around 1995, covering the state of proprietary and open products around this time.
He then quickly moves to discussing the rise of databases in web apps in the mid-to-late nineties. He's covering a lot of metaphorical ground pretty quickly, perhaps too quickly for a crowd that may not be familiar with DBMS or web apps. If I had been thinking, I would have sat in the back of the room so that I could see how well-suited the content is for the audience.
Monty has started discussing how the use of databases changed around the turn of the millennium - databases became much larger, dynamic websites became much more critical to business (and consequently the database behind the dynamic websites became much more important.) Initially, people focused on scaling up - buying larger hardware in order to scale. As time passed (and the bubble burst, killing IT budgets) people began to focus on caching data as a way to use their existing hardware more effectively.
As commodity machines became more popular, people using FLOSS DBMS began to scale by using many commodity servers together - keeping a complete or partial copy of their databases on each machine, allowing read queries to scale. Users of proprietary DBMS tended to move to solutions like Oracle RAC - a more expensive way to get a similar benefit of a commodity cluster.
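To make the read-scaling idea concrete, here is a minimal sketch of routing queries across a primary and several copies. It is only an illustration, not something shown in the talk: the class name, the server names and the "starts with SELECT" rule are all assumptions, and a real setup would also have to deal with replication lag and failover.

```python
import itertools


class ReplicatedPool:
    """Send writes to one primary and spread reads across replicas (hypothetical helper)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def server_for(self, sql):
        # Naive classification (an assumption): only plain SELECTs count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


# Placeholder server names, not real hosts.
pool = ReplicatedPool("db-primary", ["db-replica-1", "db-replica-2"])
```

Adding another replica to the list adds read capacity, while every write still lands on the single primary, which is why this approach scales reads rather than writes.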
Monty then talks about memcache, highlighting it as a more advanced type of caching.
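For readers who have not used this kind of cache, the usual pattern looks roughly like the sketch below: check the cache first, fall back to the database on a miss, and store the result for next time. This is an assumption-laden illustration rather than anything from Monty's slides. A plain dict stands in for a memcache client, and `load_from_db` is a hypothetical callable that would run the real query.

```python
class CacheAside:
    """Cache-aside lookup; a dict stands in for a memcache client (assumption)."""

    def __init__(self, load_from_db):
        self._load = load_from_db  # hypothetical callable: key -> value
        self._cache = {}
        self.db_hits = 0           # counts how often the database was consulted

    def get(self, key):
        if key in self._cache:
            return self._cache[key]   # cache hit: no database work
        self.db_hits += 1             # cache miss: go to the database
        value = self._load(key)
        self._cache[key] = value      # remember the result for later reads
        return value

    def invalidate(self, key):
        # Call this after a write so the next read fetches fresh data.
        self._cache.pop(key, None)
```

Repeated reads of the same key touch the database only once until the key is invalidated, which is how caching lets existing hardware serve more traffic.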
Moving on to the mid-2000s, Monty starts talking about how more people are looking for small, easy-to-install databases like Berkeley DB, SQLite and the MySQL embedded server - pausing to mention how MySQL didn't adopt the right strategy for their embedded server.
He then talks about how proprietary vendors start to respond to the competition from FLOSS DBMS by releasing limited functionality versions of their DBMS for free.
The next slide has a funny cartoon of two people looking at a computer, one working and one watching. The caption reads,
No, I'm not backing up our files - I'm just assuming that the FBI is keeping copies.
Monty then starts to talk about how Web 2.0 apps that have open APIs are a lot like database management systems, citing examples like Amazon S3, Google Calendar, Flickr and so on.
Interestingly, MySQL has a storage engine for S3 and for memcache, allowing you to interact with both of these applications from MySQL using SQL.
Monty highlights the risks of having online databases - how you often lose some (or much) of the control of your data in exchange for the convenience of having someone else manage the infrastructure.
The next slides focus on the future of databases in the next four years - Monty starts by focusing on how commodity hardware will change. Hard disks will keep growing in size; large (gigabyte-sized) flash disks will become available (making the cost of some database operations - like sync - nearly free); multi-core CPUs (8+) and 8+ GB of memory will be common; and hard disks will be used much like we used tape in the past - for storage of large, sequentially written data. Monty cites a 32 GB flash hard disk he saw in Japan that was about 400 Euros.
Monty posits that these shifts will change what we can do with databases - the old DBMS algorithms focus on working around hardware limitations (like disk write time and so on.) Being able to rely on super-fast flash disks will change what kinds of algorithms make sense. DBMSs must change in order to be able to use this new technology.
As Monty starts to wrap up his presentation, he focuses on these points:
- DBMS users will expect to be able to do almost anything to the database (altering it, repairing it, etc.) while the DBMS is running.
- DBMS implementations must change. FLOSS DBMS will be able to react most quickly to changes in what people want and what hardware offers. They can react quickly, because FLOSS DBMS focus on serving users first and worrying about standards, marketing and so on afterwards.
- Monty thinks that MySQL will be able to adapt quickly because of MySQL's architecture that allows it to easily plug in new backends for the database (like the memcache engine and the S3 engine.)
- Monty mentions the MySQL Cluster database as one example of how the needs of future users are already starting to be met.
- He predicts that memcache will be further extended, becoming more fault tolerant and perhaps even supporting transactions.
- Text search must improve - we can't rely on Google to index everything.
- Replication will change from being mostly single-master and multi-slave to being multi-master and multi-source. For example, one machine could be a backup for all of your online databases.
- Even though FLOSS focuses on release early, release often, upgrades still must be easier.
- There will be even more legacy applications based on old versions of DBMS. Vendors will need to support many more old apps. FLOSS will help with this (especially since FLOSS vendors aren't basing their business models on selling licenses, but instead on selling support, services and value-adds.)
Monty predicts that MySQL and PostgreSQL will continue to vie for the title of best Open Source DBMS. He notes that he has his own opinion, which he won't mention.
SQLite will continue to be small in size, but even more used - especially in small devices.
The need for proprietary databases will continue to shrink.
Monty fields the standard question, "Will MySQL have transactions in MySQL 5.0?" Sigh.
The next question is, "How come if MySQL supports transactions, you can still use SELECT CREATE, even if you have removed a table?" They go back and forth a bit, with Monty explaining that you can use transactional or non-transactional database engines. The questioner persists, providing an example of using a transaction with a CREATE TABLE statement. Monty explains that many RDBMS don't support rolling back data definition statements (but notes that Oracle can do this.)
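The exchange turns on whether a CREATE TABLE issued inside a transaction is undone by a rollback. As a small sketch of what that looks like in a DBMS that does support it, the snippet below uses SQLite through Python's standard sqlite3 module. SQLite is chosen here only because it ships with Python; the table name `demo` and the function name are made up for the illustration, and other systems behave differently.

```python
import sqlite3


def table_exists_after_rollback():
    # isolation_level=None puts the sqlite3 module in autocommit mode,
    # so the explicit BEGIN/ROLLBACK below are the only transaction boundaries.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("BEGIN")
    conn.execute("CREATE TABLE demo (id INTEGER)")  # DDL inside the transaction
    conn.execute("ROLLBACK")                        # undo everything since BEGIN
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'demo'"
    ).fetchone()
    conn.close()
    return row is not None


print(table_exists_after_rollback())
```

In SQLite the table is gone after the rollback. In a DBMS without transactional DDL, the CREATE TABLE would survive, which is the behaviour the questioner was asking about.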
The next question is, "How have people reacted about the split between the commercial and free version?"
These are all questions that are about five years too old. :)
p.s. Lukas Smith posts a few thoughts about this post at http://pooteeweet.org/blog/882
Tags: DBMS, Flickr, FLOSS, Google, License, marketing, MIT, monty, MySQL, MySQL AB, Open Source, OpenMind, OSI, PostgreSQL, SQLite, Standards, Tampere, Uncategorized
Posted on Tuesday, October 2nd, 2007 at 0:42
October 2nd, 2007 at 5:26
I stumbled a bit over "They can react quickly, because FLOSS DBMS focus on serving users first and worrying about standards, marketing and so on afterwards." I wrote up my thoughts on this topic over here:
http://pooteeweet.org/blog/882
I also noted that AFAIK Oracle does not support transactional DDL, but PostgreSQL does!
October 2nd, 2007 at 5:40
Hey Lukas,
Thanks for the counter-post. I'll let Monty speak for himself (or correct my notes :)
As for the feature gaffe, the Oracle marketing machine must have gotten to Monty. ;P
Cheers!
–zak
October 2nd, 2007 at 6:20
Well someone in my blog just posted (http://pooteeweet.org/blog/882/883) that Oracle indeed supports rollback for transactions. I will try to figure out what the deal is here ..
October 2nd, 2007 at 12:59
Ok, the Oracle Database flagship product does indeed not support transactional DDL. There is only a niche product Oracle bought from DEC that runs on OpenVMS that does support it. Yawn. Go PostgreSQL :)