Holy crap! Well congrats folks! Hope to catch up on some of the things I've been missing about Drizzle at the MySQL Conf, woo! Sounds like Drizzle needs to throw a GA party while there :)
Tim On Mar 17, 2011, at 2:00 PM, Planet Drizzle wrote: > Brian Aker: Drizzle goes GA, From "What If", to "What has" > > Not quite three years ago I wrote an article called "What If?". > > What I wanted to do was go back and rethink decisions we had made during the > years, especially decisions that we made for MySQL 5.0. > > 5.0 exists because of the MySQL/SAP alliance. SAP wanted to replace Oracle > with MySQL, and to do that MySQL was going to need to run SAP R3 in order to > do it. We didn't just pay lip service to SAP, there was an effort to make > this happen. Somewhere in the middle of all this there was also a very odd > "we were going to adopt SAPDB as the next MySQL". Which of course was never > going to happen. There were countless meetings over this, and attempts to > somehow sprinkle even an ounce of the SAPDB code into MySQL, but that never > happened. > > As far as making R3 work on MySQL? That was incredibly unlikely, and it was > damaging to the product in the end. We ended up with a lot of features that > the database was never designed to have. We created an unrealistic set of > expectations. We had a source base which had too little testing. > > So part of the goal with Drizzle was to cut it back to the core and build > modules that we could then create better testing for. So for that reason > Stored Procedures, Views, and Triggers were out. None of them were well > designed, and all of them had/have major bugs. We tossed out the monolithic > kernel design and moved to microkernel design. > > MySQL 5.1 made an attempt to patch the replication system that had been > written a decade ago. MySQL replication works, but it works with a lot of > exceptions. Anyone who has ever put it into production is aware of these. The > good thing about MySQL replication is that it mostly works out of the box, > and that is something that was a bit of a revolution when it was created. > Today? With the notable exception of SQL Server, the rest of the major > databases still have replication systems which are difficult to use, install, > or deploy. > > We initially looked at using 5.1's replication. We were only going to > refactor it such that we were going to beef up its file format and switch to > just using the row based replication that was added in 5.1. We were > unsuccessful in refactoring it. About 9 months in we figured this out, and we > began a rewrite. > > The rewrite was the right answer. The original code had too little testing > for us to ever know whether or not a change we made created bugs or not. > > A big lesson learned, if you are going to refactor code, make sure you have > plenty of testing up front. > > Internally we have "new code" and "old code". If we want to make a change to > "new code" we can typically do it very rapidly. The rate at which we can > extend it is pretty amazing. The MySQL code base is not friendly to anyone > who knows C++. Pretty much all of the warnings have been disabled and there > are a lot of tricky bits. > > We have fixed all the warnings in Drizzle. This is something that isn't sexy > work, and the only way it is justified is because cleaning up warnings fixes > bugs. If you are starting a new code base let me implore upon on you the > necessity of doing this from the beginning. > > Today our replication is pretty spiffy, and it answers a couple of the big > "What If" statements I have wanted answered: > > 1) We use an entirely open message format. > 2) We store our replication records directly in Innodb. > > The open message format comes with a penalty, it is more verbose than a > native format. It takes up more space than if we just shipped the block > records created in the transactional engine. Running a point in time recovery > on block records is tricky and very limiting. You can't take the data from > one database and push it to another. ETL? Forget it. > > We used Google's Protobuffer for the message format. There are other > libraries available but they were either license incompatible or were not > widely known. At the time we hadn't made a decision to go with boost so using > its serialization library wasn't an option. The disadvantage has been that > the Google library created a dependency for installing Drizzle. Dependencies > are a pain, and when we started Drizzle I had thought that the different > Linux distributions had a good handle on this, I don't really believe this > any longer. Avoid dependencies. > > Storing the records in Innodb has always seemed to be an easy win in my mind. > It solves a lot of the two phase commit problems that plague users and it > gives you instant recovery. Storing the log in a separate file can possibly > give you a win in that you can do some tricks with IO, but in the end it just > complicates everything. > > With MySQL you always need to keep in mind the question of "What would MyISAM > do?" > > MyISAM's design, and limitations, are scattered throughout the program. In > all cases MySQL has to ask "how will this be handled if we need to store data > in a storage engine that can't handle failure, handles all of its own IO, and > needs to be locked at the Table level?". > > We dropped MyISAM support about a year into our work, and relegated it to a > support only roll for temporary tables. We didn't hide it completely before > we GA'ed Drizzle, but we won't support it long term. I've heard users say > "but I want its performance!". Trading performance for reliability works out > for some people, but certainly not everyone. What I find is that when someone > wants this, what they really want is a different sort of database all > together. Typically it is some sort of analytics problem which creates this > need. > > Which gets us to the storage engine interface. It was within MySQL the first > attempt to create an interface that we could plug different solutions in. I > had proposed it in MySQL because I had written different engines and knew > what a nightmare it was to make it work. > > That engine interface has generated millions of dollars. When I wanted to > make it available at MySQL the backlash was significant. Some of sales > freaked out, some of marketing thought we were going to let others take over > the product, and alliances wanted to know how we were going to limit it to > "select partners". On top of that, half of engineering wanted to go and > re-engingeer it immediately. > > In Drizzle we have spent a significant amount of time reworking the > interface, but it is far from perfect. We redesigned it so that engines now > own their own meta data and federate that data to the microkernel. We also > designed the interface to require that all new engines have ACID like > qualities, know how to handle their own recovery, and can handle failure > gracefully. Our core engine is Innodb. We have had others propose new > engines, and we have even supported other engines, but at the end of the day > we know people want a transactional engine mainly because they don't want to > find that their data has been trashed. > > Our Innodb is a little different. We have more views into the state of the > engine, and we fixed our version to compile with a C++ compiler. We cleaned > up warnings and fixed the bugs that popped up from that. We have begun to > refactor it so that it is more integrated with Drizzle's thread scheduler. > > Innodb would have been the default engine for MySQL long ago if not for some > "not invented here" mentality, mixed with a flopped buy out attempt. Heikki, > the inventor of Innodb, came out quite well in all of this. Good for him. > > I don't believe we will spend much more energy on the storage engine > interface going forward. It is a dead business, and while there are a couple > of companies that have built brand and product enough to make a go of the > business I don't expect any additional ones will show up. The storage engine > business made money for MySQL, but it was a big distraction. While with > Drizzle is easier to integrate an engine, I'm not sure that a business exists > for storage engine vendors with it. I'll write more about this at a later > date. > > Speaking of dates, Drizzle's internal format for timestamps is 64bit. There > is still some work to be done to allow to use all 64bits, but you won't need > to recompile or change your disk format for them. Right now we need to fix > some tests, and make sure a couple of functions will handle the formatting, > but we store your data such that going forward, or backwards, you are in good > shape. Unlike MySQL we store time as GMT, so there is no screwing around with > "well I stored my data in my local time zone, but we had the machine set > too...". I have personally spent over a month of time just fixing bugs in > that code. > > We have spent a lot of time fixing bugs. We get a big collective smile on our > faces when we read about new bugs that have been discovered in MySQL, when we > discover that we don’t have the bug. We have spent a lot of time fixing bugs, > and a considerable amount of our time has been spent on finding new ways to > test Drizzle. I am sure plenty of bugs exist to be found. > > We also support storing/comparing/displaying time with microseconds. We also > have a real BOOL type, which I have been told is handy for the SQL Alchemy > folks, and a native ISO UUID type. The UUID is interesting in that it stores > time as well as being unique. It isn't as fast as "please give me the next > number", but I believe it will be useful for a lot of applications. We have > refactored all of the types, and the only one that was not size related that > we dropped was SET. If you wonder why we dropped it read the section in the > MySQL manual about its limitations and bugs. > > Why do we allow only DELETE against a single table at a time, like pretty > much every other RDBMS? Beyond the conceptual issue that few can wrap their > heads around how to form, let alone feel like they know what the query will > do, we hit the problem of the "multiple execution path". There were a lot of > one off execution paths in MySQL. In a lot of cases I know these were dead > refactoring projects that were never completed. The "multiple execution path" > problem is particularly disturbing when you think about fixing a bug. If you > fix a bug in DELETE you need to know that there is an execution path for a > single table that is different then the path used for multiple tables. This > leads to odd behaviors, and a much richer set of bugs that exist. > > SQL modes? Those are gone. If you wonder about what sort of problem they > create inside of the server, I'd suggest reading about the "Legend of the > Ambalappuzha Paal Paayasam". > > In general in Drizzle we have tried to get rid of Gotchas that we have found. > Things like declaring a column NOT NULL and discovering that somehow the > database still stored a NULL is gone. Altering a table and adding a field > that would violate the structure of the table? That is gone. > > It is amazing that ALTER TABLE works, as the code there is Byzantine. We have > made some effort to clean it up, but it is still way too tricky. I wish we > could have done more there, but it is what it is. Are you using partitions? > Make sure you back up your data before doing an ALTER TABLE. Wrapping > partitions into the system in the way that was done at the time was simple, > but it is far from robust. > > I had hoped that with 5.1 we would have created a single logging API, but > instead we ended up with multiple logging API internally in the server. With > Drizzle we ripped all of them out and installed a single API. It is crazy > simple to write a new logging plugin. > > Which gets into the philosophy of plugins in general. Writing plugins should > be low hanging fruit. Whenever possible we have tried to make it the case. > > We have an entirely new INFORMATION_SCHEMA in Drizzle. It is based in table > functions, which is a new concept in Drizzle. We keep a separate schema > called DATA_DICTIONARY, in it we put whatever we like. Our INFORMATION_SCHEMA > is only what the SQL standard has specified. We do zero vendor modification > to it. Another hats off to SQL Server, their INFORMATION_SCHEMA is the > closest to complying with the standard. > > Drizzle's drivers are BSD. They were written outside of Sun, and Sun signed > off on contributions to them under a BSD license. They speak Drizzle and > MySQL's protocol. There is a JDBC version that was written. Their adoption is > becoming wide spread. Licensing clarity around them is a big win for us, and > for ISV who want to integrate. MySQL's licensing mess was related to a lot of > hand waving involving them. Recently I noticed that MongoDB had written up a > clear licensing policy with regards to their own drivers. Awesome. > > We never got to finish all, or really much of any, of what we wanted to do > with the Drizzle protocol. I believe this is an area where we will see change > in the near future. Internally inside of drizzle we have a C++ interface that > resembles JDBC that lets us execute queries. We will be doing a lot more with > that interface going forward. > > What about performance? With Drizzle we began doing benchmarks early on, > using a few different benchmarks. The benchmark generated by sysbench has > always been the benchmark we have used as our bellwether. Unlike a lot of > databases we test Drizzle with up to 1024 concurrent executing queries. Most > of the benchmarks I see people run are for far less connections. We have > chosen time and time again to favor performance gains at the high end, over > gains on the low end. We are roughly double in performance from where we > began. We could still do a lot better. MySQL 5.5 has a new meta data locking > system which should do well in a number of situations, we could do a bit > better in some of these cases. Our lack of a MyISAM would make it simple for > us to move forward in this direction if we want too. > > There has been a lot asked about our claim on scaling with lots of cores. Our > process there is simple, eliminate locks, favor performance gains when we > find them that favor of additional CPU, and try whenever possible to remove > strong ownership that require waits for locks. MySQL relies on MyISAM, and > MyISAM has significant locks, especially around the keycache, we got rid of > those by freeing ourselves from MyISAM. We have had some gains with our new > scheduler and we have done some to improve how IO is handled. > > I am sure we have a lot of tuning still to do. We won't be publishing > benchmarks which compare us to others though. I've yet to see a comparison > benchmark which wasn't completely flawed, and even when they are not, few > people really understand them. They fall into the classic "how many angles > does it take to dance on the head of pin" conversations. > > Our authentication system is modular, and we need to iron out more of the > authorization system. > > I've seen someone say that Drizzle is designed for Google and Facebook. This > is not the case at all. We built it so that the next Facebook, Google, etc > would have a platform to build on. Facebook and Google have their own forks > of MySQL, they aren't going to be using Drizzle. The pieces are there for the > next company who needs to innovate, it is just a matter of someone making use > of them. We speak the MySQL protocol, so the typical MySQL application runs > just fine on Drizzle without change. We designed Drizzle to work as a piece > of someone's current infrastructure, not be yet another application which has > a costly integration. We have a NoSQL sort of solution via the blob streaming > module, but we are first and foremost a relational database. > > What will the next Google or Facebook find? A much more friendly platform > than what MySQL provided to develop on and with. The big success for Drizzle > has been in the people that have been involved. We are without a doubt the > descendant of MySQL that has the largest contributor base, and we have long > passed MySQL with regards to contributors. We are well into the hundreds when > it comes to developers who have contributed code. We have had more then 921 > commits in the last month across 20 people. Our numbers go up and down, but > we are consistently more then double anyone else in size. If you just walked > out of college, or skipped it all together, you are going to have a much > easier time adjusting Drizzle to your needs . At least we believe this :) > > The codebase is C++, we make use of Boost, and while we are cautious, we tend > to favor more forward thought in how we code. Readability is the key to > creating code that others will use. Because in the end? We can scale silicon, > but carbon? People are much harder to scale. > > The people to thank for the code: > Brian Aker > Monty Taylor > Stewart Smith > Lee Bieber > Jay Pipes > Padraig O'Sullivan > Andrew Hutchings > Marko Mäkelä > Joe Daly > Olaf van der Spek > Vijay Samuel > Patrick Crews > Toru Maesaka > David Shrewsbury > Eric Day > Zimin > Marisa Plumb > Joseph Daly > Barry Leslie > Asil Dimov > Mark Atwood > Tim Penhey > Jimmy Yang > Paul McCullagh > Nathan Williams > Paweł Blokus > Sunny Bains > Andy Lester > Hartmut Holzgraefe > Trond Norbye > > Other people to thank? > > David Douglas who at Sun supported us initially, and when we didn't think our > internal support initially at Sun could get any better? Bob Brewen worked > with us till the end came for Sun. An extra mention should be made for Lee > Bieber, he has been working with the project from nearly the beginning as > well. He has handled project management, done code refactoring, made flyers, > organized dinners, and did everything else in between. > > Mike Shadle for getting us machines, and making sure everything runs. Adrian > Otto at Rackspace should be thanked (along with a number of other people as > well). > > A thank you should go to Chris Dibona for the Google Summer of Code project. > We have a number of students who now work on databases for a living thanks to > that program. While with MySQL we constantly failed at getting student's code > into the server, with Drizzle we have had a lot of success. > > There is an entire channel of people who have been involved with Drizzle on > Freenode in #drizzle who should be thanked as well. IRC is how we > communicate. > > There are a lot of other people I am forgetting to thank, sorry about that. > > So what next? There is a lot more to Drizzle then what I have written above. > Having worked on this for years I often forget what the differences are > anymore. There are lots of new features, plenty of new enhancements, and new > bugs just waiting to be found. I'm giving a talk at Web 2.0 Expo in a couple > of weeks in San Francisco were I will talk about some of what we have done > and are doing for virtualization. > > I will being giving a keynote at the O'Reilly MySQL Conference & Expo, and > there are a handful of talks there on Drizzle as well. The MySQL Ecosystem is > a radically different place then what it was a year ago, I'll be commenting > on it in the future online and at the conference. > > About a week ago Monty Taylor and I sat down and talked about what we wanted > to do with Drizzle going forward. Monty has been working on this since the > beginning with me, and he has been a lot of fun to work with. One conclusion > that we both came to was that we want to see where people will take Drizzle > before we determine too much about its future. It is easy to get caught up in > new features, and we are interested in seeing how others use it before too > many decisions are made about what to do next. > > Originally posted on blog.krow.net. > URL: http://krow.livejournal.com/700783.html > > _______________________________________________ > Mailing list: https://launchpad.net/~drizzle-discuss > Post to : [email protected] > Unsubscribe : https://launchpad.net/~drizzle-discuss > More help : https://help.launchpad.net/ListHelp
_______________________________________________ Mailing list: https://launchpad.net/~drizzle-discuss Post to : [email protected] Unsubscribe : https://launchpad.net/~drizzle-discuss More help : https://help.launchpad.net/ListHelp

