This is definitely one of those "right tool for the job" situations. There are a lot of requirements, especially for large scale web sites, that relational databases (and application servers for that matter) can't meet. A lot of this is due to the different assumptions behind "traditional" application infrastructure and web based infrastructures.
Before databases came along we basically had a choice of file types: structured, indexed, indexed sequential, etc. Databases aggregated those file types into a single storage mechanism and added an abstract programming language for interacting with it. SQL has proven the most successful of those languages for a variety of reasons, high among them its ability to change the storage structure definition (i.e. the schema) dynamically and to keep the relationships among tables separate from the data (unlike hierarchical databases, which embedded pointers in the records).

Throughout this evolution of programming and storage systems, the core design assumption did not change significantly: the major goal of a storage system was to persist the data safely and securely before returning control to the application. When the transaction abstraction was introduced in the early to mid 80s, storage systems were also able to protect developers from having to manually back out partial updates that could occur during a crash.

Newer systems like Google's Bigtable and the many variations developed by other large web companies changed the assumptions. At HPTS in 2007, for example, quite a few of the web companies presented details of their infrastructures, none of which used relational databases and application servers as core elements (see http://www.hpts.ws/papers/2007/agenda.html for the agenda and links to some of the presentations). Andrew Fikes from Google said that the basis of their "scale out" infrastructure is the assumption that all systems are going to fail, regardless of how much money you spend, how big a box you get, etc. So if they are going to fail, why not design for failure and use the cheapest possible hardware? The Amazon folks published a paper on their "Dynamo" architecture in which they describe the trade off between latency and persistence (see http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html).
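To make the latency versus persistence trade off a bit more concrete, here is a rough sketch in Java. It is only an illustration under my own assumptions, not how Dynamo or any real product is built: the class names (DurableLog, WriteThroughStore, WriteBehindStore, TradeOffDemo) are hypothetical, the "durable" store is just a map standing in for disk, and the background flush is modeled as an explicit method call so the behavior is easy to follow.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical stand-in for a slow, durable store (disk, database). */
final class DurableLog {
    private final Map<String, String> persisted = new HashMap<>();

    void persist(String key, String value) {
        // A real store would block here on disk or network I/O.
        persisted.put(key, value);
    }

    String read(String key) {
        return persisted.get(key);
    }
}

/** "Traditional" assumption: data is durable before the caller regains control. */
final class WriteThroughStore {
    private final DurableLog log;

    WriteThroughStore(DurableLog log) {
        this.log = log;
    }

    void put(String key, String value) {
        log.persist(key, value); // the caller pays the persistence latency
    }

    String get(String key) {
        return log.read(key);
    }
}

/** Memory-first assumption: acknowledge from memory, persist later. */
final class WriteBehindStore {
    private final DurableLog log;
    private final Map<String, String> memory = new HashMap<>();
    private final Queue<String> dirtyKeys = new ArrayDeque<>();

    WriteBehindStore(DurableLog log) {
        this.log = log;
    }

    void put(String key, String value) {
        memory.put(key, value); // fast path: no durable write yet
        dirtyKeys.add(key);     // remember what still has to be persisted
    }

    String get(String key) {
        return memory.get(key);
    }

    /**
     * Would run in the background in a real system. Anything still in
     * dirtyKeys when the process crashes is lost.
     */
    int flush() {
        int flushed = 0;
        String key;
        while ((key = dirtyKeys.poll()) != null) {
            log.persist(key, memory.get(key));
            flushed++;
        }
        return flushed;
    }
}

final class TradeOffDemo {
    public static void main(String[] args) {
        DurableLog log = new DurableLog();
        WriteBehindStore store = new WriteBehindStore(log);

        store.put("cart:42", "book");
        System.out.println("visible to reader: " + store.get("cart:42"));
        System.out.println("durable before flush: " + log.read("cart:42"));

        store.flush();
        System.out.println("durable after flush: " + log.read("cart:42"));
    }
}
```

The write-through version gives you the classic guarantee at the cost of waiting on storage for every update. The write-behind version answers immediately but leaves a window in which a failure loses data, which is why systems built this way usually lean on replication across cheap machines rather than on a single reliable box.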
(Strangely, the Amazon folks were prevented from presenting at HPTS at the last minute, which was pretty embarrassing for them.)

In other words, the core design assumption of "traditional" relational databases, and of other traditional storage systems for that matter, has been that getting the data onto persistent storage quickly and reliably is the top priority. As Amazon and other web companies are saying, this is not true for web based businesses - the customer experience is the priority, i.e. the latency of the response to the HTTP GET etc. Because REST/HTTP based systems have to put priority on giving the user a good experience, and so make different trade offs and assumptions, people started developing memory based systems.

I would have loved to have based the second edition of "Principles of Transaction Processing" on these new designs, but as Phil and I discovered, everyone does something different right now. Some common patterns are starting to emerge in products such as Oracle Coherence, GigaSpaces, and IBM ExtremeScale, as Nati Shalom of GigaSpaces argues in comments to my blog post about the book (see http://ericnewcomer.wordpress.com/2009/06/19/second-edition-of-tp-book-out-today), but things have not really settled down to the point where we could talk about coherent designs, systems, and products.

Nonetheless this is a really fascinating area, and I think it's not an exaggeration to say that Google is reinventing computing. They are basing their infrastructure designs on commodity data centers, which James Hamilton documents pretty well in many presentations on his web page http://www.mvdirona.com/jrh/work/ (this is the one he presented at HPTS: http://mvdirona.com/jrh/talksAndPapers/JamesRH_CommodityDataCenterDesign.ppt ). I can see the potential for commodity data centers overtaking mainframes in the future, but of course this will require application redesigns and system rearchitectures.
This all suggests to me a somewhat oversimplified compare and contrast between mainframe based "scale up" designs and commodity data center "scale out" designs. Current middleware and storage system products are still designed around what I call mainframe style assumptions.

Eric

On Thu, Jul 16, 2009 at 8:21 AM, Martin Wildam <mwil...@gmail.com> wrote:
>
> On 16 Jul., 12:43, Steven Herod <steven.he...@gmail.com> wrote:
> > XML would be the
> > 'modern' equivalent. To do it in a relational database with Java
> > would involve a lot more work and a lot less flexibility.
>
> I am working in DMS and ECM field and here there is a wave of products
> using alternatives (although sometimes just as an addition). What I
> observe is that "stores" that are not structured as tables - stores
> that allow the storage of any data (beeing an object, xml or whatever)
> increases the "data mess". Many people have difficulties in keeping
> their data structured and organized (as many having problems keeping
> their household in pretty order ;-) ). A relational database might
> give you limitations in freedom how to archive your data but my
> observation is that relational databases produce less chaos. - YMMV.
>
>
> On 16 Jul., 12:59, Christian Catchpole <christ...@catchpole.net>
> wrote:
> > I am more so questioning how the SQL ecosystem
> > has evolved as the only serious choice for persistence.
>
> Are you thinking of "persistence" basically in relation with "objects"
> or in general in the way of making your data persistant?
>
>
> > [...] products which try to offer alternatives to relational
> > SQL. But I guess none of these ever gain any real traction because it
> > all seems so risky to trust any of them with your most valuable
> > asset.. your data. Especially if they don't support Crystal Reports.
>
> I think it is similar to people using lists - or in IT the Excel type
> applications. People think in lists and tables when it comes to the
> need of getting an overview. Next step is creating graphical views on
> the data. So I think, SQL is still a good choice in general for
> storing and querying data...