http://en.wikipedia.org/wiki/ACID
Think maybe you should read this and rethink that position.

On Mon, Jul 20, 2009 at 9:59 AM, Eric Newcomer <enewco...@gmail.com> wrote:
> Hi Peter,
>
> The classic definition of consistency as the C in ACID refers to application level consistency. Storage consistency is ensured using mostly A and I (i.e. writes are atomic and isolated, or locked).
>
> The description of "eventual consistency" does indeed refer to what happens on the storage end of things, but to achieve eventual consistency A and I are still required when writing to disk (again using the classic definition of ACID). That's what I meant by saying that the major change in BASE compared to classic ACID systems is to allow temporary updates to memory, which can introduce inconsistencies either among replicated memory or between the memory and persistent storage. I think this is the way the term "consistency" is used in the BASE model, but the C in ACID refers to application level consistency.
>
> I certainly agree BASE style systems are sufficiently proven to work to be considered mainstream from one point of view, i.e. they are a valid implementation choice. But the focus of our book is on describing how large TP systems work, and we could not really say that the majority of large TP systems in production today use BASE style designs. We do say that BASE style systems are a future trend influencing TP products and applications, but it will be a while before the influence is fully realized.
>
> I also agree with what I consider the main point here, which you describe as:
>
> But the idea behind BASE is that you don't need ACID all the time. The question is "what kind of guarantees do we need for application X?", the traditional answer (and the one you give here) is "ACID", but sometimes it should be "BASE", which allows for more performance.
>
> What I am saying is that using BASE doesn't mean you are not using ACID at some level, or at some time (or more correctly using the AID properties), since these properties are used when writing data to stable storage. The big distinction seems to be at what point in time these properties are used, i.e. at what point in time the data is persisted. The big difference I see in BASE style designs is the ability for the application to work with data in memory without having to persist every update immediately, thus gaining a performance advantage at the cost of potential inconsistencies.
>
> Eric
>
> On Sun, Jul 19, 2009 at 7:31 PM, Peter Becker <peter.becker...@gmail.com> wrote:
>>
>> Eric Newcomer wrote:
>> > Yes, OK. I can agree it's a valid comparison at the application level, and a very important one.
>> >
>> > I don't equate ACID with RDBMS. ACID is an entertaining acronym for a set of properties that can be (and are) achieved using a wide variety of mechanisms. I agree a lot of BASE style implementations use non-SQL databases, or file systems. But this does not mean they are non-ACID.
>> Of course there are non-RDBMS, but ACID storage systems. Most modern VCS and things like JCR fall into that category.
>> >
>> > I think the updates are still atomic (A), isolated (I), and durable (D) - at least at some point, if not immediately. (Consistency is actually an application responsibility more than a storage system's responsibility.)
>> But the application has to rely on the storage to be consistent first. If the value of some property depends on how I query (e.g. which node in a cluster I pick), then my application will never be consistent.
>> >
>> > I would also say that any data storage system implements these properties in some way. Otherwise there are pretty big risks of ending up with partial results, interleaved (overwritten) updates, and lost data.
>> > Using SQL and RDBMS style products is definitely not the only way these properties can be achieved. I didn't mean to imply that at all.
>> But the idea behind BASE is that you don't need ACID all the time. The question is "what kind of guarantees do we need for application X?", the traditional answer (and the one you give here) is "ACID", but sometimes it should be "BASE", which allows for more performance.
>> >
>> > I think BigTable is used mostly for query analysis, isn't it?
>> AFAIK BigTable is the only storage you get if you use the Google App Engine.
>> > And of course inconsistencies in query results are not as big a problem as inconsistencies for updates. But what about the writes to stable storage (disk or flash memory) against which BigTable queries are performed? I would argue these follow AID, despite the fact they don't use an RDBMS.
>> Assuming a journaled filesystem they do, but is that relevant?
>> >
>> > I suppose my objection is somewhat picky, since I completely agree BASE style systems have a lot of advantages, and I would actually recommend that anyone with large IT systems understand what's going on there. I also think these concepts are among those that will enable the low-cost commodity data centers to be used in more types of transaction processing systems.
>> >
>> > But ACID does not equal RDBMS, nor does it equal 2PC. It is a set of properties that are used to evaluate the capabilities of storage systems.
>> I agree.
>> > I would say instead that BASE is really just another way to achieve the ACID properties, mostly by introducing asynchronicity between volatile and persistent storage devices.
>> I disagree since a BASE system is not guaranteed to be entirely consistent at any point in time. Although it is true if you assume changes to your data set of interest stop.
>> > The end goal however remains AID.
>> >
>> > Phil and I cover the fundamentals of ACID in our TP book, and in the second edition (which came out last month) we also cover many of the alternatives to classic commit protocols that are used to implement the properties, such as sagas, compensations, queuing, and replicated memory. We also cover the CAP theorem, but unfortunately for the timing of the book the entertaining BASE acronym was not yet widely adopted.
>> It is amusing, isn't it :-) I am not really sure the paper I cited had any other real contribution, but they deserve the credit just for coming up with the catchy name. I'm one of these weird people who think Martin Fowler's biggest contribution was being one of the inventors of the "POJO" term :-) Not that his books are bad, but that name outshines them.
>> > Although the "BASE" acronym isn't there, the concepts in it are covered, and we did research the papers and conferences on the topic - I suppose up until about Nov/Dec 2008 or so, when the final manuscript had to be submitted. But I think it's all there, starting from the principles, and ending with implementation examples such as EJB3, JPA, .NET Entities, REST/HTTP, etc.
>> >
>> > Because the focus of the book is on widely adopted current practice, we could not use product examples to illustrate all of the "BASE" concepts, since everyone seems to be implementing it in a different way, primarily using custom code, and products are only now starting to emerge around these concepts. Nati Shalom of GigaSpaces takes the position that the new generation of products is far enough along to be considered mainstream in his comments to my <a href="http://ericnewcomer.wordpress.com/2009/06/19/second-edition-of-tp-book-out-today/">blog post</a>.
>> AFAIK a lot of Amazon's storage is based on the idea, all of GAE is based on BigTable (which might imply GMail and others) and there are quite a few other large users of CouchDB. Sounds mainstream enough for me :-)
>>
>> Peter
>>
>>
>> >
>> > I don't think BASE concepts are very well integrated into products yet, certainly not sufficiently for use throughout the multiple tiers typically used in a large scale TP application, but hopefully it won't be long. However, as with any change as significant as the move toward BASE and eventual consistency models, it is likely to take a long time.
>> >
>> > Eric
>> >
>> > On Sat, Jul 18, 2009 at 6:56 PM, Peter Becker <peter.becker.de@gmail.com> wrote:
>> >
>> > While I agree with your description of what BASE is and that usually ACID will be used in the lower layers, I do think the BASE vs. ACID question makes sense as long as you apply it to the application layer, not the whole stack. Traditionally enterprise systems are built with ACID assumptions on the top layer. In many cases that has been replaced with BASE, which on that layer is an either-or decision. Choosing BASE on the top layer does not imply not having ACID anywhere, though.
>> >
>> > Depending on your replication mechanisms in the storage layer the BASE can go very deep. You seem to imply that all applications use a traditional RDBMS at the bottom, but that is not necessarily true. AFAIK BigTable, CouchDB and the like do not provide you with ACID transactions at all. Data written into one node will eventually appear on the others, but if your webserver hits two different backends it can get inconsistent data. If you build your stack on top of these databases, you cannot assume ACID anywhere.
>> >
>> > But it is important to make sure you ask the ACID vs.
>> > BASE question on every layer separately.
>> >
>> > Peter
>> >
>> >
>> > Eric Newcomer wrote:
>> > > Yes, BASE and ACID are different concepts, that's why I suggested a direct comparison isn't really accurate. BASE systems would use ACID for persistence to stable storage.
>> > >
>> > > To me the difference really seems to apply more to the application level relationship to persistence than the database's. The database is still going to use ACID transactions for updates to stable storage.
>> > >
>> > > The idea of BASE systems is to allow the application to update volatile storage and receive control back immediately, without waiting for the update to be written to stable storage. This opens the door to potential inconsistencies when the data is later written to stable storage, but improves latency.
>> > >
>> > > A lot of the system designs that fall into the BASE category also rely on replicated memory for failure handling and load balancing, but this opens another window to potential inconsistencies among replicas (since it's impossible to update all replicas simultaneously).
>> > >
>> > > The concept of "eventual consistency" says that at some point these kinds of inconsistencies will be reconciled, and there are a variety of techniques for doing so. None of them guarantee consistency of the stable storage, since there's always a window of time between the update to volatile storage and the update to persistent storage in which interleaving updates can occur, or volatile data can be lost. However, the risk of inconsistency, and the additional effort to resolve them "eventually" once they occur, is seen to be worth the improved latency for the majority of cases.
>> > >
>> > > The point I was trying to make is that even in the BASE style systems I'm familiar with, ACID transactions are used when the data is (eventually) persisted to stable storage. The concept of BASE just returns control to the application after the update to volatile storage, and doesn't wait for the additional time it takes to also perform the update to persistent storage, therefore reducing latency for the application/user.
>> > >
>> > > It's a classic trade-off, but its discussion seems to have created an oversimplification of positioning one against the other, as if BASE were a potential replacement for ACID, which I don't think it is. Good marketing maybe, but not very accurate.
>> > >
>> > > Eric
>> > >
>> > > On Fri, Jul 17, 2009 at 9:32 PM, Peter Becker <peter.becker.de@gmail.com> wrote:
>> > >
>> > >
>> > > Eric Newcomer wrote:
>> > > > ORM to me is like one of those impossible tasks, like automatically converting data types between Java and XML.
>> > > I think the latter is actually easier :-)
>> > > >
>> > > > I think EJB3 is a big improvement over EJB2 and JPA a big improvement over entity beans. We are in the middle of mapping JDBC and JPA to OSGi BTW and hopefully this will result in more pluggability for persistence providers.
>> > > >
>> > > > BASE is the kind of thing I was referring to in the earlier post in that it represents a persistence design based on a different set of assumptions. I would not really agree however with a characterization of BASE vs ACID, since even in the BASE style systems I'm aware of, ACID is still used by the databases when persistence happens.
>> > > > The difference seems much more about the decision and timing of persistence to stable storage than whether BASE is used in place of ACID. AFAIK ACID is still used - if what's meant is 2PC then that is probably a more correct comparison, i.e. BASE vs 2PC.
>> > > From what I understand BASE and ACID are different concepts. If it is BASE, it is not ACID -- it doesn't matter if something underneath uses ACID semantics. If you want ACID at the top, you have to control it all the way down. BASE is about giving up some of that control in favour of weaker assumptions. Once you did that, you lost ACID from that layer upwards.
>> > >
>> > > Here is the relevant paper: http://queue.acm.org/detail.cfm?id=1394128
>> > >
>> > > Peter
>> > >
>> > >
>> > > > On Fri, Jul 17, 2009 at 4:34 AM, Peter Becker <peter.becker.de@gmail.com> wrote:
>> > > >
>> > > >
>> > > > Rick wrote:
>> > > > > I think one of the reasons that relational databases are popular as compared to other solutions is that they map well to the theoretical tools, such as relational algebra/calculus.
>> > > > >
>> > > > My problem is that relational databases map to most of the theory only in theory. E.g. SQL does not map to relational algebra, it is more a "Based upon a true story" type of thing. I've done this rant a few times before (including on this forum), but one of the things I really miss is a true implementation of the relational algebra, which includes having a proper notion of domains (which could easily be mapped to OO-classes).
>> > > > > For an upcoming e-commerce project I suggested trying out CouchDB (as promoted by the posse) and sCouchDB (the Scala version of same?).... and a friend with an architectural leaning asked something along the lines of:
>> > > > >
>> > > > > "but can you guarantee atomicity?"
>> > > > >
>> > > > > which shut me up pretty quickly.
>> > > > >
>> > > > I believe the ACID vs. BASE question will become more dominant in the near future, though. I am somehow afraid that many projects will pick the BASE option when they really need ACID.
>> > > > > Disclaimer: I'm a fan of EJB 3.0
>> > > > I've used only JPA, which is really not too bad. They certainly seem to have learned from the experiences of other products in the area, which is unfortunately not that common with these standards. You still need to like ORM to like EJB3, though :-) I just find the ORM idea to be too much of a neither here nor there thing.
>> > > >
>> > > > Peter

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "The Java Posse" group.
To post to this group, send email to javaposse@googlegroups.com
To unsubscribe from this group, send email to javaposse+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/javaposse?hl=en
-~----------~----~----~----~------~----~------~--~---
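
For anyone who wants to poke at the trade-off argued over in this thread, here is a rough Java sketch of the "update volatile storage, return immediately, persist later" idea. It is only an illustration under stated assumptions: the class and method names (WriteBehindStore, flush, crash) are made up for this example and do not come from BigTable, CouchDB, or any other product mentioned above, the "stable" map merely stands in for disk, and persistence is triggered by an explicit flush() call rather than a background thread so the behaviour is easy to follow.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical write-behind store; all names here are illustrative only. */
class WriteBehindStore {
    // Volatile storage: updated synchronously, lost on a crash.
    private final Map<String, String> memory = new HashMap<>();
    // Stand-in for stable storage; a real system would write to disk
    // atomically and durably at this point.
    private final Map<String, String> stable = new HashMap<>();
    // Updates already acknowledged to the caller but not yet persisted.
    private final Queue<String[]> pending = new ArrayDeque<>();

    /** Updates memory and returns at once; persistence is deferred. */
    void put(String key, String value) {
        memory.put(key, value);
        pending.add(new String[] {key, value});
    }

    String readMemory(String key) {
        return memory.get(key);
    }

    String readStable(String key) {
        return stable.get(key);
    }

    /** Applies pending updates to stable storage in order; returns how many were applied. */
    int flush() {
        int applied = 0;
        String[] update;
        while ((update = pending.poll()) != null) {
            stable.put(update[0], update[1]);
            applied++;
        }
        return applied;
    }

    /** Simulates a crash: memory and unflushed updates are lost, then state is reloaded from stable storage. */
    void crash() {
        memory.clear();
        pending.clear();
        memory.putAll(stable);
    }
}
```

Between put() and flush() the two read methods can disagree, which is the window of inconsistency discussed above, and a crash() inside that window silently drops an update the caller believed had succeeded. Whether that is acceptable is exactly the "what kind of guarantees do we need for application X?" question.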