The flaws in the paper are insanely obvious once you look for them:

- Their solution doesn't run on disk. Many things get faster when you
restrict yourself to RAM/flash.
- Their solution doesn't scale! It looks like shared-nothing sharding
with global transaction ordering and no internal locks.

Or am I missing something big here?

I generally find it tiresome when people bash BigTable while their
"awesome" thing doesn't scale to multi-PB databases. It reminds me of
that "time for an architectural rewrite" argument, which was essentially
"if you do everything in one thread per CPU you don't need locks and are
faster". This looks like the same idea, as far as I can tell from
skimming the paper.
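
To make that reading concrete, here is a minimal sketch of what "shared-nothing sharding with global transaction ordering and no internal locks" could look like. It is an illustration of the idea as described above, not the paper's actual design: the names `Sequencer`, `Partition`, `partition_for`, and `replay` are all hypothetical, and everything runs in one process for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One shard: owns its keys outright and is only touched by one thread."""
    data: dict = field(default_factory=dict)

    def apply(self, key, delta):
        # No lock: nothing else ever reads or writes this dict.
        self.data[key] = self.data.get(key, 0) + delta


class Sequencer:
    """Hypothetical global orderer: every transaction gets a sequence number."""

    def __init__(self):
        self.next_seq = 0
        self.log = []

    def submit(self, writes):
        seq = self.next_seq
        self.next_seq += 1
        self.log.append((seq, writes))
        return seq


def partition_for(key, n_partitions):
    # Stable routing; Python's built-in hash() is salted per process for str.
    return sum(key.encode()) % n_partitions


def replay(log, n_partitions):
    """Apply the log in global order; each write goes to exactly one shard."""
    partitions = [Partition() for _ in range(n_partitions)]
    for _seq, writes in sorted(log, key=lambda entry: entry[0]):
        for key, delta in writes:
            partitions[partition_for(key, n_partitions)].apply(key, delta)
    return partitions


sequencer = Sequencer()
sequencer.submit([("a", 5), ("b", -5)])  # one multi-key transaction
sequencer.submit([("a", 1)])
replica_1 = replay(sequencer.log, n_partitions=2)
replica_2 = replay(sequencer.log, n_partitions=2)
```

Both replicas end up in the same state because they apply the same log in the same order, and no shard ever needs a lock. The catch is that every transaction has to pass through the single `Sequencer`, which is the scaling concern raised above.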

On Thu, Sep 2, 2010 at 12:36 PM, Andrew Purtell <apurt...@apache.org> wrote:
> I've tried to post the below comment twice at
>
>    The problems with ACID, and how to fix them without going NoSQL
>    
> http://dbmsmusings.blogspot.com/2010/08/problems-with-acid-and-how-to-fix-them.html
>
> For whatever reason, it has appeared in the comments section from my 
> perspective briefly twice and then disappeared twice, so I will just post it 
> here, because HBase is mentioned in the article a few times, and ... well, 
> just read. :-)
>
>>>>
>
> Many earlier comments have covered much of what I would say. However, nobody 
> to date has raised an objection to the mildly offensive contention that "the 
> NoSQL decision to give up on ACID is the lazy solution to these scalability 
> and replication issues." Possibly this was not meant in the pejorative sense, 
> but it reads that way. I would argue the correct term of art here is 
> pragmatism, not laziness.
>
> I am a contributor to the HBase project. HBase is an open source 
> implementation of the BigTable architecture. Indeed our system does scale out 
> by substantially relaxing the scope of ACID guarantees. But it is a gross 
> generalization to suggest "NoSQL" is "NoACID", and somehow lazy in the 
> pejorative sense, and this mars the argument of the authors. HBase in 
> particular provides durability and row-level atomicity (I agree the row is a 
> convenient partition), and favors strong consistency in its design 
> choices. In this regard, I would also like to bring to your attention that 
> the authors made an error describing the scope of transactional atomicity 
> available in BigTable -- the scope is actually the row, not each individual 
> KV.
>
> Also, HBase in particular is a big project with several interesting 
> design/research directions and so does not reduce to a convenient stereotype: 
> a transactional layer that provides global ACID properties at user option 
> (that does not scale out like the underlying system but is nonetheless 
> available), exploration of notions of referential integrity, even 
> consideration of optional relaxed consistency (read replicas) in the other 
> direction.
>
> Back to the matter of pragmatism: While it is likely most structured data 
> store users are not building systems on the scale of a globally distributed 
> search engine, actually that is not too far off the mark for the design 
> targets of some HBase installations. We indeed do need to work with very 
> large mutating data sets today and nothing in the manner of a traditional 
> relational database system is up to the task. The discussion here, while 
> intriguing, is also rendered fairly academic by the "horrible" performance if 
> spinning media is used. Flash will not be competitive with spinning media at 
> high tera- or peta-scale for at least several years yet. Other commenters 
> have also noticed apparent bottlenecks in the presented design which suggest 
> a high scale implementation will be problematic.
>
> Anyway, it is my belief we are attacking the same set of problems but are 
> starting from opposite ends of a continuum and, ultimately, we shall 
> meet up somewhere in the middle.
>
> September 2, 2010 10:55 AM
>
> <<<
>
>   - Andy
>
>
>
>
>
>
