Re: HBase mention in VLDB keynote

Andrew Purtell Tue, 25 Aug 2009 09:40:01 -0700

> Can we write him to figure more on how evaluation was done?

This was one interaction with that group, maybe the only other one aside from a 
question about sizing memstore: 
http://osdir.com/ml/hbase-user-hadoop-apache/2009-07/msg00552.html 
Now I wonder if the eval was done via the REST gateway... A followup might be 
useful. If I run into someone from Yahoo Research here I'll ask. Otherwise we 
should try mailing them, yes.

> Should we try and get into VLDB next year?

We can certainly submit a candidate paper given a novel contribution of some 
kind which moves the state of the art forward. There are other venues besides 
VLDB that we can also consider. Regardless, I think one of us should attend
VLDB every year.

> Any thing else interesting at the conference?

Yes. 

ETH Zurich presented a system which tailors consistency to the needs of various 
data items -- "consistency rationing in the cloud: pay only when it matters" -- 
choosing eventual (session) consistency or pessimistic 2PC on demand according 
to a cost model, with good results. Made me think of possibilities with THBase. 
Also, I watched a demo of HIVE, something I hadn't seen to date. Their query 
planner and mapreduce scheduler are interesting in concept and in detail. We're 
looking at Cascading for batch analytics on top of HBase instead, but knowing 
more about alternatives is always good.

The Hadoop-y track is really tomorrow. 

Outside of direct relevance to things HBase I attended talks on aspects of data 
fusion, ETL, and complex event processing / stream processing, wearing my TM 
hat. Lots of good stuff here.

   - Andy

________________________________
From: Stack <saint....@gmail.com>
To: "hbase-user@hadoop.apache.org" <hbase-user@hadoop.apache.org>
Sent: Tuesday, August 25, 2009 4:47:57 PM
Subject: Re: HBase mention in VLDB keynote

The same fella did keynote at apachecon eu on a similar topic.  Then he talked 
mostly of Sherpa/pnuts yahoo tech.   In that presentation we got no mention.  
There the comparison strangely was to couchdb and perhaps Cassandra (iirc).

So, mention is an improvement (do you think the kick up the behind I rendered 
him after his amsterdam talk could have had anything to do with it?).

Can we write him to figure more on how evaluation was done?

Should we try and get into vldb next year?

Good stuff Andy.  Any thing else interesting at the conference?

Stack

On Aug 25, 2009, at 6:17 AM, Andrew Purtell <apurt...@apache.org> wrote:

> In this keynote address here at VLDB 2009 (http://vldb2009.org/?q=node/22) 
> Raghu Ramakrishnan, Yahoo! Research's Chief Scientist, made prominent mention 
> of HBase, much to my surprise (and later chagrin). This happened near the end 
> of the talk when a number of the new elastic/scalable/"nosql" storage systems 
> were discussed to make concrete some of the architectural and data model 
> points made earlier. The alternatives considered were Yahoo's PNUTS, sharded 
> MySQL, HBase, and Cassandra. I don't know what version of HBase was used 
> exactly but unfortunately the message was "not ready yet". Perhaps it was a 
> configuration or provisioning issue but HBase did not really survive the 
> evaluation, leading to short hyperbolic performance curves terminating on the 
> far left of the various graphs. This was quite disappointing to see as the 
> other alternatives were apparently successfully tested on what can be 
> presumed to be the same resources. It stands to reason there is opportunity
> for HBase to improve here if only we know what that is. It was also a little
> disappointing that it appears through a mailing list search that these issues
> were not brought to either hbase-dev@ or hbase-users@, only a minor question
> relating to the REST interface. Perhaps the community could have identified a
> specific configuration problem, recommended a correction for a
> deployment/provisioning error, or resolved a bug. To future evaluators of
> HBase, on behalf of the community I humbly request that you share your
> results, good or bad, so we can take the feedback, or the bug reports and
> their artifacts (logs, etc.) and improve our software.
> 
> At least, the story has already changed from what was presented today -- for 
> example, the multimaster architecture of 0.20 was not presented, rather the 
> older one (circa 0.19); and JG's/Ryan's performance test results for 0.20 
> stand as a contradiction. We should look into opportunities to produce a peer 
> reviewed positive contribution. I think we have opportunities to take some 
> novel approaches in the system itself and/or produce a novel vertical 
> contribution and 0.20 is a good substrate for that.
> 
> Though this was unfortunately a missed opportunity for a good showing for 
> HBase in particular, the keynote in general was a well formulated 
> introduction of the emerging area of "cloud scale" storage / "nosql" systems 
> to the largest elite gathering of database and data processing researchers in 
> the world. The presentation was importantly also a call for participation in 
> the future development and directions of the new and growing "nosql" 
> constellation. Such participation, whether it is specific involvement with 
> the HBase project or not, would be and is most welcome as the problem of
> serving data at very large scale under "cloud" constraints is an area of both
> significant challenge and significant promise. HBase, like other projects in
> this area, is at an early stage of development. These projects cover the use
> cases of their creators but, as answers to the larger set of problems, they
> are not -- that space is untapped and only waiting for creativity and effort.
> I think I can speak for HBase in particular: we welcome this and would be
> pleased to assist at every opportunity.
> 
>    - Andy
> 
>
