Re: [maybe off-topic?] article: Solving Big Data Challenges for Enterprise Application Performance Management

Andrew Purtell Thu, 30 Aug 2012 09:41:43 -0700

Just to clarify: I mean experimenting with the approach taken in the Thrift
client work, not the use of Thrift in particular.


On Thursday, August 30, 2012, Andrew Purtell wrote:

> This paper could very well have benchmarked the relative performance of
> the YCSB drivers. Some takeaways for me here are:
>
>     - Cluster setup is too difficult still
>
>     - There are opportunities for autotuning that would make it easier for
> users to get it right the first time and for academics and casual
> benchmarkers alike to get a good result without becoming experts with HBase
> configuration
>
>     - The client library has been evolving toward fully async dispatch. We
> should focus on this, perhaps even consider reimplementing the sync client
> on a refactored async core, and look at making the Thrift-based work that
> FB contributed front and center, because then native clients are possible.
>
>     - Given the above client work, the YCSB HBase driver should have a
> rewrite.
>
> On Thu, Aug 30, 2012 at 4:49 PM, Dave Wang <d...@cloudera.com> wrote:
>
>> My reading of the paper is that they are actually not clear about whether
>> or not HMasters were deployed on datanodes.
>>
>> I'm going to guess that they just used default configurations for HBase
>> and YCSB, but the paper again is not specific enough.
>>
>> Why were they using 0.90.4 in 2012?  Would have been nice to see some of
>> the more recent work done in the area of performance.
>>
>> One thing the paper does touch on is the relative difficulty of standing
>> up the cluster, which has not changed since 0.90.4.  I think that's
>> definitely something that could be improved upon.
>>
>> - Dave
>>
>> On Thu, Aug 30, 2012 at 6:27 AM, Cristofer Weber
>> <cristofer.we...@neogrid.com> wrote:
>>
>> > Just read this article, "Solving Big Data Challenges for Enterprise
>> > Application Performance Management," published this month in Volume 5,
>> > No. 12 of Proceedings of the VLDB Endowment, where they measured 6
>> > different databases - Project Voldemort, Redis, HBase, Cassandra, MySQL
>> > Cluster and VoltDB - with YCSB on two different kinds of clusters,
>> > Memory-bound and Disk-bound, and I have doubts about the results for
>> > HBase since:
>> >
>> >
>> > *         HBase version was 0.90.4
>> >
>> > *         Master nodes were deployed together with data nodes
>> >
>> > *         They didn't report tuning parameters
>> >
>> > There's also a paragraph where they reported that HBase failed
>> > frequently in non-deterministic ways while running YCSB.
>> >
>> > My intention with this e-mail is to look for opinions from you, who are
>> > more experienced with HBase, on where this experiment's setup could be
>> > changed to improve read operations, since in this setup HBase did not
>> > perform as well as Cassandra and Project Voldemort.
>> >
>> > Here's the article:
>> > http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf and Volume 5
>> > home: http://vldb.org/pvldb/vol5.html
>> >
>> >
>> >
>> >
>>
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
>

-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
