Asaf,

The heap barrier is something of a legend :) You can ask 10 different HBase committers what they think the max heap is and get 10 different answers. This is my take on heap sizes from the many clusters I have dealt with:

8GB  -> Standard heap size; tends to run fine without any tuning
12GB -> Needs some TLC with regard to JVM tuning if your workload tends to cause churn (usually blockcache)
16GB -> GC tuning is a must, and now we need to start looking into MSLAB and ZK timeouts
20GB -> Same as 16GB with regard to tuning, but we tend to need to raise the ZK timeout a little higher
32GB -> We do have a couple of people running this high, but the pain outweighs the gains (IMHO)
64GB -> Let me know how it goes :)

On Tue, Apr 30, 2013 at 4:07 AM, Andrew Purtell <apurt...@apache.org> wrote:

> I don't wish to be rude, but you are making odd claims as fact as
> "mentioned in a couple of posts". It will be difficult to have a serious
> conversation. I encourage you to test your hypotheses and let us know if in
> fact there is a JVM "heap barrier" (and where it may be).
>
> On Monday, April 29, 2013, Asaf Mesika wrote:
>
> > I think for Phoenix truly to succeed, it's need HBase to break the JVM
> > Heap barrier of 12G as I saw mentioned in couple of posts. since Lots of
> > analytics queries utilize memory, thus since its memory is shared with
> > HBase, there's so much you can do on 12GB heap. On the other hand, if
> > Phoenix was implemented outside HBase on the same machine (like Drill or
> > Impala is doing), you can have 60GB for this process, running many OLAP
> > queries in parallel, utilizing the same data set.
> >
> > On Mon, Apr 29, 2013 at 9:08 PM, Andrew Purtell <apurt...@apache.org> wrote:
> >
> > > > HBase is not really intended for heavy data crunching
> > >
> > > Yes it is. This is why we have first class MapReduce integration and
> > > optimized scanners.
> > >
> > > Recent versions, like 0.94, also do pretty well with the 'O' part of OLAP.
> > >
> > > Urban Airship's Datacube is an example of a successful OLAP project
> > > implemented on HBase: http://github.com/urbanairship/datacube
> > >
> > > "Urban Airship uses the datacube project to support its analytics stack for
> > > mobile apps. We handle about ~10K events per second per node."
> > >
> > > Also there is Adobe's SaasBase:
> > > http://www.slideshare.net/clehene/hbase-and-hadoop-at-adobe
> > >
> > > Etc.
> > >
> > > Where an HBase OLAP application will differ tremendously from a traditional
> > > data warehouse is of course in the interface to the datastore. You have to
> > > design and speak in the language of the HBase API, though Phoenix
> > > (https://github.com/forcedotcom/phoenix) is changing that.
> > >
> > > On Sun, Apr 28, 2013 at 10:21 PM, anil gupta <anilgupt...@gmail.com> wrote:
> > >
> > > > Hi Kiran,
> > > >
> > > > In HBase the data is denormalized but at the core HBase is KeyValue based
> > > > database meant for lookups or queries that expect response in milliseconds.
> > > > OLAP i.e. data warehouse usually involves heavy data crunching. HBase is
> > > > not really intended for heavy data crunching. If you want to just store
> > > > denormalized data and do simple queries then HBase is good. For OLAP kind
> > > > of stuff, you can make HBase work but IMO you will be better off using
> > > > Hive for data warehousing.
> > > >
> > > > HTH,
> > > > Anil Gupta
> > > >
> > > > On Sun, Apr 28, 2013 at 8:39 PM, Kiran <kiranvk2...@gmail.com> wrote:
> > > >
> > > > > But in HBase data can be said to be in denormalised state as the methodology
> > > > > used for storage is a (column family:column) based flexible schema. Also,
> > > > > from Google's big table paper it is evident that HBase is capable of doing
> > > > > OLAP. So where does the difference lie?
> > > > >
> > > > > --
> > > > > View this message in context:
> > > > > http://apache-hbase.679495.n3.nabble.com/HBase-and-Datawarehouse-tp4043172p4043216.html
> > > > > Sent from the HBase User mailing list archive at Nabble.com.
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > (via Tom White)
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)

--
Kevin O'Dell
Systems Engineer, Cloudera