Hey Ted,

I was comparing the insert throughputs we discussed, using YCSB. What I found is that when I split the data into multiple column families, the insert throughput drops to half of what I get when persisting into a single column family. Do you think this is possible, or am I doing something wrong?
-Nishan

On Thu, Sep 25, 2014 at 11:56 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> There should not be impact to hbase write performance for two column
> families.
>
> Cheers
>
> On Thu, Sep 25, 2014 at 10:53 AM, Nishanth S <nishanth.2...@gmail.com>
> wrote:
>
> > Thank you Ted. No, I do not plan to use bulk loading since the data is
> > incremental in nature.
> >
> > On Thu, Sep 25, 2014 at 11:36 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > For #1, do you plan to use bulk load?
> > >
> > > For #3, take a look at HBASE-5416, which introduced essential column
> > > family. In your query, you can designate the smaller column family as
> > > essential column family where smaller columns are queried.
> > >
> > > Cheers
> > >
> > > On Thu, Sep 25, 2014 at 9:57 AM, Nishanth S <nishanth.2...@gmail.com>
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > This question may have been asked many times, but I would really
> > > > appreciate it if someone could help me on how to go about this.
> > > >
> > > > Currently my hbase table consists of about 10 columns per row, which
> > > > in total have an average size of 5K. Most of that size is held by one
> > > > particular column (more than 4K). Would it help to move this column
> > > > out to a different column family when we do reads? There are cases
> > > > where we just need to access the smaller columns, and there is
> > > > another set of use cases where you need both (the data in the smaller
> > > > columns and this huge data chunk). In general I am trying to answer
> > > > the questions below in this scenario.
> > > >
> > > > 1. Would separating into multiple column families affect hbase write
> > > > performance?
> > > >
> > > > 2. How would it affect my read performance, considering both read
> > > > cases?
> > > >
> > > > 3. Is there any advantage that I am gaining by separating into
> > > > multiple cfs?
> > > >
> > > > I would really appreciate it if anyone could point me in the right
> > > > direction.
> > > >
> > > > -Thanks
> > > > Nishan
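
For the read-side question in the original post (#2), a back-of-the-envelope estimate may help frame the trade-off. The sketch below only does arithmetic on the sizes quoted in the thread (about 5K per row, with one column of more than 4K). It assumes that a scan restricted to the small column family reads only that family's store files, and it ignores per-cell key overhead, compression and the block cache. The function name and the exact 1K/4K split are illustrative assumptions, not measurements.

```python
# Rough estimate of value bytes read per scan, using the sizes quoted in the thread.
# Assumptions: per-cell key overhead, compression and block cache are ignored,
# and the 1 KB / 4 KB split between small and large columns is illustrative.

ROW_BYTES = 5 * 1024                            # average row size from the thread (~5K)
LARGE_COL_BYTES = 4 * 1024                      # the one large column ("more than 4K")
SMALL_COLS_BYTES = ROW_BYTES - LARGE_COL_BYTES  # the remaining small columns combined


def bytes_scanned(rows, need_large_column, split_families):
    """Estimate value bytes read when scanning `rows` rows."""
    if split_families and not need_large_column:
        # Only the small family's store files have to be read.
        return rows * SMALL_COLS_BYTES
    # Single family, or both families requested: the whole row is read.
    return rows * ROW_BYTES


if __name__ == "__main__":
    rows = 1_000_000
    for split in (False, True):
        for need_large in (False, True):
            mb = bytes_scanned(rows, need_large, split) / (1024 * 1024)
            print(f"split={split} need_large={need_large} ~{mb:,.0f} MB")
```

Under these assumptions, only the "small columns only" read case benefits from the split (roughly a fifth of the data is read). Reads that need the large column touch the same volume either way, and this estimate says nothing about the write-side throughput drop reported above.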