Can you give a bit more detail, such as:

the release of HBase you're using
number of column families where slowdown is observed
size of cluster
release of hadoop you're using

Thanks

On Mon, Sep 29, 2014 at 9:43 AM, Nishanth S <nishanth.2...@gmail.com> wrote:

> Hey Ted,
>
> I was in the process of comparing the insert throughputs we discussed,
> using YCSB. What I found is that when I split the data into multiple
> column families, the insert throughput drops to half compared to
> persisting into a single column family. Do you think this is possible,
> or am I doing something wrong?
>
> -Nishan
>
> On Thu, Sep 25, 2014 at 11:56 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> > There should not be impact to hbase write performance for two column
> > families.
> >
> > Cheers
> >
> > On Thu, Sep 25, 2014 at 10:53 AM, Nishanth S <nishanth.2...@gmail.com>
> > wrote:
> >
> > > Thank you Ted. No, I do not plan to use bulk loading since the data
> > > is incremental in nature.
> > >
> > > On Thu, Sep 25, 2014 at 11:36 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > >
> > > > For #1, do you plan to use bulk load?
> > > >
> > > > For #3, take a look at HBASE-5416 which introduced essential column
> > > > family. In your query, you can designate the smaller column family
> > > > as essential column family where smaller columns are queried.
> > > >
> > > > Cheers
> > > >
> > > > On Thu, Sep 25, 2014 at 9:57 AM, Nishanth S <nishanth.2...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > This question may have been asked many times, but I would really
> > > > > appreciate it if someone can help me on how to go about this.
> > > > >
> > > > > Currently my hbase table consists of about 10 columns per row,
> > > > > which in total have an average size of 5K. The bulk of the size is
> > > > > held by one particular column (more than 4K). Would it help to
> > > > > move this column out to a different column family when we do
> > > > > reads? There are cases where we just need to access the smaller
> > > > > columns, and there is another set of use cases where you need both
> > > > > (the data in the smaller columns and this huge data chunk). In
> > > > > general I am trying to answer the below questions in this
> > > > > scenario.
> > > > >
> > > > > 1. Would separating into multiple column families affect hbase
> > > > > write performance?
> > > > >
> > > > > 2. How would it affect my read performance considering both the
> > > > > read cases?
> > > > >
> > > > > 3. Is there any advantage that I am gaining by separating into
> > > > > multiple cfs?
> > > > >
> > > > > I would really appreciate it if anyone could point me in the right
> > > > > direction.
> > > > >
> > > > > -Thanks
> > > > > Nishan
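
For the read-side question in the original message, a back-of-the-envelope sketch may help. It uses only the sizes quoted there (about 5K per row, of which more than 4K sits in one column) and assumes, as a simplification, that a read restricted to one column family touches only that family's store files. Key overhead, block granularity, and caching are ignored, and the function name and row count are made up for illustration.

```python
# Rough per-row read cost, using the figures from the original question.
# Assumption: a read touches all data of every column family it covers,
# and nothing from families it does not request.
ROW_BYTES = 5 * 1024          # average row size (~5K)
LARGE_COL_BYTES = 4 * 1024    # the one large column ("more than 4K")
SMALL_COLS_BYTES = ROW_BYTES - LARGE_COL_BYTES


def bytes_scanned(num_rows, split_families, need_large_column):
    """Estimate bytes read from store files for a scan over num_rows rows."""
    if split_families and not need_large_column:
        # Only the small column family's store files are read.
        return num_rows * SMALL_COLS_BYTES
    # Single family, or both families requested: the whole row is read.
    return num_rows * ROW_BYTES


rows = 1_000_000  # hypothetical row count, for illustration only
single = bytes_scanned(rows, split_families=False, need_large_column=False)
split = bytes_scanned(rows, split_families=True, need_large_column=False)
print(single / split)  # ratio of bytes read: one family vs. two families
```

Under these assumptions the small-column reads move about a fifth of the data once the large column lives in its own family, while reads that need both see no saving. It says nothing about the write-side slowdown reported above, which would need the cluster details requested at the top of this message.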