Dear Yong,

How do I distribute my data in the cluster? Note that I am using Cloudera Manager 4.1.
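Yong's suggestion below about row-key design can be sketched in code. One common scheme is to salt the row key with a hash-derived prefix so consecutive writes land in different regions instead of one hot region server. This is a minimal sketch only; the bucket count, the `N|` key format, and the helper name are illustrative choices, not anything from this thread:

```java
// Sketch: salt each row key with a hash-based bucket prefix so writes
// spread across regions. Readers must strip the "N|" prefix (or run one
// scan per bucket) when reading back. Illustrative only.
public class SaltedKey {
    static final int N_BUCKETS = 10; // e.g. roughly one bucket per region

    static String salt(String rowKey) {
        // mod before abs avoids the Integer.MIN_VALUE overflow corner case
        int bucket = Math.abs(rowKey.hashCode() % N_BUCKETS);
        return bucket + "|" + rowKey;
    }

    public static void main(String[] args) {
        System.out.println(salt("patient-00001"));
        System.out.println(salt("patient-00002"));
    }
}
```

The salt is deterministic, so the same logical key always maps to the same bucket and point gets remain possible.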
Thanks in advance :D

> Date: Fri, 28 Dec 2012 20:38:22 +0100
> Subject: Re: Hbase Question
> From: yongyong...@gmail.com
> To: user@hbase.apache.org
>
> I think you should take a look at your row-key design and distribute your
> data evenly across your cluster. As you mentioned, even after you added
> more nodes there was no improvement in performance; maybe one node is a
> hot spot while the other nodes have no work to do.
>
> Regards,
>
> Yong
>
> On Tue, Dec 25, 2012 at 3:31 AM, 周梦想 <abloz...@gmail.com> wrote:
> > Hi Dalia,
> >
> > I think you can make a small sample of the table to do the test; then
> > you'll see the difference between scan and count, because you can
> > count it by hand.
> >
> > Best regards,
> > Andy
> >
> > 2012/12/24 Dalia Sobhy <dalia.mohso...@hotmail.com>
> >
> >> Dear all,
> >>
> >> I have 50,000 rows with diagnosis qualifier "cardiac", and another
> >> 50,000 rows with "renal".
> >>
> >> When I type this in the HBase shell:
> >>
> >> import org.apache.hadoop.hbase.filter.CompareFilter
> >> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> >> import org.apache.hadoop.hbase.filter.SubstringComparator
> >> import org.apache.hadoop.hbase.util.Bytes
> >>
> >> scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> >>   SingleColumnValueFilter.new(Bytes.toBytes('info'),
> >>     Bytes.toBytes('diagnosis'),
> >>     CompareFilter::CompareOp.valueOf('EQUAL'),
> >>     SubstringComparator.new('cardiac')) }
> >>
> >> Output = 50,000 rows
> >>
> >> count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> >>   SingleColumnValueFilter.new(Bytes.toBytes('info'),
> >>     Bytes.toBytes('diagnosis'),
> >>     CompareFilter::CompareOp.valueOf('EQUAL'),
> >>     SubstringComparator.new('cardiac')) }
> >>
> >> Output = 100,000 rows
> >>
> >> I also tried it using the HBase Java API with an AggregationClient
> >> instance, and I enabled the Aggregation coprocessor for the table:
> >>
> >> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
> >>
> >> Also, when measuring performance after adding more nodes, the
> >> operation takes the same amount of time.
> >>
> >> So any advice, please? I have been struggling with this for a couple
> >> of weeks.
> >>
> >> Thanks,
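On distributing the data: if row keys are salted with a single-digit prefix (0–9), the table can also be pre-split at creation time so that each salt bucket starts out in its own region, rather than waiting for regions to split under load. Below is a minimal sketch of computing the split keys only; as I understand it, in the 0.92/0.94 Java API these would then be passed to `HBaseAdmin.createTable(descriptor, splitKeys)`. The bucket count and single-digit scheme are assumptions for illustration:

```java
// Sketch: generate split keys "1".."9" so a table pre-splits into 10
// regions, one per single-digit salt bucket. For N buckets there are
// N-1 split points (the first region covers everything before "1").
public class SplitKeys {
    static byte[][] splitKeys(int buckets) {
        byte[][] splits = new byte[buckets - 1][];
        for (int i = 1; i < buckets; i++) {
            splits[i - 1] = String.valueOf(i).getBytes();
        }
        return splits;
    }

    public static void main(String[] args) {
        for (byte[] s : splitKeys(10)) {
            System.out.println(new String(s));
        }
    }
}
```

With the regions pre-split this way, evenly salted writes should spread across all region servers from the start, which is the behavior the thread is looking for when adding nodes.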