Hi Dalia,

I suggest running the test on a small sample of the table. With only a few rows you can count the matching rows by hand, and then see exactly where the results of scan and count differ.
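As a rough illustration of the by-hand check, here is a minimal sketch in plain Ruby (the language the shell is built on). The five rows, the row keys, and the helper names are all made up for the example; the hash is only a stand-in for a tiny test table, not an HBase call.

```ruby
# Hypothetical five-row sample: row key => value of info:diagnosis.
SAMPLE = {
  'p1' => 'cardiac',
  'p2' => 'cardiac',
  'p3' => 'renal',
  'p4' => 'renal',
  'p5' => 'renal'
}

# Rows a substring-filtered scan is expected to return:
# only those whose diagnosis contains the given substring.
def filtered_rows(table, substring)
  table.select { |_row, diagnosis| diagnosis.include?(substring) }
end

# Rows an unfiltered row count is expected to report: every row.
def total_rows(table)
  table.size
end

puts "expected from filtered scan: #{filtered_rows(SAMPLE, 'cardiac').size}"
puts "expected from plain count:   #{total_rows(SAMPLE)}"
```

If you put the same five rows into a small test table and run your scan and count commands against it, a count that reports the total number of rows instead of the number of "cardiac" rows would suggest that the filter is not being applied to the count.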
Best regards,
Andy

2012/12/24 Dalia Sobhy <dalia.mohso...@hotmail.com>

> Dear all,
>
> I have 50,000 row with diagnosis qualifier = "cardiac", and another 50,000
> rows with "renal".
>
> When I type this in Hbase shell,
>
> import org.apache.hadoop.hbase.filter.CompareFilter
> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> import org.apache.hadoop.hbase.filter.SubstringComparator
> import org.apache.hadoop.hbase.util.Bytes
>
> scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
>     SingleColumnValueFilter.new(Bytes.toBytes('info'),
>         Bytes.toBytes('diagnosis'),
>         CompareFilter::CompareOp.valueOf('EQUAL'),
>         SubstringComparator.new('cardiac'))}
>
> Output = 50,000 row
>
> import org.apache.hadoop.hbase.filter.CompareFilter
> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> import org.apache.hadoop.hbase.filter.SubstringComparator
> import org.apache.hadoop.hbase.util.Bytes
>
> count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
>     SingleColumnValueFilter.new(Bytes.toBytes('info'),
>         Bytes.toBytes('diagnosis'),
>         CompareFilter::CompareOp.valueOf('EQUAL'),
>         SubstringComparator.new('cardiac'))}
>
> Output = 100,000 row
>
> Even though I tried it using Hbase Java API, Aggregation Client Instance,
> and I enabled the Coprocessor aggregation for the table.
>
> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
>
> Also when measuring the improved performance on case of adding more nodes
> the operation takes the same time.
>
> So any advice please?
>
> I have been throughout all this mess from a couple of weeks
>
> Thanks,