Dear all, I have 50,000 rows with diagnosis qualifier = "cardiac", and another 50,000 rows with "renal".
When I run this scan in the HBase shell:

    import org.apache.hadoop.hbase.filter.CompareFilter
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
    import org.apache.hadoop.hbase.filter.SubstringComparator
    import org.apache.hadoop.hbase.util.Bytes
    scan 'patient', { COLUMNS => "info:diagnosis",
      FILTER => SingleColumnValueFilter.new(Bytes.toBytes('info'),
        Bytes.toBytes('diagnosis'),
        CompareFilter::CompareOp.valueOf('EQUAL'),
        SubstringComparator.new('cardiac'))}

Output = 50,000 rows

But when I run count with the same filter:

    import org.apache.hadoop.hbase.filter.CompareFilter
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
    import org.apache.hadoop.hbase.filter.SubstringComparator
    import org.apache.hadoop.hbase.util.Bytes
    count 'patient', { COLUMNS => "info:diagnosis",
      FILTER => SingleColumnValueFilter.new(Bytes.toBytes('info'),
        Bytes.toBytes('diagnosis'),
        CompareFilter::CompareOp.valueOf('EQUAL'),
        SubstringComparator.new('cardiac'))}

Output = 100,000 rows

I even tried it using the HBase Java API, with an AggregationClient instance and the aggregation coprocessor enabled for the table:

    rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)

Also, when I measure performance after adding more nodes, the operation takes the same time, with no improvement.

Any advice, please? I have been struggling with this for a couple of weeks.

Thanks,