Hi Dalia,

I think you can take a small sample of the table and run the test on it.
Because you can count the rows by hand, you will see how the results of
scan and count differ.
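
To make the idea concrete, here is a rough sketch in plain Ruby. It is only
an in-memory stand-in for a tiny sample, not HBase shell or Java API calls,
and the row keys, values, and helper names (filtered_rows, total_rows) are
made up for illustration. It compares a pass that applies a substring filter
with a plain row count that applies no filter.

```ruby
# In-memory stand-in for a tiny 'patient' sample; these are NOT HBase API calls.
# Row keys and diagnosis values are hypothetical, chosen to be countable by hand.
SAMPLE = {
  'row1' => { 'info:diagnosis' => 'cardiac' },
  'row2' => { 'info:diagnosis' => 'renal' },
  'row3' => { 'info:diagnosis' => 'cardiac arrest' },
  'row4' => { 'info:diagnosis' => 'renal' },
  'row5' => { 'info:diagnosis' => 'cardiac' },
  'row6' => { 'info:diagnosis' => 'renal' }
}.freeze

# Mimics a filtered scan: keep only rows whose column value contains the substring.
def filtered_rows(table, column, substring)
  table.select { |_key, cols| cols[column].to_s.include?(substring) }
end

# Mimics a plain row count that looks at every row and applies no filter.
def total_rows(table)
  table.size
end

matching = filtered_rows(SAMPLE, 'info:diagnosis', 'cardiac').size
total = total_rows(SAMPLE)
puts "filtered: #{matching}, unfiltered: #{total}"
```

By hand, this sample has 3 rows containing "cardiac" out of 6 rows in total.
If count on a real sample table reports the total rather than the number of
matches, one possibility worth checking is that count is not applying the
FILTER option at all; I have not confirmed that for your HBase version.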

Best regards,
Andy

2012/12/24 Dalia Sobhy <dalia.mohso...@hotmail.com>

>
> Dear all,
>
> I have 50,000 rows with diagnosis qualifier = "cardiac", and another 50,000
> rows with "renal".
>
> When I type this in the HBase shell,
>
> import org.apache.hadoop.hbase.filter.CompareFilter
> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> import org.apache.hadoop.hbase.filter.SubstringComparator
> import org.apache.hadoop.hbase.util.Bytes
>
> scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
>     SingleColumnValueFilter.new(Bytes.toBytes('info'),
>          Bytes.toBytes('diagnosis'),
>          CompareFilter::CompareOp.valueOf('EQUAL'),
>          SubstringComparator.new('cardiac'))}
>
> Output = 50,000 rows
>
> import org.apache.hadoop.hbase.filter.CompareFilter
> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> import org.apache.hadoop.hbase.filter.SubstringComparator
> import org.apache.hadoop.hbase.util.Bytes
>
> count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
>     SingleColumnValueFilter.new(Bytes.toBytes('info'),
>          Bytes.toBytes('diagnosis'),
>          CompareFilter::CompareOp.valueOf('EQUAL'),
>          SubstringComparator.new('cardiac'))}
> Output = 100,000 rows
>
> I also tried it using the HBase Java API with an AggregationClient
> instance, and I enabled the aggregation coprocessor for the table:
> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
>
> Also, when I measure performance after adding more nodes, the operation
> takes the same time.
>
> So any advice please?
>
> I have been struggling with this mess for a couple of weeks.
>
> Thanks,
>
>
>
>
