RE: Hbase Count Aggregate Function

Dalia Sobhy Tue, 25 Dec 2012 09:55:37 -0800

Is there a problem with using an "int" value as the ID (row key)?


> Date: Tue, 25 Dec 2012 22:44:00 +0530
> Subject: Re: Hbase Count Aggregate Function
> From: ramkrishna.s.vasude...@gmail.com
> To: user@hbase.apache.org
> 
> @Dalia
> 
> I think the aggregation client should work with what you have passed.  What
> I meant in the previous mail was about table.count(); now I am talking about
> AggregationClient.
> {code}
> if (scan.getFilter() == null && qualifier == null)
>       scan.setFilter(new FirstKeyOnlyFilter());
> {code}
> 
> Since you have passed a filter, it should behave the way the
> SingleColumnValueFilter (SCVF) normally does.  I can check this when I have
> some free time (maybe tomorrow).
> If it does not, you can raise a bug.  If it turns out to be fine we can close
> it; otherwise it is better that we fix it.
> I understand your urgency on this.
> 
> Regards
> Ram
> 
> 
> 
> 
> 
> On Tue, Dec 25, 2012 at 10:27 PM, <yuzhih...@gmail.com> wrote:
> 
> > The rowCount method accepts a Scan object, to which you can attach your
> > custom filter.
> >
> > Cheers
> >
> >
> >
> > On Dec 25, 2012, at 8:42 AM, Dalia Sobhy <dalia.mohso...@hotmail.com>
> > wrote:
> >
> > >
> > > Do you mean I should implement a new rowCount method in the
> > > AggregationClient class?
> > >
> > > I do not understand. Could you illustrate with a code sample, Ram?
> > >
> > >>> Date: Tue, 25 Dec 2012 00:21:14 +0530
> > >>> Subject: Re: Hbase Count Aggregate Function
> > >>> From: ramkrishna.s.vasude...@gmail.com
> > >>> To: user@hbase.apache.org
> > >>>
> > >>> Hi
> > >>> You could implement a custom filter similar to
> > >>> FirstKeyOnlyFilter.
> > >>> Implement the filterKeyValue method so that it matches your KeyValue
> > >>> (the specific qualifier that you are looking for).
> > >>>
> > >>> Deploy it in your cluster.  It should work.
> > >>>
> > >>> Regards
> > >>> Ram
> > >>>
> > >>> On Mon, Dec 24, 2012 at 10:35 PM, Dalia Sobhy <dalia.mohso...@hotmail.com> wrote:
> > >>>
> > >>>>
> > >>>> So do you have a suggestion for how to make the filter work?
> > >>>>
> > >>>>> Date: Mon, 24 Dec 2012 22:22:49 +0530
> > >>>>> Subject: Re: Hbase Count Aggregate Function
> > >>>>> From: ramkrishna.s.vasude...@gmail.com
> > >>>>> To: user@hbase.apache.org
> > >>>>>
> > >>>>> Okay, looking at the shell script and the code, I think that when you use
> > >>>>> this counter, the user's filter is not taken into account.
> > >>>>> It adds a FirstKeyOnlyFilter and proceeds with the scan. :(
> > >>>>>
> > >>>>> Regards
> > >>>>> Ram
> > >>>>>
> > >>>>> On Mon, Dec 24, 2012 at 10:11 PM, Dalia Sobhy <dalia.mohso...@hotmail.com> wrote:
> > >>>>>
> > >>>>>>
> > >>>>>> Yes, scan gives the correct number of rows, while count returns the
> > >>>>>> total number of rows.
> > >>>>>>
> > >>>>>> Both use the same filter. I even tried it through the Java API, using
> > >>>>>> the rowCount method:
> > >>>>>>
> > >>>>>> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan);
> > >>>>>>
> > >>>>>> I get the total number of rows, not the number of filtered rows.
> > >>>>>>
> > >>>>>> Any idea?
> > >>>>>>
> > >>>>>> Thanks Ram :)
> > >>>>>>
> > >>>>>>> Date: Mon, 24 Dec 2012 21:57:54 +0530
> > >>>>>>> Subject: Re: Hbase Count Aggregate Function
> > >>>>>>> From: ramkrishna.s.vasude...@gmail.com
> > >>>>>>> To: user@hbase.apache.org
> > >>>>>>>
> > >>>>>>> So you find that a scan with a filter and a count with the same filter
> > >>>>>>> give you different results?
> > >>>>>>>
> > >>>>>>> Regards
> > >>>>>>> Ram
> > >>>>>>>
> > >>>>>>> On Mon, Dec 24, 2012 at 8:33 PM, Dalia Sobhy <dalia.mohso...@hotmail.com> wrote:
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> Dear all,
> > >>>>>>>>
> > >>>>>>>> I have 50,000 rows with the diagnosis qualifier = "cardiac", and another
> > >>>>>>>> 50,000 rows with "renal".
> > >>>>>>>>
> > >>>>>>>> When I type this in the HBase shell:
> > >>>>>>>>
> > >>>>>>>> import org.apache.hadoop.hbase.filter.CompareFilter
> > >>>>>>>> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > >>>>>>>> import org.apache.hadoop.hbase.filter.SubstringComparator
> > >>>>>>>> import org.apache.hadoop.hbase.util.Bytes
> > >>>>>>>>
> > >>>>>>>> scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> > >>>>>>>>    SingleColumnValueFilter.new(Bytes.toBytes('info'),
> > >>>>>>>>         Bytes.toBytes('diagnosis'),
> > >>>>>>>>         CompareFilter::CompareOp.valueOf('EQUAL'),
> > >>>>>>>>         SubstringComparator.new('cardiac'))}
> > >>>>>>>>
> > >>>>>>>> Output = 50,000 rows
> > >>>>>>>>
> > >>>>>>>> import org.apache.hadoop.hbase.filter.CompareFilter
> > >>>>>>>> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> > >>>>>>>> import org.apache.hadoop.hbase.filter.SubstringComparator
> > >>>>>>>> import org.apache.hadoop.hbase.util.Bytes
> > >>>>>>>>
> > >>>>>>>> count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
> > >>>>>>>>    SingleColumnValueFilter.new(Bytes.toBytes('info'),
> > >>>>>>>>         Bytes.toBytes('diagnosis'),
> > >>>>>>>>         CompareFilter::CompareOp.valueOf('EQUAL'),
> > >>>>>>>>         SubstringComparator.new('cardiac'))}
> > >>>>>>>> Output = 100,000 rows
> > >>>>>>>>
> > >>>>>>>> I even tried it with the HBase Java API, using an AggregationClient
> > >>>>>>>> instance, after enabling the aggregation coprocessor for the table:
> > >>>>>>>> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
> > >>>>>>>>
> > >>>>>>>> Also, when I measure performance after adding more nodes, the operation
> > >>>>>>>> takes the same time.
> > >>>>>>>>
> > >>>>>>>> So any advice please?
> > >>>>>>>>
> > >>>>>>>> I have been stuck in this mess for a couple of weeks.
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>
> > >>>>>>
> > >>>>
> > >>>>
> > >>
> > >
> >
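
The two outputs reported in this thread (50,000 rows from scan, 100,000 rows
from count) can be reproduced with a small model that does not touch HBase at
all. The sketch below is only an illustration under stated assumptions:
FilteredCountSketch, buildDiagnoses and countRows are hypothetical names, the
"table" is an in-memory list, and a substring test stands in for the
SingleColumnValueFilter with a SubstringComparator. It is not the HBase
implementation; it only shows how a counter that drops the caller's filter
ends up counting every row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class FilteredCountSketch {

    // Builds a toy table: one diagnosis value per row,
    // perValue rows of "cardiac" and perValue rows of "renal".
    static List<String> buildDiagnoses(int perValue) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < perValue; i++) {
            rows.add("cardiac");
            rows.add("renal");
        }
        return rows;
    }

    // Counts rows, applying the caller's filter only when one is supplied.
    // Passing null models a counter that ignores the filter (assumption:
    // this mirrors the behaviour described in the thread, nothing more).
    static long countRows(List<String> rows, Predicate<String> filter) {
        long count = 0;
        for (String diagnosis : rows) {
            if (filter == null || filter.test(diagnosis)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> rows = buildDiagnoses(50_000);
        // Stand-in for the substring match on info:diagnosis.
        Predicate<String> containsCardiac = d -> d.contains("cardiac");
        System.out.println("filter honoured: " + countRows(rows, containsCardiac));
        System.out.println("filter dropped:  " + countRows(rows, null));
    }
}
```

With the filter honoured the model counts 50,000 rows; with the filter dropped
it counts all 100,000, which matches the symptom described for count above.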
