[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions

[email protected] (JIRA) Fri, 15 Apr 2011 12:30:52 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020405#comment-13020405
 ]

[email protected] commented on HBASE-1512:
------------------------------------------------------

bq.  On 2011-04-15 19:06:58, Ted Yu wrote:
bq.  > 
/src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java,
 line 84
bq.  > <https://reviews.apache.org/r/585/diff/4/?file=15694#file15694line84>
bq.  >
bq.  >     I think the startKey and endKey can be optional as well.
bq.  >     Basically that means scanning the whole region.
bq.  
bq.  himanshu vashishtha wrote:
bq.      These start-end keys are used to locate the interested regions. Do you 
mean whole _table_? If so, it will be like setting 
HConstants.START_ROW/STOP_ROW which are essentially empty byte arrays.
bq.  
bq.  Gary Helmling wrote:
bq.      This would be a bigger change, but maybe it would make sense to have 
the client pass a Scan object?  Then you could specify start/end row, time 
range, multiple column qualifiers, filter?
bq.      
bq.      It's starting to look like we're duplicating most of these arguments 
when there's already a good way of passing them.  What do you think?

Yes, am wondering why it didn't occur to me before! As a matter of fact, we are 
creating a Scan object at region level. So, with passing the Scan object to the 
Aggregation client, it will call the appropriate HTable method (the existing 
one), but the CP's method will take the Scan object as a parameter, and let the 
client have its liberty. But it needs some code changes, like in validation 
stuff for one. 
(I was thinking that it was good to go and now there is so much room for 
improvement. Good stuff).

- himanshu

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/585/#review483
-----------------------------------------------------------

On 2011-04-13 08:37:14, Ted Yu wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/585/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-13 08:37:14)
bq.  
bq.  
bq.  Review request for hbase and Gary Helmling.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  This patch provides reference implementation for aggregate function 
support through Coprocessor framework.
bq.  ColumnInterpreter interface allows client to specify how the value's byte 
array is interpreted.
bq.  Some of the thoughts are summarized at 
http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html
bq.  
bq.  Himanshu Vashishtha started the work. I provided some review comments and 
some of the code.
bq.  
bq.  
bq.  This addresses bug HBASE-1512.
bq.      https://issues.apache.org/jira/browse/HBASE-1512
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    
/src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
 PRE-CREATION 
bq.    
/src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
 PRE-CREATION 
bq.    
/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java 
PRE-CREATION 
bq.    
/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java 
PRE-CREATION 
bq.    
/src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java 
PRE-CREATION 
bq.    /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java 
PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/585/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  TestAggFunctions passes.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Ted
bq.  
bq.

> Coprocessors: Support aggregate functions
> -----------------------------------------
>
>                 Key: HBASE-1512
>                 URL: https://issues.apache.org/jira/browse/HBASE-1512
>             Project: HBase
>          Issue Type: Sub-task
>          Components: coprocessors
>            Reporter: stack
>         Attachments: 1512.zip, AggregateCpProtocol.java, 
> AggregateProtocolImpl.java, AggregationClient.java, ColumnInterpreter.java, 
> patch-1512-2.txt, patch-1512-3.txt, patch-1512-4.txt, patch-1512-5.txt, 
> patch-1512.txt
>
>
> Chatting with jgray and holstad at the kitchen table about counts, sums, and 
> other aggregating facility, facility generally where you want to calculate 
> some meta info on your table, it seems like it wouldn't be too hard making a 
> filter type that could run a function server-side and return the result ONLY 
> of the aggregation or whatever.
> For example, say you just want to count rows, currently you scan, server 
> returns all data to client and count is done by client counting up row keys.  
> A bunch of time and resources have been wasted returning data that we're not 
> interested in.  With this new filter type, the counting would be done 
> server-side and then it would make up a new result that was the count only 
> (kinda like mysql when you ask it to count, it returns a 'table' with a count 
> column whose value is count of rows).   We could have it so the count was 
> just done per region and return that.  Or we could maybe make a small change 
> in scanner too so that it aggregated the per-region counts.  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions

Reply via email to