[ https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021101#comment-13021101 ]
jirapos...@reviews.apache.org commented on HBASE-1512:
-------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/585/#review493
-----------------------------------------------------------


/src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
<https://reviews.apache.org/r/585/#comment1002>

    done


/src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
<https://reviews.apache.org/r/585/#comment1003>

    done


/src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
<https://reviews.apache.org/r/585/#comment1004>

    done


/src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java
<https://reviews.apache.org/r/585/#comment1005>

    done


/src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
<https://reviews.apache.org/r/585/#comment1006>

    Ok, using only Bytes.toLong now.


/src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java
<https://reviews.apache.org/r/585/#comment1007>

    done


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
<https://reviews.apache.org/r/585/#comment1008>

    done


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
<https://reviews.apache.org/r/585/#comment1009>

    Removed the repetition in the doc.


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
<https://reviews.apache.org/r/585/#comment1010>

    The column interpreter is more genericised now. It supports the HBase cell data type and its promoted data type, and the promoted type is used for these computations. So, if a cell value is an int, we use a long while computing the sum, to handle overflow. For finding max and min we still use the cell data type.


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java
<https://reviews.apache.org/r/585/#comment1011>

    The coprocessor implementation returns an overall sum and a row count, so there is no need to use a double/float in the return type. That is done at the client side (AggregationClient).


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
<https://reviews.apache.org/r/585/#comment1012>

    done


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
<https://reviews.apache.org/r/585/#comment1013>

    Added class javadoc.


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
<https://reviews.apache.org/r/585/#comment1014>

    Refactored it.


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
<https://reviews.apache.org/r/585/#comment1015>

    Setting the start/end rows does it, so there is no need to check now.


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
<https://reviews.apache.org/r/585/#comment1016>

    done


/src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java
<https://reviews.apache.org/r/585/#comment1017>

    The current version returns a Pair<List<S>, Long>, where the list contains the sum and the sum of squares, and the Long is the row count. I could use a more specific object, but it seems it would have to be added to the RPC stack (implementing Writable). Please comment if that is _the_ right way.
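    To make the Pair<List<S>, Long> shape concrete, here is a minimal client-side sketch, not part of the patch: it assumes S = Long, the {sum, sum of squares} list layout described above, and that the per-region partials have already been collected into a list.

{code:java}
// Hypothetical sketch, not from the patch: assumes S = Long and the
// {sum, sum of squares} list layout of Pair<List<S>, Long> described above.
import java.util.List;

import org.apache.hadoop.hbase.util.Pair;

public class StdDevSketch {

  // Merge the partial results returned per region and compute the standard
  // deviation on the client side, as AggregationClient would.
  public static double stdDev(List<Pair<List<Long>, Long>> perRegionResults) {
    double sum = 0;
    double sumOfSquares = 0;
    long rowCount = 0;
    for (Pair<List<Long>, Long> partial : perRegionResults) {
      sum += partial.getFirst().get(0);           // partial sum
      sumOfSquares += partial.getFirst().get(1);  // partial sum of squares
      rowCount += partial.getSecond();            // partial row count
    }
    double avg = sum / rowCount;
    // Var(X) = E[X^2] - E[X]^2, so std = sqrt(sumSq / n - avg^2)
    return Math.sqrt(sumOfSquares / rowCount - avg * avg);
  }
}
{code}

    Keeping the endpoint's return type integral and doing the floating-point division only here is what comment 1011 above refers to.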
/src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java
<https://reviews.apache.org/r/585/#comment1001>

    Do you mean to rename the class, or just the javadoc? (Sorry, I missed this review comment initially.)


/src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java
<https://reviews.apache.org/r/585/#comment1018>

    This class is more genericised now. It defines two type parameters <T, S>, where T is the cell value type and S is the promoted data type. S is used for arithmetic computations; T is used for the min/max operations.


- himanshu


On 2011-04-13 08:37:14, Ted Yu wrote:
bq.
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/585/
bq.  -----------------------------------------------------------
bq.
bq.  (Updated 2011-04-13 08:37:14)
bq.
bq.
bq.  Review request for hbase and Gary Helmling.
bq.
bq.
bq.  Summary
bq.  -------
bq.
bq.  This patch provides a reference implementation for aggregate function support through the Coprocessor framework.
bq.  The ColumnInterpreter interface allows the client to specify how the value's byte array is interpreted.
bq.  Some of the thoughts are summarized at http://zhihongyu.blogspot.com/2011/03/genericizing-endpointcoprocessor.html
bq.
bq.  Himanshu Vashishtha started the work. I provided some review comments and some of the code.
bq.
bq.
bq.  This addresses bug HBASE-1512.
bq.      https://issues.apache.org/jira/browse/HBASE-1512
bq.
bq.
bq.  Diffs
bq.  -----
bq.
bq.    /src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java PRE-CREATION
bq.    /src/main/java/org/apache/hadoop/hbase/client/coprocessor/LongColumnInterpreter.java PRE-CREATION
bq.    /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateCpProtocol.java PRE-CREATION
bq.    /src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocolImpl.java PRE-CREATION
bq.    /src/main/java/org/apache/hadoop/hbase/coprocessor/ColumnInterpreter.java PRE-CREATION
bq.    /src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggFunctions.java PRE-CREATION
bq.
bq.  Diff: https://reviews.apache.org/r/585/diff
bq.
bq.
bq.  Testing
bq.  -------
bq.
bq.  TestAggFunctions passes.
bq.
bq.
bq.  Thanks,
bq.
bq.  Ted
bq.
bq.


> Coprocessors: Support aggregate functions
> -----------------------------------------
>
>                 Key: HBASE-1512
>                 URL: https://issues.apache.org/jira/browse/HBASE-1512
>             Project: HBase
>          Issue Type: Sub-task
>          Components: coprocessors
>            Reporter: stack
>         Attachments: 1512.zip, AggregateCpProtocol.java, AggregateProtocolImpl.java, AggregationClient.java, ColumnInterpreter.java, patch-1512-2.txt, patch-1512-3.txt, patch-1512-4.txt, patch-1512-5.txt, patch-1512.txt
>
>
> Chatting with jgray and holstad at the kitchen table about counts, sums, and other aggregating facilities (generally, cases where you want to calculate some meta info on your table), it seems like it wouldn't be too hard making a filter type that could run a function server-side and return ONLY the result of the aggregation.
> For example, say you just want to count rows: currently you scan, the server returns all data to the client, and the count is done by the client counting up row keys. A bunch of time and resources are wasted returning data that we're not interested in.
> With this new filter type, the counting would be done server-side, and it would make up a new result that contains the count only (kind of like mysql: when you ask it to count, it returns a 'table' with a count column whose value is the count of rows). We could have the count done just per region and return that, or we could maybe make a small change in the scanner too so that it aggregates the per-region counts.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
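For readers following the <T, S> discussion in the ColumnInterpreter comments above, a minimal hypothetical sketch of what such an interpreter contract could look like (the method names and signatures here are illustrative assumptions, not the actual patch):

{code:java}
// Hypothetical sketch of the genericised interpreter described above, not the
// patch itself: T is the cell value type, S its promoted type used for
// arithmetic such as sum, while min/max comparisons stay in T.
public interface ColumnInterpreter<T, S> {

  // Decode a cell's raw bytes into the cell value type T.
  T getValue(byte[] colFamily, byte[] colQualifier, byte[] value);

  // Promote a cell value to S before it takes part in arithmetic,
  // e.g. int -> long so that a running sum cannot overflow.
  S castToReturnType(T value);

  // Arithmetic is performed in the promoted type.
  S add(S a, S b);

  // Comparisons for min/max stay in the cell value type.
  int compare(T a, T b);

  // Neutral starting points used when a region has no matching cells.
  T getMaxValue();
  T getMinValue();
}
{code}

With such a contract, a LongColumnInterpreter can map raw cell bytes to Long via Bytes.toLong, as comment 1006 above notes.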
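To illustrate the server-side counting idea from the description, here is a hypothetical usage sketch of the proposed AggregationClient; the constructor and rowCount signature shown are assumptions for illustration, not the committed API. Only per-region counts travel back to the client, which sums them, instead of every row.

{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.coprocessor.AggregationClient;
import org.apache.hadoop.hbase.client.coprocessor.LongColumnInterpreter;
import org.apache.hadoop.hbase.util.Bytes;

public class RowCountSketch {
  public static void main(String[] args) throws Throwable {
    // Assumed constructor taking a cluster configuration.
    AggregationClient aggClient = new AggregationClient(HBaseConfiguration.create());

    // Restrict the count to one column so only rows containing it are counted.
    Scan scan = new Scan();
    scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"));

    // Each region's AggregateProtocolImpl endpoint counts its own rows;
    // the client merely sums the per-region counts.
    long rows = aggClient.rowCount(Bytes.toBytes("myTable"),
        new LongColumnInterpreter(), scan);
    System.out.println("row count = " + rows);
  }
}
{code}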