[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions

Himanshu Vashishtha (JIRA) Wed, 30 Mar 2011 21:36:50 -0700

    [ https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013826#comment-13013826 ]


Himanshu Vashishtha commented on HBASE-1512:
--------------------------------------------

Thanks for reviewing it, Ted.

I will add the constructor. 

Yes, I was thinking about the dependency on a long value in all of these 
methods. But the flexibility of storing any data type (converted to a byte 
array), even within a specific column family:column qualifier, makes a data 
type argument a bit tricky: a single CF:CQ combination can hold values of 
varying data types. Instead, I was considering adding one additional check 
for the int type (4 bytes). But that is just my view; it would be great to 
hear what others think.
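
To make the 4-byte check concrete, here is a minimal sketch, not code from the 
patch. CellValueDecoder and decodeAsLong are hypothetical names, and it assumes 
big-endian values (which is how HBase's Bytes utility encodes int and long). 
Note that a length check alone cannot tell a 4-byte int from some other 4-byte 
type, which is the same ambiguity described above.

```java
// Sketch only: CellValueDecoder and decodeAsLong are hypothetical names,
// not part of the attached patch.
final class CellValueDecoder {
    private CellValueDecoder() {}

    /**
     * Interprets a big-endian cell value as a signed number.
     * Accepts 8-byte longs and, as the additional check, 4-byte ints.
     */
    static long decodeAsLong(byte[] value) {
        if (value == null) {
            throw new IllegalArgumentException("value is null");
        }
        if (value.length != 8 && value.length != 4) {
            throw new IllegalArgumentException(
                "expected 4 or 8 bytes, got " + value.length);
        }
        // Start from the most significant byte so the sign is extended.
        long result = value[0];
        for (int i = 1; i < value.length; i++) {
            result = (result << 8) | (value[i] & 0xFFL);
        }
        return result;
    }
}
```

Any other length is rejected rather than guessed at, so a value of an 
unexpected type fails loudly instead of being summed incorrectly.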

Adding the type parameter to the AggregateCpProtocol methods creates a 
dependency on AggregationClient. Did you try adding it there too (in addition 
to the implementation)?
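
A simplified sketch of why the client is affected: once the protocol method 
takes a type argument, every caller, including the client wrapper, has to 
supply it. All names here (ValueType, TypedSumProtocol, TypedSumImpl, 
TypedSumClient) are hypothetical, and the method takes a plain list of values 
instead of a scan to keep the example self-contained. It is not the actual 
AggregateCpProtocol or AggregationClient signature.

```java
// Hypothetical type tag; the width is the expected encoded length in bytes.
enum ValueType {
    INT(4), LONG(8);

    final int width;

    ValueType(int width) {
        this.width = width;
    }
}

// Hypothetical stand-in for a protocol method that takes a type parameter.
interface TypedSumProtocol {
    long sum(java.util.List<byte[]> values, ValueType type);
}

// Server-side implementation: decodes each big-endian value and adds it up.
final class TypedSumImpl implements TypedSumProtocol {
    @Override
    public long sum(java.util.List<byte[]> values, ValueType type) {
        long total = 0;
        for (byte[] value : values) {
            if (value.length != type.width) {
                throw new IllegalArgumentException(
                    "expected " + type.width + " bytes, got " + value.length);
            }
            long decoded = value[0]; // sign-extends the top byte
            for (int i = 1; i < value.length; i++) {
                decoded = (decoded << 8) | (value[i] & 0xFFL);
            }
            total += decoded;
        }
        return total;
    }
}

// Client wrapper: its signature must change together with the protocol.
final class TypedSumClient {
    private final TypedSumProtocol protocol;

    TypedSumClient(TypedSumProtocol protocol) {
        this.protocol = protocol;
    }

    long sum(java.util.List<byte[]> values, ValueType type) {
        return protocol.sum(values, type);
    }
}
```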


> Coprocessors: Support aggregate functions
> -----------------------------------------
>
>                 Key: HBASE-1512
>                 URL: https://issues.apache.org/jira/browse/HBASE-1512
>             Project: HBase
>          Issue Type: Sub-task
>          Components: coprocessors
>            Reporter: stack
>         Attachments: 1512.zip, patch-1512-2.txt, patch-1512.txt
>
>
> Chatting with jgray and holstad at the kitchen table about counts, sums, and 
> other aggregating facility, facility generally where you want to calculate 
> some meta info on your table, it seems like it wouldn't be too hard making a 
> filter type that could run a function server-side and return the result ONLY 
> of the aggregation or whatever.
> For example, say you just want to count rows, currently you scan, server 
> returns all data to client and count is done by client counting up row keys.  
> A bunch of time and resources have been wasted returning data that we're not 
> interested in.  With this new filter type, the counting would be done 
> server-side and then it would make up a new result that was the count only 
> (kinda like mysql when you ask it to count, it returns a 'table' with a count 
> column whose value is count of rows).   We could have it so the count was 
> just done per region and return that.  Or we could maybe make a small change 
> in scanner too so that it aggregated the per-region counts.  
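
The per-region counting described in the issue can be sketched roughly as 
follows. This is a toy model only: RegionCountSketch, countRegion and 
totalCount are hypothetical names, and each region is represented as a plain 
list of row keys instead of a real HBase region.

```java
// Toy model: each inner list stands for the row keys held by one region.
final class RegionCountSketch {
    private RegionCountSketch() {}

    // "Server side": a region returns only its row count, not its data.
    static long countRegion(java.util.List<String> rowKeysInRegion) {
        return rowKeysInRegion.size();
    }

    // "Client side": add up the per-region counts to get the table count.
    static long totalCount(java.util.List<java.util.List<String>> regions) {
        long total = 0;
        for (java.util.List<String> region : regions) {
            total += countRegion(region);
        }
        return total;
    }
}
```

Only one number per region crosses the wire in this model, which is the 
saving the description is after.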

