[jira] Commented: (HIVE-607) Create statistical UDFs.

Min Zhou (JIRA) Tue, 28 Jul 2009 23:50:50 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736475#action_12736475
 ]


Min Zhou commented on HIVE-607:
-------------------------------

Sorry, fixing some typos.

@Namit
I implemented group_cat() in a rush and found two problems that are difficult to solve:
1. group_cat() has an internal ORDER BY clause, and we currently can't 
implement such an aggregation in Hive.
2. When the strings to be concatenated are too large, in other words when 
data skew appears, there is often not enough memory to hold such a big result.
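
To make the two problems concrete, here is a minimal sketch in plain Java. It is not Hive's UDAF API and not the implementation mentioned above: the class name GroupConcatSketch, the method names (chosen to echo the iterate/merge/terminate steps of a partial aggregation), and the maxChars cap are all assumptions made for illustration. It shows that ordering can only be applied once every value of a group is in one place, and that the whole group has to be buffered until then.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch only: not Hive's UDAF interface, all names are illustrative.
final class GroupConcatSketch {
    private final List<String> values = new ArrayList<>();
    private final String separator;
    private final long maxChars;      // assumed cap on buffered characters per group
    private long bufferedChars = 0;
    private boolean truncated = false;

    GroupConcatSketch(String separator, long maxChars) {
        this.separator = separator;
        this.maxChars = maxChars;
    }

    // Buffer one input value; drop it and set the flag once the cap would be exceeded.
    void iterate(String value) {
        if (value == null) {
            return;
        }
        if (bufferedChars + value.length() > maxChars) {
            truncated = true;
            return;
        }
        values.add(value);
        bufferedChars += value.length();
    }

    // Fold a partial state (for example from another mapper) into this one.
    void merge(GroupConcatSketch other) {
        for (String v : other.values) {
            iterate(v);
        }
        truncated |= other.truncated;
    }

    // The internal ordering can only happen here, after all partials are merged.
    String terminate() {
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        return String.join(separator, sorted);
    }

    boolean isTruncated() {
        return truncated;
    }
}
```

The cap only bounds memory; it does not solve the skew problem, because which values get dropped depends on arrival order, so a truncated result is not a prefix of the correctly ordered one.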

> Create statistical UDFs.
> ------------------------
>
>                 Key: HIVE-607
>                 URL: https://issues.apache.org/jira/browse/HIVE-607
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: S. Alex Smith
>            Assignee: Emil Ibrishimov
>            Priority: Minor
>             Fix For: 0.4.0
>
>         Attachments: HIVE-607.1.patch, UDAFStddev.java
>
>
> Create UDFs replicating:
> STD()                    Return the population standard deviation
> STDDEV_POP() (v5.0.3)    Return the population standard deviation
> STDDEV_SAMP() (v5.0.3)   Return the sample standard deviation
> STDDEV()                 Return the population standard deviation
> SUM()                    Return the sum
> VAR_POP() (v5.0.3)       Return the population standard variance
> VAR_SAMP() (v5.0.3)      Return the sample variance
> VARIANCE() (v4.1)        Return the population standard variance
> as found at http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html.
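
For reference, the population and sample variants listed above differ only in the divisor (n versus n - 1). The sketch below is a plain-Java illustration, not the attached UDAFStddev.java or the patch: the class name VarianceSketch and its method names are hypothetical, and returning NaN for undefined results is an assumption (MySQL returns NULL). It keeps a count, a running mean, and a sum of squared deviations, so partial states can be merged as in a map/reduce aggregation.

```java
// Hypothetical sketch only: not the attached UDAFStddev.java, names are illustrative.
final class VarianceSketch {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0;   // sum of squared deviations from the running mean

    // Welford's update for one new value.
    void iterate(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    // Combine two partial states (pairwise update, Chan et al.).
    void merge(VarianceSketch other) {
        if (other.count == 0) {
            return;
        }
        if (count == 0) {
            count = other.count;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long n = count + other.count;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) count * other.count / n);
        mean += delta * other.count / n;
        count = n;
    }

    // Population variance divides by n; NaN for an empty group is an assumption.
    double varPop() {
        return count == 0 ? Double.NaN : m2 / count;
    }

    // Sample variance divides by n - 1; NaN for fewer than two values is an assumption.
    double varSamp() {
        return count < 2 ? Double.NaN : m2 / (count - 1);
    }

    double stddevPop() {
        return Math.sqrt(varPop());
    }

    double stddevSamp() {
        return Math.sqrt(varSamp());
    }
}
```

Under this sketch STD() and STDDEV() would map to stddevPop(), and VARIANCE() to varPop().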

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
