[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-28 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-259:
--

   Resolution: Fixed
Fix Version/s: 0.6.0
 Release Note: Add PERCENTILE aggregate function
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed. Thanks for the hard work, Jerome Boulon and Zheng.

By the way, I manually fixed a diff in show_function.q. Please update the 
usage of the percentile function on the wiki or elsewhere.

> Add PERCENTILE aggregate function
> -
>
> Key: HIVE-259
> URL: https://issues.apache.org/jira/browse/HIVE-259
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Venky Iyer
>Assignee: Jerome Boulon
> Fix For: 0.6.0
>
> Attachments: HIVE-259-2.patch, HIVE-259-3.patch, HIVE-259.1.patch, 
> HIVE-259.4.patch, HIVE-259.5.patch, HIVE-259.patch, jb2.txt, Percentile.xlsx
>
>
> Compute at least the 25th, 50th, and 75th percentiles

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-28 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-259:


Attachment: HIVE-259.5.patch

We take the method recommended by NIST.

See http://en.wikipedia.org/wiki/Percentile#Alternative_methods
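
A minimal Python sketch of the interpolation idea referred to above. The comment does not say which rank convention the patch uses, so both functions below are assumptions: percentile_inclusive uses the rank p/100*(N-1)+1 (the convention followed by Excel's PERCENTILE, which another message in this thread uses for validation), and percentile_n_plus_1 uses the rank p/100*(N+1). The function names are hypothetical and are not taken from the patch.

```python
import math


def percentile_inclusive(values, p):
    """Linear interpolation with one-based rank p/100 * (N - 1) + 1."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    # Zero-based fractional index into the sorted data.
    position = (len(ordered) - 1) * p / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def percentile_n_plus_1(values, p):
    """Linear interpolation with one-based rank p/100 * (N + 1)."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    rank = (n + 1) * p / 100.0
    k = math.floor(rank)   # integer part of the rank
    d = rank - k           # fractional part of the rank
    if k <= 0:
        return ordered[0]  # rank falls before the first value
    if k >= n:
        return ordered[-1]  # rank falls at or after the last value
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])
```

The two conventions agree on the median of an odd-sized sample but generally differ elsewhere, so any documentation of the function should state which one it follows.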




[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-26 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-259:


Attachment: HIVE-259.4.patch

This one fixes all checkstyle errors, and uses *Writable classes to avoid 
creating new objects as much as possible.





[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-26 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Status: Open  (was: Patch Available)




[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-26 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Status: Patch Available  (was: Open)

HIVE-259-3.patch




[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-26 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Attachment: HIVE-259-3.patch

- use Double instead of Integer for the percentile argument, so we can ask 
for the 99.999th percentile
- checkstyle fixes, except for the State object
- new test case
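
To illustrate why a floating-point percentile argument matters, here is a small Python sketch that parses a comma-separated percentile string such as the "25,50,99" used elsewhere in this thread. The helper name parse_percentiles and the 0 to 100 range check are assumptions for illustration; the patch's own parsing code is not shown in the thread.

```python
def parse_percentiles(spec):
    """Parse a string like "25,50,99.999" into a list of floats."""
    percentiles = []
    for token in spec.split(","):
        # float() accepts fractional values such as 99.999,
        # which an integer-only argument could not express.
        value = float(token.strip())
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentile out of range: %r" % token)
        percentiles.append(value)
    return percentiles
```

With this helper, parse_percentiles("25,50,99.999") yields three floats, and a value such as "101" is rejected with a ValueError.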





[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-26 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Attachment: Percentile.xlsx

Percentiles that match the included test case




[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-26 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Attachment: (was: Percentile.xlsx)




[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-23 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Status: Patch Available  (was: In Progress)

Percentile function.
Usage: select code,percentile(MyColumnB,"") from  group 
by ;




[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-23 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Attachment: Percentile.xlsx
jb2.txt

Percentile test file + validation using the Excel PERCENTILE function:
CREATE TABLE JB2
(
duration bigint,
code string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/jb2.txt' INTO TABLE JB2;



Result:
hive> select percentile(duration,"25,50,99") from JB2;
Ended Job = job_201002201654_0006
OK
[14.0,33.0,416.40001]
Time taken: 36.261 seconds

hive> select code,percentile(duration,"25,50,99") from JB2 group by code;
Ended Job = job_201002201654_0007
OK
a   [2.0,17.5,427.22999]
b   [22.75,44.5,345.849997]
c   [18.0,29.0,58.765]
Time taken: 23.419 seconds
hive> quit;





[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-23 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Attachment: HIVE-259-2.patch

Percentile function




[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-10 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-259:


Attachment: HIVE-259.1.patch

Jerome, I wrote a skeleton of the code that uses a HashMap. Do you want to 
start from there and add what is missing?
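
A rough Python sketch of what a hash-map-based percentile aggregation could look like: each partial aggregation keeps a value -> count map, partial maps are merged, and the percentile is read off the sorted counts. This is an assumption about the general approach, not the contents of HIVE-259.1.patch; all names are hypothetical, and the interpolation uses the (N-1) rank convention as a further assumption.

```python
import math
from collections import Counter


def merge(partial_a, partial_b):
    """Combine two value -> count maps from partial aggregations."""
    return partial_a + partial_b


def value_at(sorted_counts, index):
    """Return the value at a zero-based index of the expanded sorted data."""
    seen = 0
    for value, count in sorted_counts:
        seen += count
        if index < seen:
            return value
    raise IndexError("index beyond total count")


def percentile_from_counts(counts, p):
    """Interpolated percentile computed from a value -> count map."""
    total = sum(counts.values())
    if total == 0:
        return None
    entries = sorted(counts.items())
    position = (total - 1) * p / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    low_value = value_at(entries, lower)
    high_value = value_at(entries, upper)
    return low_value + (position - lower) * (high_value - low_value)
```

Keeping counts instead of raw values bounds memory by the number of distinct values rather than the number of rows, which helps when a column has many repeated values.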





[jira] Updated: (HIVE-259) Add PERCENTILE aggregate function

2010-02-09 Thread Jerome Boulon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerome Boulon updated HIVE-259:
---

Attachment: HIVE-259.patch

First iteration for percentile (tested using Hive trunk and Hadoop 0.18.3):
usage:
CREATE TEMPORARY FUNCTION percentile AS 
'org.apache.hadoop.hive.ql.udf.Percentile';
select percentile(myColumn,"25,50,99") from MyTable;

- How can I share the state object across functions?

