[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-14159:
----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0
           Status: Resolved  (was: Patch Available)

Committed to master:

{noformat}
% g log -1 --stat
commit 6e76ee3aef2210b2a1efa20d92ac997800cfcb75
Author: Carl Steinbach <cstei...@linkedin.com>
Date:   Wed Sep 7 11:28:35 2016 -0700

    HIVE-14159 : sorting of tuple array using multiple field[s] (Simanchal Das via Carl Steinbach)

 itests/src/test/resources/testconfiguration.properties                     |   1 +
 itests/src/test/resources/testconfiguration.properties.orig                |   8 +-
 ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java           |   1 +
 .../org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArrayByField.java  | 202 ++++++++++++++++++
 .../apache/hadoop/hive/ql/udf/generic/TestGenericUDFSortArrayByField.java  | 228 ++++++++++++++++++++
 ql/src/test/queries/clientnegative/udf_sort_array_by_wrong1.q              |   2 +
 ql/src/test/queries/clientnegative/udf_sort_array_by_wrong2.q              |   2 +
 ql/src/test/queries/clientnegative/udf_sort_array_by_wrong3.q              |  16 ++
 ql/src/test/queries/clientpositive/udf_sort_array_by.q                     | 136 ++++++++++++
 ql/src/test/results/beelinepositive/show_functions.q.out                   |   1 +
 ql/src/test/results/clientnegative/udf_sort_array_by_wrong1.q.out          |   1 +
 ql/src/test/results/clientnegative/udf_sort_array_by_wrong2.q.out          |   1 +
 ql/src/test/results/clientnegative/udf_sort_array_by_wrong3.q.out          |  37 ++++
 ql/src/test/results/clientpositive/show_functions.q.out                    |   1 +
 ql/src/test/results/clientpositive/udf_sort_array_by.q.out                 | 401 +++++++++++++++++++++++++++++++++++
 15 files changed, 1036 insertions(+), 2 deletions(-)
{noformat}

> sorting of tuple array using multiple field[s]
> ----------------------------------------------
>
>                 Key: HIVE-14159
>                 URL: https://issues.apache.org/jira/browse/HIVE-14159
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Simanchal Das
>            Assignee: Simanchal Das
>              Labels: patch
>             Fix For: 2.2.0
>
>         Attachments: HIVE-14159.1.patch, HIVE-14159.2.patch, HIVE-14159.3.patch, HIVE-14159.4.patch
>
>
> Problem Statement:
> When working with complex data structures such as Avro, we often encounter 
> arrays that contain multiple tuples, each of which has a struct schema.
> Suppose the struct schema looks like this:
> {noformat}
> {
>       "name": "employee",
>       "type": [{
>               "type": "record",
>               "name": "Employee",
>               "namespace": "com.company.Employee",
>               "fields": [{
>                       "name": "empId",
>                       "type": "int"
>               }, {
>                       "name": "empName",
>                       "type": "string"
>               }, {
>                       "name": "age",
>                       "type": "int"
>               }, {
>                       "name": "salary",
>                       "type": "double"
>               }]
>       }]
> }
> {noformat}
> When we run a Hive query, the complex array then looks like an array of 
> Employee objects.
> {noformat}
> Example: 
>       //(array<struct<empId,empName,age,salary>>)
>       Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]
> {noformat}
> In day-to-day business use cases we often need to sort a tuple array by 
> one or more specific fields, such as empId, empName, or salary, in ASC or 
> DESC order.
> Proposal:
> I have developed a UDF, 'sort_array_by', which sorts a tuple array by one 
> or more fields in the ASC or DESC order specified by the user; the default 
> is ascending order.
> {noformat}
> Example:
>       1. Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC");
>       output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]
>       
>       2. Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC");
>       output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
>       
>       3. Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age","ASC");
>       output: array[struct(500,Boo,30,50990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
> {noformat}
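
For readers who want to check the expected ordering outside Hive, the multi-field sort described in the proposal can be approximated with a short Python sketch. This is not the committed Java code in GenericUDFSortArrayByField; it only models the semantics as described above. It assumes that structs can be represented as dicts keyed by the Avro field names, and that a single optional trailing "ASC" or "DESC" applies to every listed field. The Python function name is borrowed from the UDF purely for readability.

```python
from typing import Any, Dict, List

Struct = Dict[str, Any]


def sort_array_by(rows: List[Struct], *args: str) -> List[Struct]:
    """Return a new list of structs sorted by the named fields.

    Assumption (not confirmed by the patch): an optional trailing
    "ASC"/"DESC" argument applies to all of the listed fields.
    """
    fields = list(args)
    descending = False
    # Treat a trailing ASC/DESC (any case) as the sort order, not a field name.
    if fields and fields[-1].upper() in ("ASC", "DESC"):
        descending = fields.pop().upper() == "DESC"
    if not fields:
        raise ValueError("at least one field name is required")
    # Tuples compare element by element, so earlier fields take precedence.
    return sorted(rows, key=lambda r: tuple(r[f] for f in fields),
                  reverse=descending)


# Sample data taken from example 2 in the issue description.
employees = [
    {"empId": 100, "empName": "Foo", "age": 20, "salary": 20990.0},
    {"empId": 500, "empName": "Boo", "age": 30, "salary": 80990.0},
    {"empId": 500, "empName": "Boo", "age": 30, "salary": 50990.0},
    {"empId": 700, "empName": "Harry", "age": 25, "salary": 40990.0},
    {"empId": 100, "empName": "Tom", "age": 35, "salary": 70990.0},
]

by_name_salary = sort_array_by(employees, "empName", "salary", "ASC")
by_salary_desc = sort_array_by(employees, "salary", "DESC")
```

Building a tuple key keeps the field precedence explicit: ties on the first field are broken by the second, and so on, which is the behaviour the second and third examples in the description rely on.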



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
