[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980314#comment-15980314 ]

Simanchal Das commented on HIVE-15229:
--------------------------------------

Hi [~cwsteinbach]
I have refreshed the patch with the latest code.

> 'like any' and 'like all' operators in hive
> -------------------------------------------
>
>                 Key: HIVE-15229
>                 URL: https://issues.apache.org/jira/browse/HIVE-15229
>             Project: Hive
>          Issue Type: New Feature
>          Components: Operators
>            Reporter: Simanchal Das
>            Assignee: Simanchal Das
>            Priority: Minor
>         Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, HIVE-15229.3.patch, HIVE-15229.4.patch, HIVE-15229.5.patch, HIVE-15229.6.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used to match a text field against a number of patterns.
> 'like any' and 'like all' are equivalent to multiple like conditions, as in the example below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%', '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like conditions
> select col1 from table1 where col2 like '%accountant%' or col2 like '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like '%insurance%' ;
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%', '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like conditions
> select col1 from table1 where col2 like '%accountant%' and col2 like '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like '%insurance%' ;
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata to Hive.
> Data engineers and business analysts keep looking for these two operators.
> If we introduce these two operators in Hive, many scripts can be migrated smoothly instead of converting these operators into multiple like conditions.
> Result:
> 1. 'LIKE ANY' returns true if the text (column value) matches any pattern.
> 2. 'LIKE ALL' returns true if the text (column value) matches all patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the left-hand side is NULL, but also if one of the patterns in the list is NULL.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
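The three rules in the Result list of the issue description can be modeled in a few lines of Python. This is only a sketch of the described semantics, not the code from the attached patches: the function names (`like_to_regex`, `like`, `like_any`, `like_all`) are hypothetical, Python's `None` stands in for SQL NULL, and LIKE escape characters are deliberately ignored.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regex.

    Assumption: only the two standard wildcards are handled
    ('%' = any run of characters, '_' = exactly one character);
    escape characters are not supported in this sketch.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def like(text, pattern):
    """Plain LIKE: the whole text must match the pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None


def like_any(text, patterns):
    """Rule 1 and rule 3: true if any pattern matches, None (NULL) on NULL input."""
    patterns = list(patterns)
    if text is None or any(p is None for p in patterns):
        return None
    return any(like(text, p) for p in patterns)


def like_all(text, patterns):
    """Rule 2 and rule 3: true if every pattern matches, None (NULL) on NULL input."""
    patterns = list(patterns)
    if text is None or any(p is None for p in patterns):
        return None
    return all(like(text, p) for p in patterns)
```

Under this model, `like_any(col2, patterns)` corresponds to the chain of `or`-ed `like` conditions in the description and `like_all(col2, patterns)` to the chain of `and`-ed ones, except that a NULL anywhere in the pattern list makes the whole result NULL, as rule 3 states.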
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Attachment: HIVE-15229.6.patch
[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976999#comment-15976999 ]

Simanchal Das commented on HIVE-15229:
--------------------------------------

Hi [~cartershanklin]
Thank you for adding all the TD functionality related to the 'ALL', 'ANY' and 'SOME' keywords.
Altogether, it is a big task to add everything in a single change. It would be better to add the TD functionality incrementally.
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Attachment: HIVE-15229.5.patch
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Attachment: HIVE-15229.4.patch
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Attachment:     (was: HIVE-15229.4.patch)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Attachment: HIVE-15229.4.patch
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Patch Available (was: Open)

> 'like any' and 'like all' operators in hive
> ---
>
> Key: HIVE-15229
> URL: https://issues.apache.org/jira/browse/HIVE-15229
> Project: Hive
> Issue Type: New Feature
> Components: Operators
> Reporter: Simanchal Das
> Assignee: Simanchal Das
> Priority: Minor
> Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch,
> HIVE-15229.3.patch, HIVE-15229.4.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used when
> matching a text field against a number of patterns.
> 'like any' and 'like all' are equivalent to multiple like operators,
> as in the example below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%',
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like conditions
> select col1 from table1 where col2 like '%accountant%' or col2 like
> '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like
> '%insurance%' ;
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%',
> '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like operators
> select col1 from table1 where col2 like '%accountant%' and col2 like
> '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like
> '%insurance%' ;
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata
> to Hive.
> Data engineers and business analysts are always looking for these two
> operators.
> If we introduce these two operators in Hive, many scripts can be
> migrated smoothly instead of converting these operators to multiple like
> operators.
> Result:
> 1. 'LIKE ANY' returns true if the text (column value) matches any
> pattern.
> 2. 'LIKE ALL' returns true if the text (column value) matches all
> patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the
> left-hand side is NULL, but also if one of the patterns in the list is NULL.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
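The three rules listed under "Result" can be modelled in a few lines. The following is only a Python sketch of the semantics as described in the issue, not the code from the attached patches; the function names `like`, `like_any` and `like_all` are hypothetical, SQL NULL is represented by `None`, and escape characters in patterns are deliberately not handled.

```python
import re
from typing import Optional, Sequence


def like(text: str, pattern: str) -> bool:
    """Simplified SQL LIKE: '%' matches any run of characters, '_' exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    # The whole text must match the pattern, as with SQL LIKE.
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


def like_any(text: Optional[str], patterns: Sequence[Optional[str]]) -> Optional[bool]:
    """Rule 1 and rule 3: true if any pattern matches, NULL if any input is NULL."""
    if text is None or any(p is None for p in patterns):
        return None
    return any(like(text, p) for p in patterns)


def like_all(text: Optional[str], patterns: Sequence[Optional[str]]) -> Optional[bool]:
    """Rule 2 and rule 3: true if every pattern matches, NULL if any input is NULL."""
    if text is None or any(p is None for p in patterns):
        return None
    return all(like(text, p) for p in patterns)


if __name__ == "__main__":
    print(like_any("retail bank", ["%accountant%", "%bank%"]))
    print(like_all("retail bank", ["%retail%", "%insurance%"]))
    print(like_any("retail bank", ["%bank%", None]))
```

Under this reading, a NULL anywhere in the pattern list makes the result NULL even when another pattern would have matched, which differs from the hand-written chain of `or` conditions, where a single true branch is enough.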
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Attachment: HIVE-15229.3.patch
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Attachment: (was: HIVE-15229.2.patch)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Attachment: HIVE-15229.2.patch
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Attachment: HIVE-15229.2.patch
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
-
Attachment: (was: HIVE-15229.2.patch)

> 'like any' and 'like all' operators in hive
> ---
>
> Key: HIVE-15229
> URL: https://issues.apache.org/jira/browse/HIVE-15229
> Project: Hive
> Issue Type: New Feature
> Components: Operators
> Reporter: Simanchal Das
> Assignee: Simanchal Das
> Priority: Minor
> Attachments: HIVE-15229.1.patch
>
>
> In Teradata, the 'like any' and 'like all' operators are mostly used to
> match a text field against a number of patterns.
> 'like any' and 'like all' are equivalent to multiple 'like' conditions,
> as in the examples below.
> {noformat}
> --like any
> select col1 from table1 where col2 like any ('%accountant%', '%accounting%', '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like conditions
> select col1 from table1 where col2 like '%accountant%' or col2 like '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like '%insurance%';
> --like all
> select col1 from table1 where col2 like all ('%accountant%', '%accounting%', '%retail%', '%bank%', '%insurance%');
> --Can be written using multiple like conditions
> select col1 from table1 where col2 like '%accountant%' and col2 like '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like '%insurance%';
> {noformat}
> Problem statement:
> Nowadays many data warehouse projects are being migrated from Teradata
> to Hive.
> Data engineers and business analysts regularly look for these two
> operators.
> If we introduce these two operators in Hive, many scripts can be migrated
> smoothly instead of having each use converted into multiple 'like'
> conditions.
> Result:
> 1. 'LIKE ANY' returns true if the text (column value) matches any of the
> patterns.
> 2. 'LIKE ALL' returns true if the text (column value) matches all of the
> patterns.
> 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the
> left-hand side is NULL, but also if one of the patterns in the list is NULL.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
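The three "Result" rules in the issue description can be modeled outside Hive. The following is a minimal Python sketch under stated assumptions, not the patch's implementation: the helper names `like`, `like_any`, and `like_all` are hypothetical, `None` stands in for SQL NULL, and the LIKE matcher is simplified (only the `%` and `_` wildcards, no escape character).

```python
import re
from typing import Optional, Sequence


def like(text: str, pattern: str) -> bool:
    """Simplified SQL LIKE: % matches any run of characters, _ matches one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


def like_any(text: Optional[str], patterns: Sequence[Optional[str]]) -> Optional[bool]:
    """Rule 1 and rule 3: NULL if any input is NULL, else true on any match."""
    if text is None or any(p is None for p in patterns):
        return None
    return any(like(text, p) for p in patterns)


def like_all(text: Optional[str], patterns: Sequence[Optional[str]]) -> Optional[bool]:
    """Rule 2 and rule 3: NULL if any input is NULL, else true only if all match."""
    if text is None or any(p is None for p in patterns):
        return None
    return all(like(text, p) for p in patterns)


print(like_any("retail bank", ["%accountant%", "%bank%"]))  # True
print(like_all("retail bank", ["%accountant%", "%bank%"]))  # False
print(like_any("retail bank", ["%bank%", None]))            # None
```

Note that rule 3, as stated in the description, is stricter than the OR/AND expansion shown there: under standard SQL three-valued logic `TRUE OR NULL` evaluates to TRUE, whereas in this model a single NULL pattern makes the whole result NULL.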
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Attachment: (was: HIVE-15229.2.patch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Attachment: HIVE-15229.2.patch
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
-
Description:
In Teradata, the 'like any' and 'like all' operators are mostly used to match a text field against a number of patterns.
'like any' and 'like all' are equivalent to multiple 'like' conditions, as in the examples below.
{noformat}
--like any
select col1 from table1 where col2 like any ('%accountant%', '%accounting%', '%retail%', '%bank%', '%insurance%');
--Can be written using multiple like conditions
select col1 from table1 where col2 like '%accountant%' or col2 like '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like '%insurance%';
--like all
select col1 from table1 where col2 like all ('%accountant%', '%accounting%', '%retail%', '%bank%', '%insurance%');
--Can be written using multiple like conditions
select col1 from table1 where col2 like '%accountant%' and col2 like '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like '%insurance%';
{noformat}
Problem statement:
Nowadays many data warehouse projects are being migrated from Teradata to Hive.
Data engineers and business analysts regularly look for these two operators.
If we introduce these two operators in Hive, many scripts can be migrated smoothly instead of having each use converted into multiple 'like' conditions.
Result:
1. 'LIKE ANY' returns true if the text (column value) matches any of the patterns.
2. 'LIKE ALL' returns true if the text (column value) matches all of the patterns.
3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the left-hand side is NULL, but also if one of the patterns in the list is NULL.

was:
In Teradata, the 'like any' and 'like all' operators are mostly used to match a text field against a number of patterns.
'like any' and 'like all' are equivalent to multiple 'like' conditions, as in the examples below.
{noformat}
--like any
select col1 from table1 where col2 like any ('%accountant%', '%accounting%', '%retail%', '%bank%', '%insurance%');
--Can be written using multiple like conditions
select col1 from table1 where col2 like '%accountant%' or col2 like '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like '%insurance%';
--like all
select col1 from table1 where col2 like all ('%accountant%', '%accounting%', '%retail%', '%bank%', '%insurance%');
--Can be written using multiple like conditions
select col1 from table1 where col2 like '%accountant%' and col2 like '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like '%insurance%';
{noformat}
Problem statement:
Nowadays many data warehouse projects are being migrated from Teradata to Hive.
Data engineers and business analysts regularly look for these two operators.
If we introduce these two operators in Hive, many scripts can be migrated smoothly instead of having each use converted into multiple 'like' conditions.
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Attachment: HIVE-15229.2.patch
[jira] [Issue Comment Deleted] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Comment: was deleted (was: Thank you [~pxiong] for your comment. 'rlike' takes a constant string as its pattern, so I think it will not work when the pattern list mixes constants and column values, as below. {noformat} select col1 from table1 where col2 like any ('%abc%',col3); {noformat} )
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-15229: - Status: Open (was: Patch Available)
[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683481#comment-15683481 ] Simanchal Das commented on HIVE-15229: -- Hi [~vgarg], thanks for the review. I have added test cases for NOT LIKE ANY, NOT LIKE ALL, and NULL checks. Following your suggestion, a query returns zero rows if any pattern is NULL. Thanks.
[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683471#comment-15683471 ] Simanchal Das commented on HIVE-15229: -- Thank you [~pxiong] for your comment. 'rlike' takes a constant string as its pattern, so I think it will not work when the pattern list mixes constants and column values, as below. {noformat} select col1 from table1 where col2 like any ('%abc%',col3); {noformat}
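The query in the comment above mixes a constant pattern with a pattern read from a column, so the pattern list differs from row to row. The following is a rough Python sketch of that evaluation, under stated assumptions: the sample rows, the helper `like`, and the function `select_col1` are hypothetical, the LIKE matcher handles only the `%` and `_` wildcards, and NULL handling is left out.

```python
import re


def like(text: str, pattern: str) -> bool:
    """Simplified SQL LIKE: % matches any run of characters, _ matches one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


# Hypothetical contents of table1; col3 holds a per-row LIKE pattern.
rows = [
    {"col1": 1, "col2": "xxabcxx", "col3": "zzz%"},
    {"col1": 2, "col2": "hello world", "col3": "hello%"},
    {"col1": 3, "col2": "nothing", "col3": "q%"},
]


def select_col1(table):
    """Model of: select col1 from table1 where col2 like any ('%abc%', col3)."""
    return [
        row["col1"]
        for row in table
        # The pattern list is rebuilt for every row because col3 varies.
        if any(like(row["col2"], p) for p in ("%abc%", row["col3"]))
    ]


print(select_col1(rows))  # [1, 2]
```

Row 1 matches the constant pattern, row 2 matches only its own `col3` pattern, and row 3 matches neither.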
[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683472#comment-15683472 ] Simanchal Das commented on HIVE-15229: -- Thank you [~pxiong] for your comment. 'rlike' takes a constant string as its pattern, so I think it will not work when the pattern list mixes constants and column values, as below. {noformat} select col1 from table1 where col2 like any ('%abc%',col3); {noformat}
[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673630#comment-15673630 ] Simanchal Das commented on HIVE-15229: -- RB added https://reviews.apache.org/r/53845/
[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15673594#comment-15673594 ] Simanchal Das commented on HIVE-15229: -- Hi [~cwsteinbach], Could you please review this and provide your comments? Thanks, Simanchal
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-15229:
---------------------------------
    Attachment: HIVE-15229.1.patch
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Description:
Problem Statement:

When working with complex data structures such as Avro, we often encounter arrays that contain multiple tuples, where each tuple has a struct schema.

Suppose the struct schema is as follows:

{noformat}
{
  "name": "employee",
  "type": [{
    "type": "record",
    "name": "Employee",
    "namespace": "com.company.Employee",
    "fields": [{
      "name": "empId",
      "type": "int"
    }, {
      "name": "empName",
      "type": "string"
    }, {
      "name": "age",
      "type": "int"
    }, {
      "name": "salary",
      "type": "double"
    }]
  }]
}
{noformat}

Then, when running our Hive query, the complex array looks like an array of employee objects.

{noformat}
Example:
//(array>)
Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]
{noformat}

When implementing day-to-day business use cases, we encounter problems such as sorting a tuple array by specific field[s] (empId, name, salary, etc.) in ASC or DESC order.

Proposal:

I have developed a UDF 'sort_array_by' which sorts a tuple array by one or more fields in the ASC or DESC order provided by the user; the default is ascending order.

{noformat}
Example:
1.Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC");
output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]

2.Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC");
output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]

3.Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age","ASC");
output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
{noformat}

  was:
Problem Statement:

When working with complex data structures such as Avro, we often encounter arrays that contain multiple tuples, where each tuple has a struct schema.

Suppose the struct schema is as follows:

{noformat}
{
  "name": "employee",
  "type": [{
    "type": "record",
    "name": "Employee",
    "namespace": "com.company.Employee",
    "fields": [{
      "name": "empId",
      "type": "int"
    }, {
      "name": "empName",
      "type": "string"
    }, {
      "name": "age",
      "type": "int"
    }, {
      "name": "salary",
      "type": "double"
    }]
  }]
}
{noformat}

Then, when running our Hive query, the complex array looks like an array of employee objects.

{noformat}
Example:
//(array>)
Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]
{noformat}

When implementing day-to-day business use cases, we encounter problems such as sorting a tuple array by specific field[s] (empId, name, salary, etc.) in ASC or DESC order.

Proposal:

I have developed a UDF 'sort_array_by' which sorts a tuple array by one or more fields in the ASC or DESC order provided by the user; the default is ascending order.

{noformat}
Example:
1.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC");
output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]

2.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC");
output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]

3.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,
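The sorting behaviour proposed for 'sort_array_by' (one or more field names, an optional trailing ASC or DESC, ascending by default) can be sketched in Python. This is only an illustration of the described semantics, not the Java UDF in the attached patches. It assumes structs are modelled as dictionaries keyed by the schema's field names, and that a trailing DESC reverses the ordering over all listed fields, which the description does not spell out.

```python
def sort_array_by(rows, *args):
    """Sort a list of dict 'structs' by one or more field names.

    The last argument may be 'ASC' or 'DESC' (case-insensitive); when it is
    omitted, ascending order is used. Assumption of this sketch: 'DESC'
    reverses the ordering across all listed fields.
    """
    fields = list(args)
    descending = False
    if fields and fields[-1].upper() in ("ASC", "DESC"):
        descending = fields.pop().upper() == "DESC"
    if not fields:
        raise ValueError("at least one field name is required")
    # Compare rows by the tuple of the requested field values, in order.
    return sorted(rows, key=lambda row: tuple(row[f] for f in fields),
                  reverse=descending)


# Sample data modelled on the Employee schema in the description.
employees = [
    {"empId": 100, "empName": "Foo", "age": 20, "salary": 20990},
    {"empId": 500, "empName": "Boo", "age": 30, "salary": 50990},
    {"empId": 700, "empName": "Harry", "age": 25, "salary": 40990},
    {"empId": 100, "empName": "Tom", "age": 35, "salary": 70990},
]

by_salary = sort_array_by(employees, "salary", "ASC")
by_name_then_salary = sort_array_by(employees, "empName", "salary")
```

Here `by_salary` orders the employees as Foo, Harry, Boo, Tom, which matches the output of the first example in the description.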
[jira] [Commented] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464097#comment-15464097 ]

Simanchal Das commented on HIVE-14159:
--------------------------------------

Hi [~cwsteinbach],

I have attached a fresh copy of the patch. Some test cases failed that are not related to this patch.
I have collected the six failures below from the logs.
1. testCliDriver[acid_bucket_pruning]
2. org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
3. ERROR [de3b6d53-97c3-462f-81c2-00ff3fb2cf52 main] ql.Driver: FAILED: HiveAccessControlException Permission denied: Principal [name=user1, type=USER] does not have following privileges for operation DROPTABLE [[OBJECT OWNERSHIP] on Object [type=TABLE_OR_VIEW, name=default.src]]
4. org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAccessControlException: Permission denied: Principal [name=user1, type=USER] does not have following privileges for operation DROPTABLE [[OBJECT OWNERSHIP] on Object [type=TABLE_OR_VIEW, name=default.src]]
5. vector_join_part_col_char.q
6. explainuser_3.q.out

> sorting of tuple array using multiple field[s]
> ----------------------------------------------
>
>                 Key: HIVE-14159
>                 URL: https://issues.apache.org/jira/browse/HIVE-14159
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Simanchal Das
>            Assignee: Simanchal Das
>              Labels: patch
>         Attachments: HIVE-14159.1.patch, HIVE-14159.2.patch,
> HIVE-14159.3.patch, HIVE-14159.4.patch
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Attachment: HIVE-14159.4.patch
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Attachment:     (was: HIVE-14159.4.patch)
[jira] [Commented] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370381#comment-15370381 ]

Simanchal Das commented on HIVE-14159:
--------------------------------------

Hi [~cwsteinbach],

After fixing your review comments I have created a new patch.

Thanks,
Simanchal
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Attachment: HIVE-14159.4.patch
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Status: Open  (was: Patch Available)

> sorting of tuple array using multiple field[s]
> ----------------------------------------------
>
>                 Key: HIVE-14159
>                 URL: https://issues.apache.org/jira/browse/HIVE-14159
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Simanchal Das
>            Assignee: Simanchal Das
>              Labels: patch
>         Attachments: HIVE-14159.1.patch, HIVE-14159.2.patch, HIVE-14159.3.patch
>
>
> Problem Statement:
> When working with complex data structures such as Avro, we often encounter an array that contains multiple tuples, where each tuple has a struct schema.
> Suppose the struct schema is as follows:
> {noformat}
> {
>   "name": "employee",
>   "type": [{
>     "type": "record",
>     "name": "Employee",
>     "namespace": "com.company.Employee",
>     "fields": [{
>       "name": "empId",
>       "type": "int"
>     }, {
>       "name": "empName",
>       "type": "string"
>     }, {
>       "name": "age",
>       "type": "int"
>     }, {
>       "name": "salary",
>       "type": "double"
>     }]
>   }]
> }
> {noformat}
> When a Hive query runs, the complex array then looks like an array of Employee objects.
> {noformat}
> Example:
> // (array<struct<empId:int,empName:string,age:int,salary:double>>)
> Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]
> {noformat}
> When implementing day-to-day business use cases, we often need to sort such a tuple array by specific field[s], such as empId, name, or salary, in ASC or DESC order.
> Proposal:
> I have developed a UDF, 'sort_array_by', which sorts a tuple array by one or more fields in the ASC or DESC order provided by the user; the default is ascending order.
> {noformat}
> Example:
> 1. Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC");
> output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]
>
> 2. Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC");
> output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
>
> 3. Select sort_array_by(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age","ASC");
> output: array[struct(500,Boo,30,50990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
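The multi-field ordering described in the proposal can be modelled outside Hive. The following is a minimal Python sketch of the comparison semantics only, not the UDF's actual implementation. The `Employee` namedtuple, the `sort_structs` helper, and the use of the schema's field names (`empName`, `salary`) are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the Employee struct from the Avro schema.
Employee = namedtuple("Employee", ["empId", "empName", "age", "salary"])


def sort_structs(rows, fields, descending=False):
    """Return rows sorted by the given field names, compared left to right."""
    if not fields:
        raise ValueError("at least one field name is required")
    # Each row is keyed by the tuple of selected fields, so ties on the
    # first field are broken by the second, and so on.
    return sorted(
        rows,
        key=lambda row: tuple(getattr(row, f) for f in fields),
        reverse=descending,
    )


employees = [
    Employee(100, "Foo", 20, 20990),
    Employee(500, "Boo", 30, 80990),
    Employee(500, "Boo", 30, 50990),
    Employee(700, "Harry", 25, 40990),
    Employee(100, "Tom", 35, 70990),
]

# Mirrors example 2: sort by name, then by salary, ascending.
for e in sort_structs(employees, ["empName", "salary"]):
    print(e.empName, e.salary)
# Boo 50990
# Boo 80990
# Foo 20990
# Harry 40990
# Tom 70990
```

In this model a single order applies to all fields, which matches the single trailing ASC/DESC argument in the examples; per-field ordering would need a different key function.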
[jira] [Commented] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367614#comment-15367614 ]

Simanchal Das commented on HIVE-14159:
--------------------------------------

Hi Carl,
Thank you for reviewing this ticket and the RB. I have addressed the review comments as you suggested.
I have also added one optional parameter for the sort order (ASC or DESC); it must be the last parameter of the UDF. If the user does not provide a sort order, the array is sorted in ascending order.
Thanks,
Simanchal
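The optional trailing sort-order parameter described in the comment above can be sketched as a small argument-splitting step. This is an illustration under stated assumptions, not the patch's code: the helper name `split_sort_args` is hypothetical, and treating the order keyword case-insensitively is an assumption that the thread does not confirm.

```python
def split_sort_args(args):
    """Split UDF-style arguments into (field_names, descending).

    A trailing "ASC" or "DESC" is consumed as the sort order; when it is
    absent, the order defaults to ascending.
    """
    fields = list(args)
    descending = False
    # Assumption: the order keyword is matched case-insensitively.
    if fields and fields[-1].upper() in ("ASC", "DESC"):
        descending = fields.pop().upper() == "DESC"
    if not fields:
        raise ValueError("at least one field name is required")
    return fields, descending


print(split_sort_args(["Name", "Salary"]))          # (['Name', 'Salary'], False)
print(split_sort_args(["Name", "Salary", "DESC"]))  # (['Name', 'Salary'], True)
```

One consequence of this design is that a struct field literally named "asc" or "desc" could not be used as the last sort key without also passing an explicit order.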
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Attachment: HIVE-14159.3.patch

renamed the udf to sort_array_by
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Attachment:     (was: HIVE-14159.3.patch)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Attachment: HIVE-14159.3.patch
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simanchal Das updated HIVE-14159:
---------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-14159: - Description: Problem Statement: When we are working with complex structure of data like avro. Most of the times we are encountering array contains multiple tuples and each tuple have struct schema. Suppose here struct schema is like below: {noformat} { "name": "employee", "type": [{ "type": "record", "name": "Employee", "namespace": "com.company.Employee", "fields": [{ "name": "empId", "type": "int" }, { "name": "empName", "type": "string" }, { "name": "age", "type": "int" }, { "name": "salary", "type": "double" }] }] } {noformat} Then while running our hive query complex array looks like array of employee objects. {noformat} Example: //(array>) Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)] {noformat} When we are implementing business use cases day to day life we are encountering problems like sorting a tuple array by specific field[s] like empId,name,salary,etc by ASC or DESC order. Proposal: I have developed a udf 'sort_array_by' which will sort a tuple array by one or more fields in ASC or DESC order provided by user ,default is ascending order . 
{noformat} Example: 1.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC"); output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)] 2.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC"); output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)] 3.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age,"ASC"); output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)] {noformat} was: Problem Statement: When we are working with complex structure of data like avro. Most of the times we are encountering array contains multiple tuples and each tuple have struct schema. Suppose here struct schema is like below: {noformat} { "name": "employee", "type": [{ "type": "record", "name": "Employee", "namespace": "com.company.Employee", "fields": [{ "name": "empId", "type": "int" }, { "name": "empName", "type": "string" }, { "name": "age", "type": "int" }, { "name": "salary", "type": "double" }] }] } {noformat} Then while running our hive query complex array looks like array of employee objects. {noformat} Example: //(array>) Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)] {noformat} When we are implementing business use cases day to day life we are encountering problems like sorting a tuple array by specific field[s] like empId,name,salary,etc by ASC or DESC order. 
Proposal: I have developed a udf 'sort_array_by' which will sort a tuple array by one or more fields in ASC or DESC order provided by user ,default is ascending order order. {noformat} Example: 1.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC"); output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)] 2.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC"); output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)] 3.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-14159: - Description: Problem Statement: When we are working with complex structure of data like avro. Most of the times we are encountering array contains multiple tuples and each tuple have struct schema. Suppose here struct schema is like below: {noformat} { "name": "employee", "type": [{ "type": "record", "name": "Employee", "namespace": "com.company.Employee", "fields": [{ "name": "empId", "type": "int" }, { "name": "empName", "type": "string" }, { "name": "age", "type": "int" }, { "name": "salary", "type": "double" }] }] } {noformat} Then while running our hive query complex array looks like array of employee objects. {noformat} Example: //(array>) Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)] {noformat} When we are implementing business use cases day to day life we are encountering problems like sorting a tuple array by specific field[s] like empId,name,salary,etc by ASC or DESC order. Proposal: I have developed a udf 'sort_array_by' which will sort a tuple array by one or more fields in ASC or DESC order provided by user ,default is ascending order order. 
{noformat} Example: 1.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary","ASC"); output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)] 2.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","ASC"); output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)] 3.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age,"ASC"); output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)] {noformat} was: Problem Statement: When we are working with complex structure of data like avro. Most of the times we are encountering array contains multiple tuples and each tuple have struct schema. Suppose here struct schema is like below: {noformat} { "name": "employee", "type": [{ "type": "record", "name": "Employee", "namespace": "com.company.Employee", "fields": [{ "name": "empId", "type": "int" }, { "name": "empName", "type": "string" }, { "name": "age", "type": "int" }, { "name": "salary", "type": "double" }] }] } {noformat} Then while running our hive query complex array looks like array of employee objects. {noformat} Example: //(array>) Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)] {noformat} When we are implementing business use cases day to day life we are encountering problems like sorting a tuple array by specific field[s] like empId,name,salary,etc. Proposal: I have developed a udf 'sort_array_field' which will sort a tuple array by one or more fields in naural order. 
{noformat} Example: 1.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary"); output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)] 2.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary"); output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)] 3.Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age); output: array[struct(50
[jira] [Commented] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365620#comment-15365620 ]

Simanchal Das commented on HIVE-14159:
--------------------------------------

Hi [~ashutoshc],
Could you please review this and provide your comments?
RB Link: https://reviews.apache.org/r/49619/
Thanks

> sorting of tuple array using multiple field[s]
> ----------------------------------------------
>
>                 Key: HIVE-14159
>                 URL: https://issues.apache.org/jira/browse/HIVE-14159
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Simanchal Das
>            Assignee: Simanchal Das
>              Labels: patch
>         Attachments: HIVE-14159.1.patch, HIVE-14159.2.patch
>
>
> Problem Statement:
> When working with complex data structures such as Avro, we often encounter an array that contains multiple tuples, where each tuple has a struct schema.
> Suppose the struct schema is as follows:
> {noformat}
> {
>   "name": "employee",
>   "type": [{
>     "type": "record",
>     "name": "Employee",
>     "namespace": "com.company.Employee",
>     "fields": [{
>       "name": "empId",
>       "type": "int"
>     }, {
>       "name": "empName",
>       "type": "string"
>     }, {
>       "name": "age",
>       "type": "int"
>     }, {
>       "name": "salary",
>       "type": "double"
>     }]
>   }]
> }
> {noformat}
> When a Hive query runs, the complex array then looks like an array of Employee objects.
> {noformat}
> Example:
> // (array<struct<empId:int,empName:string,age:int,salary:double>>)
> Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]
> {noformat}
> When implementing day-to-day business use cases, we often need to sort such a tuple array by specific field[s], such as empId, name, or salary.
> Proposal:
> I have developed a UDF, 'sort_array_field', which sorts a tuple array by one or more fields in natural order.
> {noformat}
> Example:
> 1. Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary");
> output: array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]
>
> 2. Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary");
> output: array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
>
> 3. Select sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age");
> output: array[struct(500,Boo,30,50990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365616#comment-15365616 ] Simanchal Das commented on HIVE-14159: -- RB Link: https://reviews.apache.org/r/49619/
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-14159: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-14159: - Attachment: HIVE-14159.2.patch
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-14159: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-14159: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-14159) sorting of tuple array using multiple field[s]
[ https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simanchal Das updated HIVE-14159: - Attachment: HIVE-14159.1.patch