[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939351#comment-15939351 ]

Caleb Jones commented on HIVE-15211:

Ah, sorry, I meant I hit this with Hive < 2.2.0. I have not yet tried with 2.2.0.

> Provide support for complex expressions in ON clauses for INNER joins
> ---------------------------------------------------------------------
>
>                 Key: HIVE-15211
>                 URL: https://issues.apache.org/jira/browse/HIVE-15211
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO, Parser
>    Affects Versions: 2.2.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>              Labels: TODOC2.2
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15211.01.patch, HIVE-15211.patch
>
>
> Currently, we have some restrictions on the predicates that we can use in ON
> clauses for inner joins (we have those restrictions for outer joins too, but
> we will tackle that in a follow-up). Semantically equivalent queries can be
> expressed if the predicate is introduced in the WHERE clause, but we would
> like users to be able to express it in either the ON or the WHERE clause, as
> in standard SQL.
> This patch is an extension to overcome these restrictions for inner joins.
> It will allow users to write queries that currently fail in Hive, such as:
> {code:sql}
> -- Disjunctions
> SELECT *
> FROM src1 JOIN src
> ON (src1.key=src.key
>   OR src1.value between 100 and 102
>   OR src.value between 100 and 102)
> LIMIT 10;
> -- Conjunction with references to multiple inputs on one side
> SELECT *
> FROM src1 JOIN src
> ON (src1.key+src.key >= 100
>   AND src1.key+src.key <= 102)
> LIMIT 10;
> -- Conjunct with no references
> SELECT *
> FROM src1 JOIN src
> ON (src1.value between 100 and 102
>   AND src.value between 100 and 102
>   AND true)
> LIMIT 10;
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
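
The issue description says that semantically equivalent queries can be expressed by moving the predicate into the WHERE clause. A minimal sketch of that rewrite for the disjunction example follows. It assumes the same src1 and src test tables, and that the installation permits cartesian products (an assumption: strict-mode settings may reject a join without a join condition). It is an illustration only, not taken from the patch or its tests.

{code:sql}
-- Sketch: the disjunction example with the predicate moved to WHERE.
-- For an inner join, filtering in WHERE selects the same rows as the
-- same predicate in ON.
SELECT *
FROM src1 CROSS JOIN src
WHERE src1.key = src.key
   OR src1.value BETWEEN 100 AND 102
   OR src.value BETWEEN 100 AND 102
LIMIT 10;
{code}

Because there is no ORDER BY, LIMIT 10 may return a different subset of the matching rows from one form or one run to the next. The equivalence holds only for inner joins; for outer joins, moving a predicate between ON and WHERE changes the result.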
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939314#comment-15939314 ]

Jesus Camacho Rodriguez commented on HIVE-15211:

[~calebjones], have you tried with the latest master? You should not hit that limitation anymore, as we should support any arbitrary condition. Please let me know if you do, and which error you hit.
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939303#comment-15939303 ]

Caleb Jones commented on HIVE-15211:

Will UDFs be supported in complex expressions in the ON clause? I hit this limitation when I wanted to join two tables based on the intersection of their array columns.

{noformat}
create table tbl_a (
  val string,
  ids array
);
create table tbl_b (
  val string,
  ids array
);
add jar hdfs:///brickhouse-0.7.1-SNAPSHOT.jar;
select a.val, b.val
from tbl_a as a
join tbl_b as b
on (size(intersect_array(a.ids, b.ids)) > 0);
{noformat}
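
For Hive versions without this patch, the WHERE-clause rewrite mentioned in the issue description should apply to the UDF case as well. The sketch below is a hedged illustration rather than a tested workaround. It assumes tbl_a and tbl_b exist as in the comment above, that intersect_array has already been registered as a function from the Brickhouse jar (the registration statement is not shown in the thread and is not reproduced here), and that cartesian products are permitted by the session settings.

{code:sql}
-- Sketch: the UDF predicate moved from ON to WHERE.
-- The join itself has no condition, so every pair of rows is produced
-- and then filtered by the array-intersection test.
SELECT a.val, b.val
FROM tbl_a a CROSS JOIN tbl_b b
WHERE size(intersect_array(a.ids, b.ids)) > 0;
{code}

Note that this predicate contains no equality condition, so it is evaluated over the full cross product of the two tables, which can be expensive when both are large.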
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15754567#comment-15754567 ]

Jesus Camacho Rodriguez commented on HIVE-15211:

[~leftylev], I added some documentation for the join extensions in https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins.
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15689685#comment-15689685 ]

Jesus Camacho Rodriguez commented on HIVE-15211:

[~leftylev], I want to be done with HIVE-15251 before doing it, but it certainly should. Thanks
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15689682#comment-15689682 ]

Lefty Leverenz commented on HIVE-15211:

Should this be documented in the wiki?
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15684335#comment-15684335 ]

Ashutosh Chauhan commented on HIVE-15211:

I noted the catch-all condition for outer join checks. Still, I prefer to have the checks at the point where they matter, since we know exactly what limitations we have. However, since you are making all those restrictions go away, this won't matter anyway. +1
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15683682#comment-15683682 ]

Hive QA commented on HIVE-15211:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12839786/HIVE-15211.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10682 tests executed
*Failed tests:*
{noformat}
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=158)
	[scriptfile1.q,vector_outer_join5.q,file_with_header_footer.q,bucket4.q,input16_cc.q,bucket5.q,infer_bucket_sort_merge.q,constprog_partitioner.q,orc_merge2.q,reduce_deduplicate.q,schemeAuthority2.q,load_fs2.q,orc_merge8.q,orc_merge_incompat2.q,infer_bucket_sort_bucketed_table.q,vector_outer_join4.q,disable_merge_for_bucketing.q,vector_inner_join.q,orc_merge7.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=110)
	[vectorization_16.q,load_dyn_part5.q,join_casesensitive.q,transform_ppr2.q,join23.q,groupby7_map_skew.q,ppd_outer_join5.q,create_merge_compressed.q,louter_join_ppr.q,sample9.q,smb_mapjoin_16.q,vectorization_not.q,having.q,ppd_outer_join1.q,union_remove_12.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=96)
	[groupby_map_ppr.q,nullgroup4_multi_distinct.q,join_rc.q,union14.q,smb_mapjoin_12.q,vector_cast_constant.q,union_remove_4.q,auto_join11.q,load_dyn_part7.q,udaf_collect_set.q,vectorization_12.q,groupby_sort_skew_1.q,groupby_sort_skew_1_23.q,smb_mapjoin_25.q,skewjoinopt12.q]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=91)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[join45] (batchId=83)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2223/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2223/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2223/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12839786 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15683494#comment-15683494 ]

Jesus Camacho Rodriguez commented on HIVE-15211:

[~ashutoshc], new patch addresses your comments and adds the new test case.
[jira] [Commented] (HIVE-15211) Provide support for complex expressions in ON clauses for INNER joins
[ https://issues.apache.org/jira/browse/HIVE-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668674#comment-15668674 ]

Hive QA commented on HIVE-15211:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12839043/HIVE-15211.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10696 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] (batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2131/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2131/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2131/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12839043 - PreCommit-HIVE-Build