[jira] [Updated] (HIVE-17139) Conditional expressions optimization: skip the expression evaluation if the condition is not satisfied for vectorization engine.
[ https://issues.apache.org/jira/browse/HIVE-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Jia updated HIVE-17139: -- Attachment: HIVE-17139.8.patch
> Conditional expressions optimization: skip the expression evaluation if the condition is not satisfied for vectorization engine.
>
> Key: HIVE-17139
> URL: https://issues.apache.org/jira/browse/HIVE-17139
> Project: Hive
> Issue Type: Improvement
> Reporter: Ke Jia
> Assignee: Ke Jia
> Attachments: HIVE-17139.1.patch, HIVE-17139.2.patch, HIVE-17139.3.patch, HIVE-17139.4.patch, HIVE-17139.5.patch, HIVE-17139.6.patch, HIVE-17139.7.patch, HIVE-17139.8.patch
>
> The CASE WHEN and IF expression evaluation in Hive's vectorization engine is not optimal: the current implementation evaluates all of the conditional and ELSE expressions for every row. The optimized approach is to update the selected array of the batch parameter after the conditional expression is executed, so that the ELSE expression is evaluated only over the selected rows instead of over all rows.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
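The idea in the description above can be sketched concretely. The following is a hypothetical Python model of selected-row evaluation, not Hive's actual implementation (which is Java inside the vectorized expression classes); the batch and expression representations here are illustrative only.

```python
# Hypothetical sketch of the HIVE-17139 idea: rather than evaluating the
# THEN and ELSE expressions over every row of a vectorized batch, narrow
# the set of "selected" rows so each branch touches only the rows it
# actually applies to.

def eval_case_when(batch, cond, then_expr, else_expr):
    """Evaluate CASE WHEN cond THEN then_expr ELSE else_expr over a batch."""
    selected = range(len(batch))                 # rows in play for this batch
    out = [None] * len(batch)

    matched = [i for i in selected if cond(batch[i])]
    rest = [i for i in selected if not cond(batch[i])]

    for i in matched:                            # THEN: only matching rows
        out[i] = then_expr(batch[i])
    for i in rest:                               # ELSE: only the remainder,
        out[i] = else_expr(batch[i])             # not the whole batch again
    return out

# CASE WHEN x % 2 = 0 THEN x * 10 ELSE -x END over a four-row batch
result = eval_case_when([1, 2, 3, 4],
                        lambda r: r % 2 == 0,
                        lambda r: r * 10,
                        lambda r: -r)
```

With the unoptimized approach both branch expressions would run over all four rows; here each branch runs over exactly the two rows it applies to.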
[jira] [Work started] (HIVE-17475) Disable mapjoin using hint
[ https://issues.apache.org/jira/browse/HIVE-17475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-17475 started by Deepak Jaiswal. -
> Disable mapjoin using hint
> --
>
> Key: HIVE-17475
> URL: https://issues.apache.org/jira/browse/HIVE-17475
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
>
> Use a hint to disable mapjoin for a given query.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17475) Disable mapjoin using hint
[ https://issues.apache.org/jira/browse/HIVE-17475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal reassigned HIVE-17475: - -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17456) Set current database for external LLAP interface
[ https://issues.apache.org/jira/browse/HIVE-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156525#comment-16156525 ] Hive QA commented on HIVE-17456: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885648/HIVE-17456.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11028 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6705/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6705/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6705/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12885648 - PreCommit-HIVE-Build > Set current database for external LLAP interface > > > Key: HIVE-17456 > URL: https://issues.apache.org/jira/browse/HIVE-17456 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17456.1.patch, HIVE-17456.2.patch > > > Currently the query passed in to external LLAP client has the default DB as > the current database. 
> Allow user to specify a different current database. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17474) Different logical plan of same query(TPC-DS/70) with same settings
[ https://issues.apache.org/jira/browse/HIVE-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated HIVE-17474: Summary: Different logical plan of same query(TPC-DS/70) with same settings (was: Different physical plan of same query(TPC-DS/70) on HOS)
> Different logical plan of same query(TPC-DS/70) with same settings
> --
>
> Key: HIVE-17474
> URL: https://issues.apache.org/jira/browse/HIVE-17474
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang_intel
>
> In [DS/query70|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query70.sql], on Hive version (d3b88f6), I found that the physical plan differs between runs even with the same settings.
> Sometimes the physical plan is:
> {code}
> TS[0]-FIL[63]-SEL[2]-RS[43]-JOIN[45]-RS[46]-JOIN[48]-SEL[49]-GBY[50]-RS[51]-GBY[52]-SEL[53]-RS[54]-SEL[55]-PTF[56]-SEL[57]-RS[59]-SEL[60]-LIM[61]-FS[62]
> TS[3]-FIL[64]-SEL[5]-RS[44]-JOIN[45]
> TS[6]-FIL[65]-SEL[8]-RS[39]-JOIN[41]-RS[47]-JOIN[48]
> TS[9]-FIL[67]-SEL[11]-RS[18]-JOIN[20]-RS[21]-JOIN[23]-SEL[24]-GBY[25]-RS[26]-GBY[27]-RS[29]-SEL[30]-PTF[31]-FIL[66]-SEL[32]-GBY[38]-RS[40]-JOIN[41]
> TS[12]-FIL[68]-SEL[14]-RS[19]-JOIN[20]
> TS[15]-FIL[69]-SEL[17]-RS[22]-JOIN[23]
> {code}
> TS\[6\] connects with TS\[9\] on JOIN\[41\] and connects with TS\[0\] on JOIN\[48\].
> Sometimes it is:
> {code}
> TS[0]-FIL[63]-RS[3]-JOIN[6]-RS[8]-JOIN[11]-RS[41]-JOIN[44]-SEL[46]-GBY[47]-RS[48]-GBY[49]-RS[50]-GBY[51]-RS[52]-SEL[53]-PTF[54]-SEL[55]-RS[57]-SEL[58]-LIM[59]-FS[60]
> TS[1]-FIL[64]-RS[5]-JOIN[6]
> TS[2]-FIL[65]-RS[10]-JOIN[11]
> TS[12]-FIL[68]-RS[16]-JOIN[19]-RS[20]-JOIN[23]-FIL[67]-SEL[25]-GBY[26]-RS[27]-GBY[28]-RS[29]-GBY[30]-RS[31]-SEL[32]-PTF[33]-FIL[66]-SEL[34]-GBY[39]-RS[43]-JOIN[44]
> TS[13]-FIL[69]-RS[18]-JOIN[19]
> TS[14]-FIL[70]-RS[22]-JOIN[23]
> {code}
> TS\[2\] connects with TS\[0\] on JOIN\[11\].
> Although TS\[2\] and TS\[6\] have different operator ids, they both correspond to the same table scan in the query.
> The difference causes a different Spark execution plan and different execution times. I'm confused about why the same settings produce different physical plans. Does anyone know where to investigate the root cause?
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17474) Different logical plan of same query(TPC-DS/70) with same settings
[ https://issues.apache.org/jira/browse/HIVE-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated HIVE-17474: Description: in [DS/query70|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query70.sql]. On hive version(d3b88f6), i found that the logical plan is different in runtime with the same settings. sometimes the logical plan {code} TS[0]-FIL[63]-SEL[2]-RS[43]-JOIN[45]-RS[46]-JOIN[48]-SEL[49]-GBY[50]-RS[51]-GBY[52]-SEL[53]-RS[54]-SEL[55]-PTF[56]-SEL[57]-RS[59]-SEL[60]-LIM[61]-FS[62] TS[3]-FIL[64]-SEL[5]-RS[44]-JOIN[45] TS[6]-FIL[65]-SEL[8]-RS[39]-JOIN[41]-RS[47]-JOIN[48] TS[9]-FIL[67]-SEL[11]-RS[18]-JOIN[20]-RS[21]-JOIN[23]-SEL[24]-GBY[25]-RS[26]-GBY[27]-RS[29]-SEL[30]-PTF[31]-FIL[66]-SEL[32]-GBY[38]-RS[40]-JOIN[41] TS[12]-FIL[68]-SEL[14]-RS[19]-JOIN[20] TS[15]-FIL[69]-SEL[17]-RS[22]-JOIN[23] {code} TS\[6\] connects with TS\[9\] on JOIN\[41\] and connects with TS\[0\] on JOIN\[48\]. sometimes {code} TS[0]-FIL[63]-RS[3]-JOIN[6]-RS[8]-JOIN[11]-RS[41]-JOIN[44]-SEL[46]-GBY[47]-RS[48]-GBY[49]-RS[50]-GBY[51]-RS[52]-SEL[53]-PTF[54]-SEL[55]-RS[57]-SEL[58]-LIM[59]-FS[60] TS[1]-FIL[64]-RS[5]-JOIN[6] TS[2]-FIL[65]-RS[10]-JOIN[11] TS[12]-FIL[68]-RS[16]-JOIN[19]-RS[20]-JOIN[23]-FIL[67]-SEL[25]-GBY[26]-RS[27]-GBY[28]-RS[29]-GBY[30]-RS[31]-SEL[32]-PTF[33]-FIL[66]-SEL[34]-GBY[39]-RS[43]-JOIN[44] TS[13]-FIL[69]-RS[18]-JOIN[19] TS[14]-FIL[70]-RS[22]-JOIN[23] {code} TS\[2\] connects with TS\[0\] on JOIN\[11\] Although TS\[2\] and TS\[6\] has different operator id, they are table store in the query. The difference causes different spark execution plan and different execution time. I'm very confused why there are different logical plan with same setting. Can anyone know where to investigate the root cause? was: in [DS/query70|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query70.sql]. 
On hive version(d3b88f6), i found that the physical plan is different in runtime with the same settings. sometimes the physical plan {code} TS[0]-FIL[63]-SEL[2]-RS[43]-JOIN[45]-RS[46]-JOIN[48]-SEL[49]-GBY[50]-RS[51]-GBY[52]-SEL[53]-RS[54]-SEL[55]-PTF[56]-SEL[57]-RS[59]-SEL[60]-LIM[61]-FS[62] TS[3]-FIL[64]-SEL[5]-RS[44]-JOIN[45] TS[6]-FIL[65]-SEL[8]-RS[39]-JOIN[41]-RS[47]-JOIN[48] TS[9]-FIL[67]-SEL[11]-RS[18]-JOIN[20]-RS[21]-JOIN[23]-SEL[24]-GBY[25]-RS[26]-GBY[27]-RS[29]-SEL[30]-PTF[31]-FIL[66]-SEL[32]-GBY[38]-RS[40]-JOIN[41] TS[12]-FIL[68]-SEL[14]-RS[19]-JOIN[20] TS[15]-FIL[69]-SEL[17]-RS[22]-JOIN[23] {code} TS\[6\] connects with TS\[9\] on JOIN\[41\] and connects with TS\[0\] on JOIN\[48\]. sometimes {code} TS[0]-FIL[63]-RS[3]-JOIN[6]-RS[8]-JOIN[11]-RS[41]-JOIN[44]-SEL[46]-GBY[47]-RS[48]-GBY[49]-RS[50]-GBY[51]-RS[52]-SEL[53]-PTF[54]-SEL[55]-RS[57]-SEL[58]-LIM[59]-FS[60] TS[1]-FIL[64]-RS[5]-JOIN[6] TS[2]-FIL[65]-RS[10]-JOIN[11] TS[12]-FIL[68]-RS[16]-JOIN[19]-RS[20]-JOIN[23]-FIL[67]-SEL[25]-GBY[26]-RS[27]-GBY[28]-RS[29]-GBY[30]-RS[31]-SEL[32]-PTF[33]-FIL[66]-SEL[34]-GBY[39]-RS[43]-JOIN[44] TS[13]-FIL[69]-RS[18]-JOIN[19] TS[14]-FIL[70]-RS[22]-JOIN[23] {code} TS\[2\] connects with TS\[0\] on JOIN\[11\] Although TS\[2\] and TS\[6\] has different operator id, they are table store in the query. The difference causes different spark execution plan and different execution time. I'm very confused why there are different physical plan with same setting. Can anyone know where to investigate the root cause? > Different logical plan of same query(TPC-DS/70) with same settings > -- > > Key: HIVE-17474 > URL: https://issues.apache.org/jira/browse/HIVE-17474 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel > > in > [DS/query70|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query70.sql]. > On hive version(d3b88f6), i found that the logical plan is different in > runtime with the same settings. 
> sometimes the logical plan > {code} > TS[0]-FIL[63]-SEL[2]-RS[43]-JOIN[45]-RS[46]-JOIN[48]-SEL[49]-GBY[50]-RS[51]-GBY[52]-SEL[53]-RS[54]-SEL[55]-PTF[56]-SEL[57]-RS[59]-SEL[60]-LIM[61]-FS[62] > TS[3]-FIL[64]-SEL[5]-RS[44]-JOIN[45] > TS[6]-FIL[65]-SEL[8]-RS[39]-JOIN[41]-RS[47]-JOIN[48] > TS[9]-FIL[67]-SEL[11]-RS[18]-JOIN[20]-RS[21]-JOIN[23]-SEL[24]-GBY[25]-RS[26]-GBY[27]-RS[29]-SEL[30]-PTF[31]-FIL[66]-SEL[32]-GBY[38]-RS[40]-JOIN[41] > TS[12]-FIL[68]-SEL[14]-RS[19]-JOIN[20] > TS[15]-FIL[69]-SEL[17]-RS[22]-JOIN[23] > {code} > TS\[6\] connects with TS\[9\] on JOIN\[41\] and connects with TS\[0\] on > JOIN\[48\]. > sometimes > {code} > TS[0]-FIL[63]-RS[3]-JOIN[6]-RS[8]-JOIN[11]-RS[41]-JOIN[44]-SEL[46]-GBY[47]-RS[48]-GBY[49]-RS[50]-GBY[51]-RS[52]-SEL[53]-PTF[54]-SEL[55]-RS[57]-SEL[58]-LIM[59]-FS[60] > TS[1]-FIL[64]-RS[5]-JOIN[6] > TS[2]-FIL[65]-RS[10]-JOIN[11] > TS[12]-FIL[68]-RS[16]-JOIN[19]-RS[20]-JOIN[23]-FIL[67
[jira] [Comment Edited] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156479#comment-16156479 ] Chaozhong Yang edited comment on HIVE-17460 at 9/7/17 5:14 AM: --- `CASCADE` works well for me, I will close this issue. Thanks [~mmccline] [~wzheng] was (Author: debugger87): `CASCADE` works for me, I will close this issue. Thanks [~mmccline] [~wzheng]
> `insert overwrite` should support table schema evolution (e.g. add columns)
> ---
>
> Key: HIVE-17460
> URL: https://issues.apache.org/jira/browse/HIVE-17460
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.1.0, 2.2.0
> Reporter: Chaozhong Yang
> Assignee: Chaozhong Yang
> Fix For: 3.0.0
>
> Attachments: HIVE-17460.2.patch, HIVE-17460.patch
>
> In Hive, adding columns to an existing table is a common use case. However, if we insert overwrite older partitions after adding columns, the added columns are not populated.
> ```
> create table src_table(
>   i int
> )
> PARTITIONED BY (`date` string);
> insert overwrite table src_table partition(`date`='20170905') values (3);
> select * from src_table where `date` = '20170905';
> alter table src_table add columns (bi bigint);
> insert overwrite table src_table partition(`date`='20170905') values (3, 5);
> select * from src_table where `date` = '20170905';
> ```
> The result will be as follows:
> ```
> 3, NULL, '20170905'
> ```
> Obviously, it doesn't meet our expectation. The expected result should be:
> ```
> 3, 5, '20170905'
> ```
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaozhong Yang updated HIVE-17460: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156479#comment-16156479 ] Chaozhong Yang commented on HIVE-17460: --- `CASCADE` works for me, I will close this issue. Thanks [~mmccline] [~wzheng] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156457#comment-16156457 ] Chaozhong Yang commented on HIVE-17460: --- [~wzheng] [~mmccline] Thanks for your suggestion, I will try CASCADE in DDL. If everything goes right, I will close this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156449#comment-16156449 ] Hive QA commented on HIVE-17468: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885683/HIVE-17468.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11028 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) org.apache.hadoop.hive.druid.TestHiveDruidQueryBasedInputFormat.testTimeZone (batchId=247) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6704/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6704/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6704/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12885683 - PreCommit-HIVE-Build
> Shade and package appropriate jackson version for druid storage handler
> ---
>
> Key: HIVE-17468
> URL: https://issues.apache.org/jira/browse/HIVE-17468
> Project: Hive
> Issue Type: Bug
> Reporter: slim bouguerra
> Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-17468.patch, hive-druid-deps.txt
>
> Currently we are excluding all the jackson core dependencies coming from druid. This is wrong in my opinion, since it leads to packaging an unwanted jackson library from other projects. As you can see in the file hive-druid-deps.txt, jackson core currently comes from Calcite, and its version is 2.6.3, which is very different from the 2.4.6 used by Druid. This patch excludes the unwanted jars and makes sure to bring in the jackson dependency from Druid itself.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156423#comment-16156423 ] Wei Zheng commented on HIVE-17460: -- [~debugger87] I discussed with Matt regarding this issue as he is the domain expert for schema evolution. He's saying you can achieve what you want by adding CASCADE in your DDL. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
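The `ADD COLUMNS ... CASCADE` behaviour discussed in this thread can be modelled in a few lines. The following is a toy Python simulation, not Hive or metastore code: the `Table` class and its methods are invented for illustration, under the assumption (matching Hive's documented DDL semantics) that a schema is kept per partition as well as per table, and that without CASCADE an ALTER TABLE changes only the table-level schema, leaving existing partitions on their old column list.

```python
# Toy model (not metastore code) of why the insert-overwrite in this
# thread showed NULL for the added column: each partition carries its
# own schema, and ADD COLUMNS without CASCADE updates only the table.

class Table:
    def __init__(self, cols):
        self.cols = list(cols)
        self.partitions = {}                 # partition value -> column list

    def add_partition(self, value):
        # A new partition snapshots the table schema at creation time.
        self.partitions[value] = list(self.cols)

    def add_columns(self, new_cols, cascade=False):
        self.cols.extend(new_cols)
        if cascade:                          # CASCADE propagates the change
            for part_cols in self.partitions.values():
                part_cols.extend(new_cols)

no_cascade = Table(["i"])
no_cascade.add_partition("20170905")
no_cascade.add_columns(["bi"])               # partition schema stays stale

with_cascade = Table(["i"])
with_cascade.add_partition("20170905")
with_cascade.add_columns(["bi"], cascade=True)  # partition picks up "bi"
```

In the stale case a reader of partition `20170905` sees only column `i`, which is the `3, NULL, '20170905'` symptom from the issue description.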
[jira] [Commented] (HIVE-17429) Hive JDBC doesn't return rows when querying Impala
[ https://issues.apache.org/jira/browse/HIVE-17429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156404#comment-16156404 ] Aihua Xu commented on HIVE-17429: - The tests don't look related to the change. +1. > Hive JDBC doesn't return rows when querying Impala > -- > > Key: HIVE-17429 > URL: https://issues.apache.org/jira/browse/HIVE-17429 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 2.1.0 >Reporter: Zach Amsden >Assignee: Zach Amsden > Fix For: 2.1.0 > > Attachments: HIVE-17429.1.patch, HIVE-17429.2.patch > > > The Hive JDBC driver used to return a result set when querying Impala. Now, > instead, it gets data back but interprets the data as query logs instead of a > resultSet. This causes many issues (we see complaints about beeline as well > as test failures). > This appears to be a regression introduced with asynchronous operation > against Hive. > Ideally, we could make both behaviors work. I have a simple patch that > should fix the problem. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156402#comment-16156402 ] Chaozhong Yang commented on HIVE-17460: --- [~mmccline] Maybe you are right. However, Spark SQL does the right thing and meets our expectation: after we added columns to the original table and insert-overwrote some existing partitions, Spark SQL fetches all column values, whereas Hive does not. Is there a proper solution? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17466) Metastore API to list unique partition-key-value combinations
[ https://issues.apache.org/jira/browse/HIVE-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17466: Status: Patch Available (was: Open) > Metastore API to list unique partition-key-value combinations > - > > Key: HIVE-17466 > URL: https://issues.apache.org/jira/browse/HIVE-17466 > Project: Hive > Issue Type: New Feature > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-17466.1.patch > > > Raising this on behalf of [~thiruvel], who wrote this initially as part of a > tangential "data-discovery" system. > Programs like Apache Oozie, Apache Falcon (or Yahoo GDM), etc. launch > workflows based on the availability of table/partitions. Partitions are > currently discovered by listing partitions using (what boils down to) > {{HiveMetaStoreClient.listPartitions()}}. This can be slow and cumbersome, > given that {{Partition}} objects are heavyweight and carry redundant > information. The alternative is to use partition-names, which will need > client-side parsing to extract part-key values. > When checking which hourly partitions for a particular day have been > published already, it would be preferable to have an API that pushed down > part-key extraction into the {{RawStore}} layer, and returned key-values as > the result. This would be similar to how {{SELECT DISTINCT part_key FROM > my_table;}} would run, but at the {{HiveMetaStoreClient}} level. > Here's what we've been using at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
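The client-side parsing that the description above wants to avoid can be illustrated with a small sketch. This is a hypothetical helper, not the proposed metastore API: it shows the per-name key-value extraction that, today, every caller of `listPartitionNames`-style APIs must do itself, and that HIVE-17466 proposes to push down into the `RawStore` layer (analogous to `SELECT DISTINCT part_key FROM my_table;`).

```python
# Hypothetical client-side helper: collect the distinct values of one
# partition key from Hive partition names of the form
# "date=20170906/hour=01". The proposed API would return these values
# directly from the metastore instead.

def distinct_part_key_values(partition_names, key):
    """Return the sorted distinct values of `key` across partition names."""
    values = set()
    for name in partition_names:
        for kv in name.split("/"):           # one "k=v" segment per part key
            k, _, v = kv.partition("=")
            if k == key:
                values.add(v)
    return sorted(values)

names = [
    "date=20170906/hour=00",
    "date=20170906/hour=01",
    "date=20170907/hour=00",
]
days = distinct_part_key_values(names, "date")
```

For the hourly-publication check described in the issue, a workflow scheduler would ask for the distinct `hour` values under one `date` rather than fetching heavyweight `Partition` objects.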
[jira] [Commented] (HIVE-17387) implement Tez AM registry in Hive
[ https://issues.apache.org/jira/browse/HIVE-17387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156383#comment-16156383 ] Hive QA commented on HIVE-17387: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885665/HIVE-17387.01.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 37 failed/errored test(s), 11028 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testChangeGuaranteedTotal (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testConcurrentUpdateWithError (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testConcurrentUpdates (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testConcurrentUpdatesBeforeMessage (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityDelayedAllocation (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityFallbackToNonLocal (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorDelayedAllocation (batchId=245) 
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedQueeTaskSelectionAfterScheduled (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForceLocalityTest1 (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityMultiplePreemptionsSameHost1 (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityMultiplePreemptionsSameHost2 (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityNotInDelayedQueue (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityPreemption (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityUnknownHost (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testGuaranteedScheduling (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testGuaranteedTransfer (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testHostPreferenceMissesConsistentPartialAlive (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testHostPreferenceMissesConsistentRollover (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testHostPreferenceUnknownAndNotSpecified (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testNoForceLocalityCounterTest1 (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testNoLocalityNotInDelayedQueue (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testNodeDisabled (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testNodeReEnabled (batchId=245) 
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testPreemption (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testPreemptionChoiceTimeOrdering (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testSimpleLocalAllocation (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testSimpleNoLocalityAllocation (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testUpdateOnFinishingTask (batchId=245) org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testUpdateWithError (batchId=245) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6702/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6702/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6702/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hi
[jira] [Updated] (HIVE-17466) Metastore API to list unique partition-key-value combinations
[ https://issues.apache.org/jira/browse/HIVE-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17466: Attachment: (was: HIVE-17466.1.patch) > Metastore API to list unique partition-key-value combinations > - > > Key: HIVE-17466 > URL: https://issues.apache.org/jira/browse/HIVE-17466 > Project: Hive > Issue Type: New Feature > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-17466.1.patch > > > Raising this on behalf of [~thiruvel], who wrote this initially as part of a > tangential "data-discovery" system. > Programs like Apache Oozie, Apache Falcon (or Yahoo GDM), etc. launch > workflows based on the availability of table/partitions. Partitions are > currently discovered by listing partitions using (what boils down to) > {{HiveMetaStoreClient.listPartitions()}}. This can be slow and cumbersome, > given that {{Partition}} objects are heavyweight and carry redundant > information. The alternative is to use partition-names, which will need > client-side parsing to extract part-key values. > When checking which hourly partitions for a particular day have been > published already, it would be preferable to have an API that pushed down > part-key extraction into the {{RawStore}} layer, and returned key-values as > the result. This would be similar to how {{SELECT DISTINCT part_key FROM > my_table;}} would run, but at the {{HiveMetaStoreClient}} level. > Here's what we've been using at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17466) Metastore API to list unique partition-key-value combinations
[ https://issues.apache.org/jira/browse/HIVE-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17466: Attachment: HIVE-17466.1.patch > Metastore API to list unique partition-key-value combinations > - > > Key: HIVE-17466 > URL: https://issues.apache.org/jira/browse/HIVE-17466 > Project: Hive > Issue Type: New Feature > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-17466.1.patch > > > Raising this on behalf of [~thiruvel], who wrote this initially as part of a > tangential "data-discovery" system. > Programs like Apache Oozie, Apache Falcon (or Yahoo GDM), etc. launch > workflows based on the availability of table/partitions. Partitions are > currently discovered by listing partitions using (what boils down to) > {{HiveMetaStoreClient.listPartitions()}}. This can be slow and cumbersome, > given that {{Partition}} objects are heavyweight and carry redundant > information. The alternative is to use partition-names, which will need > client-side parsing to extract part-key values. > When checking which hourly partitions for a particular day have been > published already, it would be preferable to have an API that pushed down > part-key extraction into the {{RawStore}} layer, and returned key-values as > the result. This would be similar to how {{SELECT DISTINCT part_key FROM > my_table;}} would run, but at the {{HiveMetaStoreClient}} level. > Here's what we've been using at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17466) Metastore API to list unique partition-key-value combinations
[ https://issues.apache.org/jira/browse/HIVE-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17466: Status: Open (was: Patch Available) > Metastore API to list unique partition-key-value combinations > - > > Key: HIVE-17466 > URL: https://issues.apache.org/jira/browse/HIVE-17466 > Project: Hive > Issue Type: New Feature > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-17466.1.patch > > > Raising this on behalf of [~thiruvel], who wrote this initially as part of a > tangential "data-discovery" system. > Programs like Apache Oozie, Apache Falcon (or Yahoo GDM), etc. launch > workflows based on the availability of table/partitions. Partitions are > currently discovered by listing partitions using (what boils down to) > {{HiveMetaStoreClient.listPartitions()}}. This can be slow and cumbersome, > given that {{Partition}} objects are heavyweight and carry redundant > information. The alternative is to use partition-names, which will need > client-side parsing to extract part-key values. > When checking which hourly partitions for a particular day have been > published already, it would be preferable to have an API that pushed down > part-key extraction into the {{RawStore}} layer, and returned key-values as > the result. This would be similar to how {{SELECT DISTINCT part_key FROM > my_table;}} would run, but at the {{HiveMetaStoreClient}} level. > Here's what we've been using at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
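The client-side workaround alluded to in the description above, listing partition names and parsing the part-key values out of them, can be sketched as follows. This is an illustrative sketch only, not part of the proposed patch: the `key=value/key=value` partition-name layout is assumed (escaping of special characters in values is ignored), and the `DistinctPartKeys`/`distinctValues` names are hypothetical. The proposed API would push this extraction down into the {{RawStore}} layer instead of doing it on the client.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: emulate "SELECT DISTINCT part_key" on the client,
// given partition names of the form "dt=2017-09-07/hour=03" as returned
// by HiveMetaStoreClient.listPartitionNames().
public class DistinctPartKeys {

    static Set<String> distinctValues(List<String> partitionNames, String partKey) {
        Set<String> values = new TreeSet<>();
        for (String name : partitionNames) {
            for (String kv : name.split("/")) {
                int eq = kv.indexOf('=');
                if (eq > 0 && kv.substring(0, eq).equals(partKey)) {
                    values.add(kv.substring(eq + 1)); // the part-key value
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "dt=2017-09-06/hour=23",
                "dt=2017-09-07/hour=00",
                "dt=2017-09-07/hour=01");
        // Which days have published partitions?
        System.out.println(distinctValues(names, "dt")); // prints [2017-09-06, 2017-09-07]
    }
}
```

This is exactly the "client-side parsing" cost the description wants to avoid: every partition name crosses the wire even though only the distinct day values are needed.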
[jira] [Comment Edited] (HIVE-17473) Hive WM: implement workload management pools
[ https://issues.apache.org/jira/browse/HIVE-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156345#comment-16156345 ] Sergey Shelukhin edited comment on HIVE-17473 at 9/7/17 2:37 AM: - On top of the low level WM patch. Needs tests, and also to use real workload management schema instead of dummy classes eventually. cc [~prasanth_j] for reference this is what I'm using for now. WorkloadManager has some dummy classes I'm using where needed. was (Author: sershe): On top of the low level WM patch. > Hive WM: implement workload management pools > > > Key: HIVE-17473 > URL: https://issues.apache.org/jira/browse/HIVE-17473 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17473.WIP.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17473) Hive WM: implement workload management pools
[ https://issues.apache.org/jira/browse/HIVE-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156346#comment-16156346 ] Sergey Shelukhin commented on HIVE-17473: - Hmm, I realized we still don't have an umbrella and one pager. I was intending to create them after HIVE-17386 patch. Will do tomorrow. > Hive WM: implement workload management pools > > > Key: HIVE-17473 > URL: https://issues.apache.org/jira/browse/HIVE-17473 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17473.WIP.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17473) Hive WM: implement workload management pools
[ https://issues.apache.org/jira/browse/HIVE-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17473: Attachment: HIVE-17473.WIP.patch On top of the low level WM patch. > Hive WM: implement workload management pools > > > Key: HIVE-17473 > URL: https://issues.apache.org/jira/browse/HIVE-17473 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17473.WIP.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17473) Hive WM: implement workload management pools
[ https://issues.apache.org/jira/browse/HIVE-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17473: --- > Hive WM: implement workload management pools > > > Key: HIVE-17473 > URL: https://issues.apache.org/jira/browse/HIVE-17473 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156319#comment-16156319 ] Sergey Shelukhin commented on HIVE-17468: - Hmm.. wouldn't this in turn break the calcite dependency, especially since it's using the newer version? > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-17468.patch, hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion since this will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and the version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the druid jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17466) Metastore API to list unique partition-key-value combinations
[ https://issues.apache.org/jira/browse/HIVE-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156318#comment-16156318 ] Hive QA commented on HIVE-17466: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885664/HIVE-17466.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6701/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6701/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6701/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-09-07 02:00:42.308 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-6701/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-09-07 02:00:42.311 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 849fa02 HIVE-17455: External LLAP client: connection to HS2 should be kept open until explicitly closed (Jason Dere, reviewed by Sergey Shelukhin) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 849fa02 HIVE-17455: External LLAP client: connection to HS2 should be kept open until explicitly closed (Jason Dere, reviewed by Sergey Shelukhin) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-09-07 02:00:47.758 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p1 patching file metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java patching file metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java patching file metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java patching file standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp patching file standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 
patching file standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp patching file standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp patching file standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h patching file standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AbortTxnsRequest.java patching file standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/AddDynamicPartitions.java patching file standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ClearFileMetadataRequest.java patching file standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ClientCapabilities.java patching file standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CompactionRequest.java patching file standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FireEventRequest.java patching file standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Function.java patching file standalone-metastore/src/gen/thrift/gen-javabe
[jira] [Commented] (HIVE-17459) View deletion operation failed to replicate on target cluster
[ https://issues.apache.org/jira/browse/HIVE-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156316#comment-16156316 ] Hive QA commented on HIVE-17459: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885660/HIVE-17459.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11028 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[drop_table_failure2] (batchId=89) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6700/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6700/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6700/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12885660 - PreCommit-HIVE-Build > View deletion operation failed to replicate on target cluster > - > > Key: HIVE-17459 > URL: https://issues.apache.org/jira/browse/HIVE-17459 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-17459.1.patch > > > View dropping is not replicated during incremental repl. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17414) HoS DPP + Vectorization generates invalid explain plan due to CombineEquivalentWorkResolver
[ https://issues.apache.org/jira/browse/HIVE-17414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156308#comment-16156308 ] liyunzhang_intel commented on HIVE-17414: - thanks for [~lirui] and [~stakiar]'s review > HoS DPP + Vectorization generates invalid explain plan due to > CombineEquivalentWorkResolver > --- > > Key: HIVE-17414 > URL: https://issues.apache.org/jira/browse/HIVE-17414 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Sahil Takiar >Assignee: liyunzhang_intel > Fix For: 3.0.0 > > Attachments: HIVE-17414.1.patch, HIVE-17414.2.patch, > HIVE-17414.3.patch, HIVE-17414.4.patch, HIVE-17414.5.patch, HIVE-17414.patch > > > Similar to HIVE-16948, the following query generates an invalid explain plan > when HoS DPP is enabled + vectorization: > {code:sql} > select ds from (select distinct(ds) as ds from srcpart union all select > distinct(ds) as ds from srcpart) s where s.ds in (select max(srcpart.ds) from > srcpart union all select min(srcpart.ds) from srcpart) > {code} > Explain Plan: > {code} > STAGE DEPENDENCIES: > Stage-2 is a root stage > Stage-1 depends on stages: Stage-2 > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-2 > Spark > Edges: > Reducer 11 <- Map 10 (GROUP, 1) > Reducer 13 <- Map 12 (GROUP, 1) > A masked pattern was here > Vertices: > Map 10 > Map Operator Tree: > TableScan > alias: srcpart > Statistics: Num rows: 2000 Data size: 21248 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: ds (type: string) > outputColumnNames: ds > Statistics: Num rows: 2000 Data size: 21248 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > aggregations: max(ds) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: 
string) > Execution mode: vectorized > Map 12 > Map Operator Tree: > TableScan > alias: srcpart > Statistics: Num rows: 2000 Data size: 21248 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: ds (type: string) > outputColumnNames: ds > Statistics: Num rows: 2000 Data size: 21248 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > aggregations: min(ds) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string) > Execution mode: vectorized > Reducer 11 > Execution mode: vectorized > Reduce Operator Tree: > Group By Operator > aggregations: max(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: _col0 is not null (type: boolean) > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > keys: _col0 (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: _col0 (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > keys: _col0 (type: string) > mode: hash > outputColu
[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
[ https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156286#comment-16156286 ] Junjie Chen commented on HIVE-17261: Thanks, [~Ferd]. As for 4: since jobconf is a member variable, it doesn't need to be passed explicitly. > Hive use deprecated ParquetInputSplit constructor which blocked parquet > dictionary filter > - > > Key: HIVE-17261 > URL: https://issues.apache.org/jira/browse/HIVE-17261 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Junjie Chen >Assignee: Junjie Chen > Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, > HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.diff, HIVE-17261.patch > > > Hive uses the deprecated ParquetInputSplit in > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128] > Please see the interface definition in > [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80] > The old interface sets rowGroupOffsets values, which leads to skipping the dictionary > filter in parquet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
[ https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated HIVE-17261: --- Attachment: HIVE-17261.5.patch > Hive use deprecated ParquetInputSplit constructor which blocked parquet > dictionary filter > - > > Key: HIVE-17261 > URL: https://issues.apache.org/jira/browse/HIVE-17261 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Junjie Chen >Assignee: Junjie Chen > Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, > HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.diff, HIVE-17261.patch > > > Hive uses the deprecated ParquetInputSplit in > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128] > Please see the interface definition in > [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80] > The old interface sets rowGroupOffsets values, which leads to skipping the dictionary > filter in parquet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17472) Drop-partition for multi-level partition fails, if data does not exist.
[ https://issues.apache.org/jira/browse/HIVE-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17472: Status: Patch Available (was: Open) > Drop-partition for multi-level partition fails, if data does not exist. > --- > > Key: HIVE-17472 > URL: https://issues.apache.org/jira/browse/HIVE-17472 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17472.1.patch, HIVE-17472.2.patch > > > Raising this on behalf of [~cdrome] and [~selinazh]. > Here's how to reproduce the problem: > {code:sql} > CREATE TABLE foobar ( foo STRING, bar STRING ) PARTITIONED BY ( dt STRING, > region STRING ) STORED AS RCFILE LOCATION '/tmp/foobar'; > ALTER TABLE foobar ADD PARTITION ( dt='1', region='A' ) ; > dfs -rm -R -skipTrash /tmp/foobar/dt=1; > ALTER TABLE foobar DROP PARTITION ( dt='1' ); > {code} > This causes a client-side error as follows: > {code} > 15/02/26 23:08:32 ERROR exec.DDLTask: > org.apache.hadoop.hive.ql.metadata.HiveException: Unknown error. Please check > logs. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17472) Drop-partition for multi-level partition fails, if data does not exist.
[ https://issues.apache.org/jira/browse/HIVE-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17472: Attachment: HIVE-17472.2.patch And now, with tests. > Drop-partition for multi-level partition fails, if data does not exist. > --- > > Key: HIVE-17472 > URL: https://issues.apache.org/jira/browse/HIVE-17472 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17472.1.patch, HIVE-17472.2.patch > > > Raising this on behalf of [~cdrome] and [~selinazh]. > Here's how to reproduce the problem: > {code:sql} > CREATE TABLE foobar ( foo STRING, bar STRING ) PARTITIONED BY ( dt STRING, > region STRING ) STORED AS RCFILE LOCATION '/tmp/foobar'; > ALTER TABLE foobar ADD PARTITION ( dt='1', region='A' ) ; > dfs -rm -R -skipTrash /tmp/foobar/dt=1; > ALTER TABLE foobar DROP PARTITION ( dt='1' ); > {code} > This causes a client-side error as follows: > {code} > 15/02/26 23:08:32 ERROR exec.DDLTask: > org.apache.hadoop.hive.ql.metadata.HiveException: Unknown error. Please check > logs. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17472) Drop-partition for multi-level partition fails, if data does not exist.
[ https://issues.apache.org/jira/browse/HIVE-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17472: Status: Open (was: Patch Available) > Drop-partition for multi-level partition fails, if data does not exist. > --- > > Key: HIVE-17472 > URL: https://issues.apache.org/jira/browse/HIVE-17472 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17472.1.patch > > > Raising this on behalf of [~cdrome] and [~selinazh]. > Here's how to reproduce the problem: > {code:sql} > CREATE TABLE foobar ( foo STRING, bar STRING ) PARTITIONED BY ( dt STRING, > region STRING ) STORED AS RCFILE LOCATION '/tmp/foobar'; > ALTER TABLE foobar ADD PARTITION ( dt='1', region='A' ) ; > dfs -rm -R -skipTrash /tmp/foobar/dt=1; > ALTER TABLE foobar DROP PARTITION ( dt='1' ); > {code} > This causes a client-side error as follows: > {code} > 15/02/26 23:08:32 ERROR exec.DDLTask: > org.apache.hadoop.hive.ql.metadata.HiveException: Unknown error. Please check > logs. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17152) Improve security of random generator for HS2 cookies
[ https://issues.apache.org/jira/browse/HIVE-17152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156259#comment-16156259 ] Thejas M Nair commented on HIVE-17152: -- +1 > Improve security of random generator for HS2 cookies > > > Key: HIVE-17152 > URL: https://issues.apache.org/jira/browse/HIVE-17152 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-17152.1.patch > > > The random number generated is used as a secret to append to a sequence and > SHA to implement a CookieSigner. If this is attackable, then it's possible > for an attacker to sign a cookie as if we had. We should fix this and use > SecureRandom as a stronger random function . > HTTPAuthUtils has a similar issue. If that is attackable, an attacker might > be able to create a similar cookie. Paired with the above issue with the > CookieSigner, it could reasonably spoof a HS2 cookie. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
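The gist of the fix described above, deriving the signing secret from {{SecureRandom}} rather than a predictable generator, can be illustrated with a minimal, self-contained sketch. This is not HS2's actual {{CookieSigner}} code: the class and method names below are invented for illustration, and the hash construction (SHA-256 over value plus secret) is only an approximation of the scheme the description outlines.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Illustrative sketch (not HS2's CookieSigner): sign a cookie value with
// SHA-256 over (value + secret), where the secret comes from SecureRandom
// instead of a predictable java.util.Random sequence.
public class CookieSignerSketch {
    private final byte[] secret = new byte[32];

    public CookieSignerSketch() {
        new SecureRandom().nextBytes(secret); // cryptographically strong secret
    }

    public String sign(String value) throws Exception {
        MessageDigest sha = MessageDigest.getInstance("SHA-256");
        sha.update(value.getBytes(StandardCharsets.UTF_8));
        sha.update(secret);
        return value + "&s=" + Base64.getEncoder().encodeToString(sha.digest());
    }

    public boolean verify(String signed) throws Exception {
        int idx = signed.lastIndexOf("&s=");
        if (idx < 0) return false;
        byte[] expected = sign(signed.substring(0, idx)).getBytes(StandardCharsets.UTF_8);
        // MessageDigest.isEqual performs a constant-time comparison
        return MessageDigest.isEqual(expected, signed.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        CookieSignerSketch signer = new CookieSignerSketch();
        String signed = signer.sign("hs2-session=abc123");
        System.out.println(signer.verify(signed));             // true
        System.out.println(signer.verify("hs2-session=evil"
                + signed.substring(signed.lastIndexOf("&s=")))); // false
    }
}
```

The point of the JIRA is the first line of the constructor: if the secret were drawn from an attackable generator, an adversary who reconstructs it can compute the same digest and forge a valid cookie.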
[jira] [Commented] (HIVE-17450) rename TestTxnCommandsBase
[ https://issues.apache.org/jira/browse/HIVE-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156249#comment-16156249 ] Hive QA commented on HIVE-17450: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885655/HIVE-17450.02.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11014 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234) org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver (batchId=103) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6699/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6699/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6699/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12885655 - PreCommit-HIVE-Build > rename TestTxnCommandsBase > --- > > Key: HIVE-17450 > URL: https://issues.apache.org/jira/browse/HIVE-17450 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Peter Vary > Attachments: HIVE-17450.02.patch, HIVE-17450.patch > > > TestTxnCommandsBase is an abstract class, added in HIVE-17205; it matches the > maven test pattern...because of that there is a failing test in every test > output -- This message was sent by Atlassian JIRA (v6.4.14#64029)
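For context: Maven Surefire's default include patterns (e.g. {{**/Test*.java}}) pick up any class whose name starts with {{Test}}, so an abstract base class named {{TestTxnCommandsBase}} gets scheduled as a test, which is why, per the description, a failure shows up in every run. The patch fixes this by renaming the class; the alternative, shown here as an illustrative pom fragment that is not taken from the patch, is an explicit exclusion:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <excludes>
      <!-- abstract base class, not a runnable test -->
      <exclude>**/TestTxnCommandsBase.java</exclude>
    </excludes>
  </configuration>
</plugin>
```

Renaming is usually the cleaner choice, since it avoids special-casing the build configuration for one class.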
[jira] [Updated] (HIVE-17448) ArrayIndexOutOfBoundsException on ORC tables after adding a struct field
[ https://issues.apache.org/jira/browse/HIVE-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikolay Sokolov updated HIVE-17448: --- Description: When ORC files have been created with older schema, which had smaller set of struct fields, and schema have been changed to one with more struct fields, and there are sibling fields of struct type going after struct itself, ArrayIndexOutOfBoundsException is being thrown. Steps to reproduce: {code:none} create external table test_broken_struct(a struct, b int) stored as orc; insert into table test_broken_struct select named_struct("f1", 1, "f2", 2), 3; drop table test_broken_struct; create external table test_broken_struct(a struct, b int) stored as orc; select * from test_broken_struct; {code} Same scenario is not causing crash on hive 0.14. Debug log and stack trace: {code:none} 2017-09-07T00:21:40,266 INFO [main] orc.OrcInputFormat: Using schema evolution configuration variables schema.evol ution.columns [a, b] / schema.evolution.columns.types [struct, int] (isAcidRead false) 2017-09-07T00:21:40,267 DEBUG [main] orc.OrcInputFormat: No ORC pushdown predicate 2017-09-07T00:21:40,267 INFO [main] orc.ReaderImpl: Reading ORC rows from hdfs://cluster-7199-m/user/hive/warehous e/test_broken_struct/00_0 with {include: [true, true, true, true, true], offset: 3, length: 159, schema: struct ,b:int>} Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5 2017-09-07T00:21:40,273 ERROR [main] CliDriver: Failed with exception java.io.IOException:java.lang.ArrayIndexOutOf BoundsException: 5 java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 5 at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2098) at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.lang.ArrayIndexOutOfBoundsException: 5 at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:195) at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:253) at org.apache.orc.impl.SchemaEvolution.&lt;init&gt;(SchemaEvolution.java:59) at org.apache.orc.impl.RecordReaderImpl.&lt;init&gt;(RecordReaderImpl.java:149) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.&lt;init&gt;(RecordReaderImpl.java:63) at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:87) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:314) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.&lt;init&gt;(OrcInputFormat.java:225) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1691) at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:695) at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:333) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459) ... 
15 more {code} was: When ORC files have been created with an older schema that had a smaller set of struct fields, the schema has since been changed to one with more struct fields, and there are sibling fields going after the struct itself, an ArrayIndexOutOfBoundsException is thrown. Steps to reproduce: {code:none} create external table test_broken_struct(a struct, b int); insert into table test_broken_struct select named_struct("f1", 1, "f2", 2), 3; drop table test_broken_struct; create external table test_broken_struct(a struct, b int); select * from test_broken_struct; {code} The same scenario does not cause a crash on Hive 0.14. > ArrayIndexOutOfBoundsException on ORC tables after adding a struct field > --
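The index arithmetic behind the exception above can be illustrated with a toy model of ORC's pre-order type numbering. This is a sketch only: the `Node` class, the field names, and the extra field `f3` are illustrative assumptions, and the real logic lives in `org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray`. Every type in an ORC schema, including the root struct, gets an id in depth-first order, so adding a field inside struct `a` shifts the id of its sibling `b` past the end of the file's type array:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrcTypeIdShift {
    // Illustrative schema node: a leaf column or a struct with children.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) children.add(k);
        }
    }

    // Depth-first pre-order numbering, mimicking ORC type ids.
    static Map<String, Integer> preorderIds(Node root) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        walk(root, ids);
        return ids;
    }

    private static void walk(Node n, Map<String, Integer> ids) {
        ids.put(n.name, ids.size());
        for (Node c : n.children) walk(c, ids);
    }

    public static void main(String[] args) {
        // File schema root(a(f1, f2), b): 5 type ids, 0..4, so b = 4.
        Node fileSchema = new Node("root",
                new Node("a", new Node("f1"), new Node("f2")), new Node("b"));
        // Reader schema after the struct gained a (hypothetical) field f3:
        // b shifts to id 5, one past the file's 5-entry type array -> AIOOBE: 5.
        Node readerSchema = new Node("root",
                new Node("a", new Node("f1"), new Node("f2"), new Node("f3")),
                new Node("b"));
        System.out.println(preorderIds(fileSchema).get("b"));   // 4
        System.out.println(preorderIds(readerSchema).get("b")); // 5
    }
}
```

The `5` in the last line matches the `ArrayIndexOutOfBoundsException: 5` in the stack trace above, which is consistent with the new schema having exactly one extra struct field.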
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156233#comment-16156233 ] Matt McCline commented on HIVE-17460: - I don't think this is right -- you will end up with upset customers because query results will be different. Unfortunately, the current semantics of adding a column are that the default behavior is RESTRICT, not CASCADE. RESTRICT means the partition schemas do not get updated with the new columns. Thus, the new columns default to NULL when queried. In order to get the behavior you are talking about, you would need to specify the CASCADE option. So I'm a -1 on this change. [~wzheng] > `insert overwrite` should support table schema evolution (e.g. add columns) > --- > > Key: HIVE-17460 > URL: https://issues.apache.org/jira/browse/HIVE-17460 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0, 2.2.0 >Reporter: Chaozhong Yang >Assignee: Chaozhong Yang > Fix For: 3.0.0 > > Attachments: HIVE-17460.2.patch, HIVE-17460.patch > > > In Hive, adding columns to an existing table is a common use case. However, if > we insert overwrite older partitions after adding columns, the added columns will > not be populated. > ``` > create table src_table( > i int > ) > PARTITIONED BY (`date` string); > insert overwrite table src_table partition(`date`='20170905') values (3); > select * from src_table where `date` = '20170905'; > alter table src_table add columns (bi bigint); > insert overwrite table src_table partition(`date`='20170905') values (3, 5); > select * from src_table where `date` = '20170905'; > ``` > The result will be as follows: > ``` > 3, NULL, '20170905' > ``` > Obviously, this doesn't meet our expectations. The expected result should be: > ``` > 3, 5, '20170905' > ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
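The RESTRICT-vs-CASCADE behavior described in the comment above can be sketched with a heavily simplified toy model. This is an assumption-laden illustration, not Hive's real implementation (that lives in the metastore): the class, method names, and the string-based "row" are all hypothetical, and it only shows why a column added with the default RESTRICT reads back as NULL for existing partitions:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AddColumnSemantics {
    // Toy model: the table schema, plus a per-partition copy of the schema.
    final List<String> tableCols = new ArrayList<>();
    final Map<String, List<String>> partitionCols = new HashMap<>();

    AddColumnSemantics(String partition, String... cols) {
        tableCols.addAll(Arrays.asList(cols));
        partitionCols.put(partition, new ArrayList<>(tableCols));
    }

    // RESTRICT (cascade == false, the default): only the table schema grows;
    // CASCADE also updates every existing partition schema.
    void addColumn(String col, boolean cascade) {
        tableCols.add(col);
        if (cascade) {
            for (List<String> cols : partitionCols.values()) cols.add(col);
        }
    }

    // Reading a partition: columns missing from the partition schema are NULL,
    // even if a value was physically written for them.
    List<String> read(String partition, Map<String, String> storedRow) {
        List<String> row = new ArrayList<>();
        List<String> visible = partitionCols.get(partition);
        for (String col : tableCols) {
            row.add(visible.contains(col) ? storedRow.get(col) : null);
        }
        return row;
    }

    public static void main(String[] args) {
        AddColumnSemantics t = new AddColumnSemantics("date=20170905", "i");
        t.addColumn("bi", false); // RESTRICT: partition schema unchanged
        // (3, 5) was written, but "bi" reads back as NULL for this partition.
        System.out.println(t.read("date=20170905", Map.of("i", "3", "bi", "5")));
    }
}
```

With `addColumn("bi", true)` (the CASCADE path) the same read would return both values, which mirrors the suggestion to use `ALTER TABLE ... ADD COLUMNS ... CASCADE` instead of changing the `insert overwrite` semantics.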
[jira] [Commented] (HIVE-17421) Clear incorrect stats after replication
[ https://issues.apache.org/jira/browse/HIVE-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156231#comment-16156231 ] Daniel Dai commented on HIVE-17421: --- [~anishek], can you review? > Clear incorrect stats after replication > --- > > Key: HIVE-17421 > URL: https://issues.apache.org/jira/browse/HIVE-17421 > Project: Hive > Issue Type: Bug > Components: repl >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-17421.1.patch, HIVE-17421.2.patch > > > After replication, some stats summaries are incorrect. If > hive.compute.query.using.stats is set to true, we will get wrong results on the > destination side. > This will not happen with bootstrap replication, because the stats > summaries are in table properties and will be replicated to the destination. > However, in incremental replication, this won't work. When the table is created, > the stats summaries are empty (e.g., numRows=0). Later, when we insert data, the stats > summaries are updated with > update_table_column_statistics/update_partition_column_statistics; however, > neither event is captured in incremental replication. Thus, on the > destination side, we will get count\(*\)=0. The simple solution is to remove the > COLUMN_STATS_ACCURATE property after incremental replication. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
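The "simple solution" described above can be sketched as follows. This is a minimal illustration under stated assumptions: table properties are modeled as a plain `Map<String, String>` and the helper name is hypothetical; the real fix operates on the metastore `Table` object in the patch attached to HIVE-17421:

```java
import java.util.HashMap;
import java.util.Map;

public class StatsReset {
    // Hypothetical helper: strip the accuracy marker so the destination side
    // stops trusting stale summary stats (e.g. numRows=0) and falls back to
    // scanning the data for count(*)-style queries.
    static Map<String, String> clearStatsAccuracy(Map<String, String> tableParams) {
        Map<String, String> cleaned = new HashMap<>(tableParams);
        cleaned.remove("COLUMN_STATS_ACCURATE");
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("COLUMN_STATS_ACCURATE", "{\"BASIC_STATS\":\"true\"}");
        params.put("numRows", "0");
        // After the cleanup, hive.compute.query.using.stats can no longer
        // short-circuit count(*) to the stale numRows value.
        System.out.println(clearStatsAccuracy(params).containsKey("COLUMN_STATS_ACCURATE"));
    }
}
```

Note the stale `numRows` itself is left in place; removing only `COLUMN_STATS_ACCURATE` is enough because Hive only uses the summary when that flag marks it as accurate.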
[jira] [Commented] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156228#comment-16156228 ] slim bouguerra commented on HIVE-17468: --- It was pulling random stuff from transitive dependencies. bq. slim bouguerra, before we pulled the calcite-druid package, were we then packing the hive jackson dependencies into the uber jar? How was that ever working? > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-17468.patch, hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and its version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the druid jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17472) Drop-partition for multi-level partition fails, if data does not exist.
[ https://issues.apache.org/jira/browse/HIVE-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17472: Affects Version/s: 3.0.0 2.2.0 Status: Patch Available (was: Open) Submitting for tests. > Drop-partition for multi-level partition fails, if data does not exist. > --- > > Key: HIVE-17472 > URL: https://issues.apache.org/jira/browse/HIVE-17472 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17472.1.patch > > > Raising this on behalf of [~cdrome] and [~selinazh]. > Here's how to reproduce the problem: > {code:sql} > CREATE TABLE foobar ( foo STRING, bar STRING ) PARTITIONED BY ( dt STRING, > region STRING ) STORED AS RCFILE LOCATION '/tmp/foobar'; > ALTER TABLE foobar ADD PARTITION ( dt='1', region='A' ) ; > dfs -rm -R -skipTrash /tmp/foobar/dt=1; > ALTER TABLE foobar DROP PARTITION ( dt='1' ); > {code} > This causes a client-side error as follows: > {code} > 15/02/26 23:08:32 ERROR exec.DDLTask: > org.apache.hadoop.hive.ql.metadata.HiveException: Unknown error. Please check > logs. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17472) Drop-partition for multi-level partition fails, if data does not exist.
[ https://issues.apache.org/jira/browse/HIVE-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17472: Attachment: HIVE-17472.1.patch > Drop-partition for multi-level partition fails, if data does not exist. > --- > > Key: HIVE-17472 > URL: https://issues.apache.org/jira/browse/HIVE-17472 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17472.1.patch > > > Raising this on behalf of [~cdrome] and [~selinazh]. > Here's how to reproduce the problem: > {code:sql} > CREATE TABLE foobar ( foo STRING, bar STRING ) PARTITIONED BY ( dt STRING, > region STRING ) STORED AS RCFILE LOCATION '/tmp/foobar'; > ALTER TABLE foobar ADD PARTITION ( dt='1', region='A' ) ; > dfs -rm -R -skipTrash /tmp/foobar/dt=1; > ALTER TABLE foobar DROP PARTITION ( dt='1' ); > {code} > This causes a client-side error as follows: > {code} > 15/02/26 23:08:32 ERROR exec.DDLTask: > org.apache.hadoop.hive.ql.metadata.HiveException: Unknown error. Please check > logs. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17472) Drop-partition for multi-level partition fails, if data does not exist.
[ https://issues.apache.org/jira/browse/HIVE-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan reassigned HIVE-17472: --- > Drop-partition for multi-level partition fails, if data does not exist. > --- > > Key: HIVE-17472 > URL: https://issues.apache.org/jira/browse/HIVE-17472 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > > Raising this on behalf of [~cdrome] and [~selinazh]. > Here's how to reproduce the problem: > {code:sql} > CREATE TABLE foobar ( foo STRING, bar STRING ) PARTITIONED BY ( dt STRING, > region STRING ) STORED AS RCFILE LOCATION '/tmp/foobar'; > ALTER TABLE foobar ADD PARTITION ( dt='1', region='A' ) ; > dfs -rm -R -skipTrash /tmp/foobar/dt=1; > ALTER TABLE foobar DROP PARTITION ( dt='1' ); > {code} > This causes a client-side error as follows: > {code} > 15/02/26 23:08:32 ERROR exec.DDLTask: > org.apache.hadoop.hive.ql.metadata.HiveException: Unknown error. Please check > logs. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156184#comment-16156184 ] Hive QA commented on HIVE-17460: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885647/HIVE-17460.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11029 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_cascade] (batchId=84) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_5] (batchId=40) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat18] (batchId=69) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_complex] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_primitive] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_complex] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_primitive] (batchId=158) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part] (batchId=162) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_complex] (batchId=162) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_primitive] (batchId=158) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning (batchId=291) org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDecimalXY (batchId=183) org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteTimestamp (batchId=183) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6698/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6698/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6698/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 20 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12885647 - PreCommit-HIVE-Build > `insert overwrite` should support table schema evolution (e.g. add columns) > --- > > Key: HIVE-17460 > URL: https://issues.apache.org/jira/browse/HIVE-17460 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0, 2.2.0 >Reporter: Chaozhong Yang >Assignee: Chaozhong Yang > Fix For: 3.0.0 > > Attachments: HIVE-17460.2.patch, HIVE-17460.patch > > > In Hive, adding columns into original table is a common use case. However, if > we insert overwrite older partitions after adding columns, added columns will > not be accessed. 
> ``` > create table src_table( > i int > ) > PARTITIONED BY (`date` string); > insert overwrite table src_table partition(`date`='20170905') values (3); > select * from src_table where `date` = '20170905'; > alter table src_table add columns (bi bigint); > insert overwrite table src_table partition(`date`='20170905') values (3, 5); > select * from src_table where `date` = '20170905'; > ``` > The result will be as follows: > ``` > 3, NULL, '20170905' > ``` > Obviously, it doesn't meet our expectation. The expected result should be: > ``` > 3, 5, '20170905' > ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default
[ https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156174#comment-16156174 ] Sergey Shelukhin commented on HIVE-17471: - [~teddy.choi] should this setting be turned on? > Vectorization: Enable hive.vectorized.row.identifier.enabled to true by > default > --- > > Key: HIVE-17471 > URL: https://issues.apache.org/jira/browse/HIVE-17471 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Teddy Choi > > We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 > "Vectorization: Add infrastructure for vectorization of ROW__ID struct" > But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default
[ https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156173#comment-16156173 ] Sergey Shelukhin commented on HIVE-17471: - hmm > Vectorization: Enable hive.vectorized.row.identifier.enabled to true by > default > --- > > Key: HIVE-17471 > URL: https://issues.apache.org/jira/browse/HIVE-17471 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Teddy Choi > > We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 > "Vectorization: Add infrastructure for vectorization of ROW__ID struct" > But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156170#comment-16156170 ] Jesus Camacho Rodriguez edited comment on HIVE-17468 at 9/6/17 11:04 PM: - {quote} if we exclude all the druid stuff, then what is left is not what we want: it is going to be either the hive jackson ones or ones from another transitive dependency. {quote} [~bslim], before we pulled the calcite-druid package, were we then packing the hive jackson dependencies into the uber jar? How was that ever working? was (Author: jcamachorodriguez): {quote} if we exclude all the druid stuff, then what is left is not what we want: it is going to be either the hive jackson ones or ones from another transitive dependency. {quote} [~bslim], before we pulled the calcite-druid package, were we then packing the hive jackson dependencies into the uber jar? How was that working? > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-17468.patch, hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and its version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the druid jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default
[ https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-17471: --- > Vectorization: Enable hive.vectorized.row.identifier.enabled to true by > default > --- > > Key: HIVE-17471 > URL: https://issues.apache.org/jira/browse/HIVE-17471 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Teddy Choi > > We set it disabled in https://issues.apache.org/jira/browse/HIVE-17116 > "Vectorization: Add infrastructure for vectorization of ROW__ID struct" > But forgot to turn it on to true by default in Teddy's ACID ROW__ID work... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156170#comment-16156170 ] Jesus Camacho Rodriguez commented on HIVE-17468: {quote} if we exclude all the druid stuff, then what is left is not what we want: it is going to be either the hive jackson ones or ones from another transitive dependency. {quote} [~bslim], before we pulled the calcite-druid package, were we then packing the hive jackson dependencies into the uber jar? How was that working? > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-17468.patch, hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and its version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the druid jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17382) Change startsWith relation introduced in HIVE-17316
[ https://issues.apache.org/jira/browse/HIVE-17382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156167#comment-16156167 ] Lefty Leverenz commented on HIVE-17382: --- No-doc note: This renames two configuration parameters that were created by HIVE-16146. They are for internal use only, so no documentation is needed. * hive.in.test.short.logs -> hive.testing.short.logs * hive.in.test.remove.logs -> hive.testing.remove.logs > Change startsWith relation introduced in HIVE-17316 > --- > > Key: HIVE-17382 > URL: https://issues.apache.org/jira/browse/HIVE-17382 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Fix For: 3.0.0 > > Attachments: HIVE-17382.01.patch, HIVE-17382.02.patch, > HIVE-17382.03.patch, HIVE-17382.04.patch > > > In HiveConf the new name should be checked if it starts with a > restricted/hidden variable prefix and not vice-versa. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16146) If possible find a better way to filter the TestBeeLineDriver output
[ https://issues.apache.org/jira/browse/HIVE-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156168#comment-16156168 ] Lefty Leverenz commented on HIVE-16146: --- HIVE-17382 renames the configs (no doc needed, internal use only): * hive.in.test.short.logs -> hive.testing.short.logs * hive.in.test.remove.logs -> hive.testing.remove.logs > If possible find a better way to filter the TestBeeLineDriver output > > > Key: HIVE-16146 > URL: https://issues.apache.org/jira/browse/HIVE-16146 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Affects Versions: 2.2.0 >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-16146.02.patch, HIVE-16146.03.patch, > HIVE-16146.04.patch, HIVE-16146.05.patch, HIVE-16146.06.patch, > HIVE-16146.patch > > > Currently we apply a blacklist to filter the output of the BeeLine Qtest runs. > It might be a good idea to go thorough of the possibilities and find a better > way, if possible. > I think our main goal could be for the TestBeeLineDriver test output to match > the TestCliDriver output of the came query file. Or if it is not possible, > then at least a similar one > CC: [~vihangk1] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17366) Constraint replication in bootstrap
[ https://issues.apache.org/jira/browse/HIVE-17366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-17366: -- Attachment: HIVE-17366.2.patch Addressing Sankar's review comments. > Constraint replication in bootstrap > --- > > Key: HIVE-17366 > URL: https://issues.apache.org/jira/browse/HIVE-17366 > Project: Hive > Issue Type: New Feature > Components: repl >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-17366.1.patch, HIVE-17366.2.patch > > > Incremental constraint replication is tracked in HIVE-15705. This is to track > the bootstrap replication. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-13989) Extended ACLs are not handled according to specification
[ https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156150#comment-16156150 ] Chris Drome commented on HIVE-13989: Ran tests local before and after the patch on branch-2 and none of the failures appear to be attributable to the patch: || Test || branch-2 HEAD (b3a6e52) || branch-2 HEAD + HIVE-13989 || | org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] | PASSED | PASSED | | org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explaindenpendencydiffengs] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] | PASSED | PASSED | | org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure | FAILED | FAILED | | org.apache.hive.jdbc.TestJdbcDriver2.testYarnATSGuid | PASSED | PASSED | > Extended ACLs are not handled according to specification > > > Key: HIVE-13989 > URL: https://issues.apache.org/jira/browse/HIVE-13989 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 1.2.1, 2.0.0 >Reporter: Chris Drome >Assignee: Chris Drome > Attachments: HIVE-13989.1-branch-1.patch, HIVE-13989.1.patch, > HIVE-13989.4-branch-2.2.patch, HIVE-13989.4-branch-2.patch, > HIVE-13989-branch-1.patch, HIVE-13989-branch-2.2.patch, > HIVE-13989-branch-2.2.patch, HIVE-13989-branch-2.2.patch > > > Hive takes two approaches to working with extended ACLs depending on whether > data is being produced via a Hive query or HCatalog APIs. A Hive query will > run an FsShell command to recursively set the extended ACLs for a directory > sub-tree. HCatalog APIs will attempt to build up the directory sub-tree > programmatically and runs some code to set the ACLs to match the parent > directory. 
> Some incorrect assumptions were made when implementing the extended ACLs > support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the > design documents of extended ACLs in HDFS. These documents model the > implementation after the POSIX implementation on Linux, which can be found at > http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html. > The code for setting extended ACLs via HCatalog APIs is found in > HdfsUtils.java: > {code} > if (aclEnabled) { > aclStatus = sourceStatus.getAclStatus(); > if (aclStatus != null) { > LOG.trace(aclStatus.toString()); > aclEntries = aclStatus.getEntries(); > removeBaseAclEntries(aclEntries); > //the ACL api's also expect the tradition user/group/other permission > in the form of ACL > aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER, > sourcePerm.getUserAction())); > aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP, > sourcePerm.getGroupAction())); > aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER, > sourcePerm.getOtherAction())); > } > } > {code} > We found that DEFAULT extended ACL rules were not being inherited properly by > the directory sub-tree, so the above code is incomplete because it > effectively drops the DEFAULT rules. The second problem is with the call to > {{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended > ACLs. When extended ACLs are used the GROUP permission is replaced with the > extended ACL mask. So the above code will apply the wrong permissions to the > GROUP. Instead the correct GROUP permissions now need to be pulled from the > AclEntry as returned by {{getAclStatus().getEntries()}}. See the > implementation of the new method {{getDefaultAclEntries}} for details. > Similar issues exist with the HCatalog API. None of the API accounts for > setting extended ACLs on the directory sub-tree. 
The changes to the HCatalog > API allow the extended ACLs to be passed into the required methods similar to > how basic permissions are passed in. When building the directory sub-tree the > extended ACLs of the table directory are inherited by all sub-directories, > including the DEFAULT rules. > Replicating the problem: > Create a table to write data into (I will use acl_test as the destination and > words_text as the source) and set the ACLs as follows: > {noformat} > $ hdfs dfs -setfacl -m > default:user::rwx,default:group::r-x,default:mask::rwx,default:user:hdfs:rwx,group::r-x,user:hdfs:rwx > /user/cdrome/hive/acl_test > $ hdfs dfs -ls -d /user/cdrome/hive/acl_test > drwxrwx---+ - cdrome hdfs 0 2016-07-13 20:36 > /user/cdrome/hive/acl_test > $ hdfs dfs -getfacl -R /user/cdrome/hive/acl_test > # file
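The mask behavior described in this issue can be sketched with a toy model. This is an illustrative assumption-based sketch, not Hadoop's real `AclEntry`/`AclStatus` API: it only shows why, once extended ACLs are in play, the group permission must be read from the unnamed GROUP access entry rather than from `FsPermission`'s group bits (which then hold the mask), and why DEFAULT-scoped entries must be preserved for inheritance:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class AclGroupPerm {
    enum Scope { ACCESS, DEFAULT }
    enum Type { USER, GROUP, OTHER, MASK }

    static final class Entry {
        final Scope scope; final Type type; final String perm;
        Entry(Scope scope, Type type, String perm) {
            this.scope = scope; this.type = type; this.perm = perm;
        }
    }

    // With extended ACLs the "group" bits of the classic permission triple
    // actually hold the mask, so the real group permission must come from
    // the unnamed GROUP access entry.
    static String groupPerm(List<Entry> entries) {
        return entries.stream()
                .filter(e -> e.scope == Scope.ACCESS && e.type == Type.GROUP)
                .map(e -> e.perm)
                .findFirst().orElse(null);
    }

    // DEFAULT-scoped entries must be carried over so children inherit them;
    // dropping them is the first bug described above.
    static List<Entry> defaults(List<Entry> entries) {
        return entries.stream()
                .filter(e -> e.scope == Scope.DEFAULT)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry(Scope.ACCESS, Type.GROUP, "r-x"),
                new Entry(Scope.ACCESS, Type.MASK, "rwx"),
                new Entry(Scope.DEFAULT, Type.USER, "rwx"));
        // Naive FsPermission-based code would report "rwx" (the mask);
        // the correct group permission is "r-x".
        System.out.println(groupPerm(entries));
        System.out.println(defaults(entries).size());
    }
}
```

In the buggy `HdfsUtils` snippet quoted above, `sourcePerm.getGroupAction()` plays the role of the mask here, which is why the GROUP entry it builds carries the wrong permissions.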
[jira] [Comment Edited] (HIVE-13989) Extended ACLs are not handled according to specification
[ https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156150#comment-16156150 ] Chris Drome edited comment on HIVE-13989 at 9/6/17 10:46 PM: - Ran tests locally before and after the patch on branch-2 and none of the failures appear to be attributable to the patch: || Test || branch-2 HEAD (b3a6e52) || branch-2 HEAD + HIVE-13989 || | org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] | PASSED | PASSED | | org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explaindenpendencydiffengs] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] | PASSED | PASSED | | org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure | FAILED | FAILED | | org.apache.hive.jdbc.TestJdbcDriver2.testYarnATSGuid | PASSED | PASSED | was (Author: cdrome): Ran tests local before and after the patch on branch-2 and none of the failures appear to be attributable to the patch: || Test || branch-2 HEAD (b3a6e52) || branch-2 HEAD + HIVE-13989 || | org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] | PASSED | PASSED | | org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explaindenpendencydiffengs] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] | FAILED | FAILED | | org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] | PASSED | PASSED | | org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure | FAILED | FAILED | | org.apache.hive.jdbc.TestJdbcDriver2.testYarnATSGuid | PASSED | PASSED | > Extended ACLs are not handled according to specification > 
> > Key: HIVE-13989 > URL: https://issues.apache.org/jira/browse/HIVE-13989 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 1.2.1, 2.0.0 >Reporter: Chris Drome >Assignee: Chris Drome > Attachments: HIVE-13989.1-branch-1.patch, HIVE-13989.1.patch, > HIVE-13989.4-branch-2.2.patch, HIVE-13989.4-branch-2.patch, > HIVE-13989-branch-1.patch, HIVE-13989-branch-2.2.patch, > HIVE-13989-branch-2.2.patch, HIVE-13989-branch-2.2.patch > > > Hive takes two approaches to working with extended ACLs depending on whether > data is being produced via a Hive query or HCatalog APIs. A Hive query will > run an FsShell command to recursively set the extended ACLs for a directory > sub-tree. HCatalog APIs will attempt to build up the directory sub-tree > programmatically and runs some code to set the ACLs to match the parent > directory. > Some incorrect assumptions were made when implementing the extended ACLs > support. Refer to https://issues.apache.org/jira/browse/HDFS-4685 for the > design documents of extended ACLs in HDFS. These documents model the > implementation after the POSIX implementation on Linux, which can be found at > http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html. 
> The code for setting extended ACLs via HCatalog APIs is found in > HdfsUtils.java: > {code} > if (aclEnabled) { > aclStatus = sourceStatus.getAclStatus(); > if (aclStatus != null) { > LOG.trace(aclStatus.toString()); > aclEntries = aclStatus.getEntries(); > removeBaseAclEntries(aclEntries); > // the ACL APIs also expect the traditional user/group/other permissions > in the form of ACL entries > aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.USER, > sourcePerm.getUserAction())); > aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.GROUP, > sourcePerm.getGroupAction())); > aclEntries.add(newAclEntry(AclEntryScope.ACCESS, AclEntryType.OTHER, > sourcePerm.getOtherAction())); > } > } > {code} > We found that DEFAULT extended ACL rules were not being inherited properly by > the directory sub-tree, so the above code is incomplete because it > effectively drops the DEFAULT rules. The second problem is with the call to > {{sourcePerm.getGroupAction()}}, which is incorrect in the case of extended > ACLs. When extended ACLs are used, the GROUP permission is replaced with the > extended ACL mask, so the above code will apply the wrong permissions to the > GROUP. Instead, the correct GROUP permissions now need to be pulled from the > AclEntry as returned by {{getAclStatus().getEntries()}}. See the > implementation of the new method {{getDefaultAclEntries}} for details. > Similar issues exist with th
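The mask semantics described above are easy to get wrong, so here is a small, self-contained Java sketch of the rule — the entry encoding and helper names are illustrative, NOT Hive's HdfsUtils API: when extended ACLs are present, the classic group bits carry the ACL *mask*, so the real GROUP permission must come from the unnamed group ACL entry, and DEFAULT entries must be carried over so children inherit them.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained model of the rule above; entry strings and helper names are
// illustrative, NOT Hive's HdfsUtils API. Entry format: "scope:type:name:perm",
// e.g. "access:group::r-x" (empty name = the unnamed/owning-group entry).
class AclMaskDemo {
    // With extended ACLs, the classic group permission bits hold the ACL mask,
    // so the true GROUP permission comes from the unnamed "access:group" entry.
    static String effectiveGroupPerm(List<String> aclEntries, String classicGroupBits) {
        for (String e : aclEntries) {
            String[] p = e.split(":", -1);
            if (p[0].equals("access") && p[1].equals("group") && p[2].isEmpty()) {
                return p[3];
            }
        }
        return classicGroupBits; // no extended ACL: classic bits are authoritative
    }

    // DEFAULT entries must be copied to children, or inheritance is lost --
    // dropping them is exactly the first bug described in the report.
    static List<String> defaultEntries(List<String> aclEntries) {
        List<String> out = new ArrayList<>();
        for (String e : aclEntries) {
            if (e.startsWith("default:")) {
                out.add(e);
            }
        }
        return out;
    }
}
```

With entries {{access:group::r-x}} and {{access:mask::rw-}}, the classic group bits would read {{rw-}} (the mask), while the effective GROUP permission is {{r-x}} — which is why calling {{sourcePerm.getGroupAction()}} applies the wrong permission.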
[jira] [Commented] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156148#comment-16156148 ] slim bouguerra commented on HIVE-17468: --- [~jcamachorodriguez] I'm not sure I'm following your thought. If we exclude all the Druid stuff, then what is left is not what we want: it will be either Hive's jackson jars or ones from another transitive dependency. My take on this is that the libraries that are needed and brought in by Druid will be shaded anyway, so no Druid jackson will be on the classpath. Please let me know if that makes sense. > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-17468.patch, hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and the version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17470) eliminate potential vector copies when merging ACID deltas in LLAP IO path
[ https://issues.apache.org/jira/browse/HIVE-17470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17470: --- Assignee: (was: Sergey Shelukhin) > eliminate potential vector copies when merging ACID deltas in LLAP IO path > -- > > Key: HIVE-17470 > URL: https://issues.apache.org/jira/browse/HIVE-17470 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > See the comments on HIVE-12631. Probably LlapRecordReader should be able to > receive VRBs directly; that or ACID reader should be able to operate on > either CVB or VRB. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17470) eliminate potential vector copies when merging ACID deltas in LLAP IO path
[ https://issues.apache.org/jira/browse/HIVE-17470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17470: --- Assignee: Sergey Shelukhin > eliminate potential vector copies when merging ACID deltas in LLAP IO path > -- > > Key: HIVE-17470 > URL: https://issues.apache.org/jira/browse/HIVE-17470 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > See the comments on HIVE-12631. Probably LlapRecordReader should be able to > receive VRBs directly; that or ACID reader should be able to operate on > either CVB or VRB. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables
[ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12631: Attachment: HIVE-12631.27.patch Updated the patch. Not sure why the config was set in UpdateDeleteSemanticAnalyzer, so I commented it out for now. I looked a bit at the CVB-VRB-CVB-VRB conversion; given that handling a selected vector after the ACID reader requires copying data, it doesn't seem ideal. Can be handled in a followup. Either a selected vector can be added to CVB and the ACID merger made to operate on both (the code is common between the two), or LLAPRecordReader can be enabled to accept VRBs directly. > LLAP: support ORC ACID tables > - > > Key: HIVE-12631 > URL: https://issues.apache.org/jira/browse/HIVE-12631 > Project: Hive > Issue Type: Bug > Components: llap, Transactions >Reporter: Sergey Shelukhin >Assignee: Teddy Choi > Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, > HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, > HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch, > HIVE-12631.17.patch, HIVE-12631.18.patch, HIVE-12631.19.patch, > HIVE-12631.1.patch, HIVE-12631.20.patch, HIVE-12631.21.patch, > HIVE-12631.22.patch, HIVE-12631.23.patch, HIVE-12631.24.patch, > HIVE-12631.25.patch, HIVE-12631.26.patch, HIVE-12631.27.patch, > HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch, > HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, > HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch > > > LLAP uses a completely separate read path in ORC to allow for caching and > parallelization of reads and processing. This path does not support ACID. As > far as I remember, ACID logic is embedded inside the ORC format; we need to > refactor it to be on top of some interface, if practical, or just port it to > the LLAP read path. > Another consideration is how the logic will work with the cache. 
The cache is > currently low-level (CB-level in ORC), so we could just use it to read bases > and deltas (deltas should be cached with higher priority) and merge as usual. > We could also cache merged representation in future. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-17469) The HiveMetaStoreClient should randomize the connection to HMS HA
[ https://issues.apache.org/jira/browse/HIVE-17469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña resolved HIVE-17469. Resolution: Invalid > The HiveMetaStoreClient should randomize the connection to HMS HA > - > > Key: HIVE-17469 > URL: https://issues.apache.org/jira/browse/HIVE-17469 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.1 >Reporter: Sergio Peña > > In an environment with multiple HMS servers, the HiveMetaStoreClient class > selects the first URI to connect to on every open() call. We should > randomize that connection to help balance load across the HMS servers. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17469) The HiveMetaStoreClient should randomize the connection to HMS HA
[ https://issues.apache.org/jira/browse/HIVE-17469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156115#comment-16156115 ] Sergio Peña commented on HIVE-17469: Ah, thanks, I was looking at old code. > The HiveMetaStoreClient should randomize the connection to HMS HA > - > > Key: HIVE-17469 > URL: https://issues.apache.org/jira/browse/HIVE-17469 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.1 >Reporter: Sergio Peña > > In an environment with multiple HMS servers, the HiveMetaStoreClient class > selects the first URI to connect to on every open() call. We should > randomize that connection to help balance load across the HMS servers. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17469) The HiveMetaStoreClient should randomize the connection to HMS HA
[ https://issues.apache.org/jira/browse/HIVE-17469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156112#comment-16156112 ] Vihang Karajgaonkar commented on HIVE-17469: [~spena] isn't it already doing it here? https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java#L207 > The HiveMetaStoreClient should randomize the connection to HMS HA > - > > Key: HIVE-17469 > URL: https://issues.apache.org/jira/browse/HIVE-17469 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.1 >Reporter: Sergio Peña > > In an environment with multiple HMS servers, the HiveMetaStoreClient class > selects the first URI to connect to on every open() call. We should > randomize that connection to help balance load across the HMS servers. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
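For reference, the randomization discussed here boils down to shuffling the configured metastore URI list once per client, so each client starts its connection attempts at a random server. A minimal, self-contained sketch of that idea follows; the class and method names are illustrative, not HiveMetaStoreClient's actual fields or API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Minimal sketch of the load-balancing idea: parse the comma-separated URI
// list and shuffle it once, so connection attempts start at a random server.
// Names are illustrative, not HiveMetaStoreClient's actual fields.
class UriShuffle {
    static List<String> shuffledUris(String commaSeparatedUris, long seed) {
        List<String> uris = new ArrayList<>();
        for (String u : commaSeparatedUris.split(",")) {
            uris.add(u.trim());
        }
        // Seeded only so the sketch is reproducible; production code would
        // simply call Collections.shuffle(uris) with the default Random.
        Collections.shuffle(uris, new Random(seed));
        return uris;
    }
}
```

The shuffle keeps failover intact: the client still walks the whole list on connection failure, it just no longer always starts at the first entry.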
[jira] [Commented] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156105#comment-16156105 ] Jesus Camacho Rodriguez commented on HIVE-17468: [~bslim], for this fix, wouldn't it be enough to exclude only those coming from {{calcite-druid}}? The three jackson dependencies that we exclude are included in the uber jar via shading/renaming; as far as I remember, they might otherwise create conflicts with the versions used by Hive. > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-17468.patch, hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and the version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
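As a hedged illustration of the shading/renaming being discussed, a maven-shade-plugin relocation looks like the following; this is a sketch, not the actual Hive pom, and the shaded package prefix is an assumption:

```xml
<!-- Illustrative maven-shade-plugin relocation: repackage druid's jackson
     classes under a private prefix so they cannot conflict with the jackson
     version Hive (or calcite) puts on the classpath. The shadedPattern
     prefix here is an assumption, not taken from the Hive pom. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <relocation>
        <pattern>com.fasterxml.jackson</pattern>
        <shadedPattern>org.apache.hive.druid.com.fasterxml.jackson</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

With a relocation in place, the uber jar carries its own renamed jackson 2.4.x classes, so excluding or including jackson elsewhere no longer changes what druid code actually loads.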
[jira] [Updated] (HIVE-17456) Set current database for external LLAP interface
[ https://issues.apache.org/jira/browse/HIVE-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17456: -- Status: Patch Available (was: Open) > Set current database for external LLAP interface > > > Key: HIVE-17456 > URL: https://issues.apache.org/jira/browse/HIVE-17456 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17456.1.patch, HIVE-17456.2.patch > > > Currently the query passed in to external LLAP client has the default DB as > the current database. > Allow user to specify a different current database. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17455) External LLAP client: connection to HS2 should be kept open until explicitly closed
[ https://issues.apache.org/jira/browse/HIVE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17455: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master > External LLAP client: connection to HS2 should be kept open until explicitly > closed > --- > > Key: HIVE-17455 > URL: https://issues.apache.org/jira/browse/HIVE-17455 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Fix For: 3.0.0 > > Attachments: HIVE-17455.1.patch, HIVE-17455.2.patch, > HIVE-17455.3.patch > > > In the case that a complex query (aggregation/join) is passed to external > LLAP client, the query result is first saved as a Hive temp table before > being read by LLAP to client. Currently the HS2 connection used to fetch the > LLAP splits is closed right after the splits are fetched, which means the > temp table is gone by the time LLAP tries to read it. > Try to keep the connection open so that the table is still around when LLAP > tries to read it. Add close methods which can be used to close the connection > when the client is done with the query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17455) External LLAP client: connection to HS2 should be kept open until explicitly closed
[ https://issues.apache.org/jira/browse/HIVE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156090#comment-16156090 ] Hive QA commented on HIVE-17455: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885642/HIVE-17455.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11028 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testWriteSetTracking1 (batchId=282) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6697/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6697/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6697/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12885642 - PreCommit-HIVE-Build > External LLAP client: connection to HS2 should be kept open until explicitly > closed > --- > > Key: HIVE-17455 > URL: https://issues.apache.org/jira/browse/HIVE-17455 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17455.1.patch, HIVE-17455.2.patch, > HIVE-17455.3.patch > > > In the case that a complex query (aggregation/join) is passed to external > LLAP client, the query result is first saved as a Hive temp table before > being read by LLAP to client. Currently the HS2 connection used to fetch the > LLAP splits is closed right after the splits are fetched, which means the > temp table is gone by the time LLAP tries to read it. > Try to keep the connection open so that the table is still around when LLAP > tries to read it. Add close methods which can be used to close the connection > when the client is done with the query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-17468: -- Attachment: HIVE-17468.patch > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-17468.patch, hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and the version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-17468: -- Status: Patch Available (was: Open) > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and the version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra reassigned HIVE-17468: - Assignee: Jesus Camacho Rodriguez > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and the version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17468) Shade and package appropriate jackson version for druid storage handler
[ https://issues.apache.org/jira/browse/HIVE-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-17468: -- Attachment: hive-druid-deps.txt > Shade and package appropriate jackson version for druid storage handler > --- > > Key: HIVE-17468 > URL: https://issues.apache.org/jira/browse/HIVE-17468 > Project: Hive > Issue Type: Bug >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: hive-druid-deps.txt > > > Currently we are excluding all the jackson core dependencies coming from > druid. This is wrong in my opinion, since it will lead to the packaging of > unwanted jackson libraries from other projects. > As you can see in the file hive-druid-deps.txt, jackson core is currently coming > from calcite, and the version is 2.6.3, which is very different from the 2.4.6 used > by druid. This patch excludes the unwanted jars and makes sure to bring in > the jackson dependency from druid itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17459) View deletion operation failed to replicate on target cluster
[ https://issues.apache.org/jira/browse/HIVE-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156059#comment-16156059 ] Thejas M Nair commented on HIVE-17459: -- [~taoli-hwx] Can you also please add a unit test ? > View deletion operation failed to replicate on target cluster > - > > Key: HIVE-17459 > URL: https://issues.apache.org/jira/browse/HIVE-17459 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-17459.1.patch > > > View dropping is not replicated during incremental repl. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17431) change configuration handling in TezSessionState
[ https://issues.apache.org/jira/browse/HIVE-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156046#comment-16156046 ] Siddharth Seth commented on HIVE-17431: --- {code} refreshLocalResourcesFromConf(conf); {code} in openInternal seems to be a potential problem area. Either it is missing LRs for the new session, or this code should not exist anymore. For the most part, I suspect some of the other parameters in this class can be made final as well. Unrelated to the patch: - There are places where the queue apparently gets changed from TezSessionPool. I didn't know a single SessionState could be moved across queues; it seems unnecessary. - replaceSession - maybe simpler to move the implementation into TezSessionState itself. e.g. additionLocalResourcesNotFromConf is fetched and then passed back into the open method... > change configuration handling in TezSessionState > > > Key: HIVE-17431 > URL: https://issues.apache.org/jira/browse/HIVE-17431 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17431.patch > > > The configuration is only set when opening the session; that seems > unnecessary - it could be set in the ctor and made final. E.g. when updating > the session and localizing new resources we may theoretically open the > session with a new config, but we don't update the config and only update the > files if the session is already open, which seems to imply that it's ok to > not update the config. > In most cases, the session is opened only once or reopened without intending > to change the config (e.g. if it times out). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17455) External LLAP client: connection to HS2 should be kept open until explicitly closed
[ https://issues.apache.org/jira/browse/HIVE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17455: -- Attachment: HIVE-17455.3.patch adding some comments > External LLAP client: connection to HS2 should be kept open until explicitly > closed > --- > > Key: HIVE-17455 > URL: https://issues.apache.org/jira/browse/HIVE-17455 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17455.1.patch, HIVE-17455.2.patch, > HIVE-17455.3.patch > > > In the case that a complex query (aggregation/join) is passed to external > LLAP client, the query result is first saved as a Hive temp table before > being read by LLAP to client. Currently the HS2 connection used to fetch the > LLAP splits is closed right after the splits are fetched, which means the > temp table is gone by the time LLAP tries to read it. > Try to keep the connection open so that the table is still around when LLAP > tries to read it. Add close methods which can be used to close the connection > when the client is done with the query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17429) Hive JDBC doesn't return rows when querying Impala
[ https://issues.apache.org/jira/browse/HIVE-17429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156030#comment-16156030 ] Zach Amsden commented on HIVE-17429: org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] looks like a difference in test output caused by a WARNING line being printed in a different order. org.apache.hadoop.hive.cli.TestAccumuloCliDriver definitely looks like a timeout; this is the last output from Maven: {noformat} [INFO] [INFO] --- maven-surefire-plugin:2.18.1:test (default-test) @ hive-it-qfile-accumulo --- [INFO] Surefire report directory: /home/hiveptest/35.193.110.99-hiveptest-0/apache-github-source-source/itests/qtest-accumulo/target/surefire-reports --- T E S T S --- Running org.apache.hadoop.hive.cli.TestAccumuloCliDriver {noformat} > Hive JDBC doesn't return rows when querying Impala > -- > > Key: HIVE-17429 > URL: https://issues.apache.org/jira/browse/HIVE-17429 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 2.1.0 >Reporter: Zach Amsden >Assignee: Zach Amsden > Fix For: 2.1.0 > > Attachments: HIVE-17429.1.patch, HIVE-17429.2.patch > > > The Hive JDBC driver used to return a result set when querying Impala. Now, > instead, it gets data back but interprets the data as query logs instead of a > resultSet. This causes many issues (we see complaints about beeline as well > as test failures). > This appears to be a regression introduced with asynchronous operation > against Hive. > Ideally, we could make both behaviors work. I have a simple patch that > should fix the problem. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17464) Fix to be able to disable max shuffle size DHJ config
[ https://issues.apache.org/jira/browse/HIVE-17464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156005#comment-16156005 ] Hive QA commented on HIVE-17464: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885637/HIVE-17464.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11027 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestTxnCommandsBase - did not produce a TEST-*.xml file (likely timed out) (batchId=280) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=234) org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testWriteSetTracking6 (batchId=282) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6696/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6696/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6696/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12885637 - PreCommit-HIVE-Build > Fix to be able to disable max shuffle size DHJ config > - > > Key: HIVE-17464 > URL: https://issues.apache.org/jira/browse/HIVE-17464 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-17464.patch > > > Setting {{hive.auto.convert.join.shuffle.max.size}} to -1 does not work as > expected. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17467) HCatClient APIs for discovering partition key-values
[ https://issues.apache.org/jira/browse/HIVE-17467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17467: Attachment: HIVE-17467.1.patch > HCatClient APIs for discovering partition key-values > > > Key: HIVE-17467 > URL: https://issues.apache.org/jira/browse/HIVE-17467 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Metastore >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-17467.1.patch > > > This is a followup to HIVE-17466, which adds the {{HiveMetaStore}} level call > to retrieve unique combinations of part-key values that satisfy a specified > predicate. > Attached herewith are the {{HCatClient}} APIs that will be used by Apache > Oozie, before launching workflows. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17467) HCatClient APIs for discovering partition key-values
[ https://issues.apache.org/jira/browse/HIVE-17467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan reassigned HIVE-17467: --- > HCatClient APIs for discovering partition key-values > > > Key: HIVE-17467 > URL: https://issues.apache.org/jira/browse/HIVE-17467 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Metastore >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > > This is a followup to HIVE-17466, which adds the {{HiveMetaStore}} level call > to retrieve unique combinations of part-key values that satisfy a specified > predicate. > Attached herewith are the {{HCatClient}} APIs that will be used by Apache > Oozie, before launching workflows. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17387) implement Tez AM registry in Hive
[ https://issues.apache.org/jira/browse/HIVE-17387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17387: Attachment: HIVE-17387.01.patch > implement Tez AM registry in Hive > - > > Key: HIVE-17387 > URL: https://issues.apache.org/jira/browse/HIVE-17387 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17387.01.patch, HIVE-17387.patch > > > Necessary for HS2 HA, to transfer AMs between HS2s, etc. > Helpful for workload management. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17466) Metastore API to list unique partition-key-value combinations
[ https://issues.apache.org/jira/browse/HIVE-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17466: Status: Patch Available (was: Open) > Metastore API to list unique partition-key-value combinations > - > > Key: HIVE-17466 > URL: https://issues.apache.org/jira/browse/HIVE-17466 > Project: Hive > Issue Type: New Feature > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-17466.1.patch > > > Raising this on behalf of [~thiruvel], who wrote this initially as part of a > tangential "data-discovery" system. > Programs like Apache Oozie, Apache Falcon (or Yahoo GDM), etc. launch > workflows based on the availability of table/partitions. Partitions are > currently discovered by listing partitions using (what boils down to) > {{HiveMetaStoreClient.listPartitions()}}. This can be slow and cumbersome, > given that {{Partition}} objects are heavyweight and carry redundant > information. The alternative is to use partition-names, which will need > client-side parsing to extract part-key values. > When checking which hourly partitions for a particular day have been > published already, it would be preferable to have an API that pushed down > part-key extraction into the {{RawStore}} layer, and returned key-values as > the result. This would be similar to how {{SELECT DISTINCT part_key FROM > my_table;}} would run, but at the {{HiveMetaStoreClient}} level. > Here's what we've been using at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17466) Metastore API to list unique partition-key-value combinations
[ https://issues.apache.org/jira/browse/HIVE-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17466: Attachment: HIVE-17466.1.patch > Metastore API to list unique partition-key-value combinations > - > > Key: HIVE-17466 > URL: https://issues.apache.org/jira/browse/HIVE-17466 > Project: Hive > Issue Type: New Feature > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-17466.1.patch > > > Raising this on behalf of [~thiruvel], who wrote this initially as part of a > tangential "data-discovery" system. > Programs like Apache Oozie, Apache Falcon (or Yahoo GDM), etc. launch > workflows based on the availability of table/partitions. Partitions are > currently discovered by listing partitions using (what boils down to) > {{HiveMetaStoreClient.listPartitions()}}. This can be slow and cumbersome, > given that {{Partition}} objects are heavyweight and carry redundant > information. The alternative is to use partition-names, which will need > client-side parsing to extract part-key values. > When checking which hourly partitions for a particular day have been > published already, it would be preferable to have an API that pushed down > part-key extraction into the {{RawStore}} layer, and returned key-values as > the result. This would be similar to how {{SELECT DISTINCT part_key FROM > my_table;}} would run, but at the {{HiveMetaStoreClient}} level. > Here's what we've been using at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17466) Metastore API to list unique partition-key-value combinations
[ https://issues.apache.org/jira/browse/HIVE-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan reassigned HIVE-17466: --- > Metastore API to list unique partition-key-value combinations > - > > Key: HIVE-17466 > URL: https://issues.apache.org/jira/browse/HIVE-17466 > Project: Hive > Issue Type: New Feature > Components: Metastore >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Thiruvel Thirumoolan > > Raising this on behalf of [~thiruvel], who wrote this initially as part of a > tangential "data-discovery" system. > Programs like Apache Oozie, Apache Falcon (or Yahoo GDM), etc. launch > workflows based on the availability of table/partitions. Partitions are > currently discovered by listing partitions using (what boils down to) > {{HiveMetaStoreClient.listPartitions()}}. This can be slow and cumbersome, > given that {{Partition}} objects are heavyweight and carry redundant > information. The alternative is to use partition-names, which will need > client-side parsing to extract part-key values. > When checking which hourly partitions for a particular day have been > published already, it would be preferable to have an API that pushed down > part-key extraction into the {{RawStore}} layer, and returned key-values as > the result. This would be similar to how {{SELECT DISTINCT part_key FROM > my_table;}} would run, but at the {{HiveMetaStoreClient}} level. > Here's what we've been using at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17459) View deletion operation failed to replicate on target cluster
[ https://issues.apache.org/jira/browse/HIVE-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-17459: -- Status: Patch Available (was: Open) > View deletion operation failed to replicate on target cluster > - > > Key: HIVE-17459 > URL: https://issues.apache.org/jira/browse/HIVE-17459 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-17459.1.patch > > > View dropping is not replicated during incremental repl. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17459) View deletion operation failed to replicate on target cluster
[ https://issues.apache.org/jira/browse/HIVE-17459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-17459: -- Attachment: HIVE-17459.1.patch > View deletion operation failed to replicate on target cluster > - > > Key: HIVE-17459 > URL: https://issues.apache.org/jira/browse/HIVE-17459 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-17459.1.patch > > > View dropping is not replicated during incremental repl. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17455) External LLAP client: connection to HS2 should be kept open until explicitly closed
[ https://issues.apache.org/jira/browse/HIVE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155951#comment-16155951 ] Sergey Shelukhin commented on HIVE-17455: - +1. Looks like there's no better way that doesn't involve a lot of work and network calls to know when all the splits are done. > External LLAP client: connection to HS2 should be kept open until explicitly > closed > --- > > Key: HIVE-17455 > URL: https://issues.apache.org/jira/browse/HIVE-17455 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17455.1.patch, HIVE-17455.2.patch > > > In the case that a complex query (aggregation/join) is passed to external > LLAP client, the query result is first saved as a Hive temp table before > being read by LLAP to client. Currently the HS2 connection used to fetch the > LLAP splits is closed right after the splits are fetched, which means the > temp table is gone by the time LLAP tries to read it. > Try to keep the connection open so that the table is still around when LLAP > tries to read it. Add close methods which can be used to close the connection > when the client is done with the query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155930#comment-16155930 ] Chaozhong Yang commented on HIVE-17460: --- For autoColumnStats_5.q, the difference between my results and the original q.out is: < cint < dstring After running `alter table partitioned1 add columns(c int, d string)` and `desc formatted partitioned1 partition(part=1)`, the right result should contain `c` and `d`. Maybe I should re-generate those q.out files that contain wrong results? [~wzheng] > `insert overwrite` should support table schema evolution (e.g. add columns) > --- > > Key: HIVE-17460 > URL: https://issues.apache.org/jira/browse/HIVE-17460 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0, 2.2.0 >Reporter: Chaozhong Yang >Assignee: Chaozhong Yang > Fix For: 3.0.0 > > Attachments: HIVE-17460.2.patch, HIVE-17460.patch > > > In Hive, adding columns into original table is a common use case. However, if > we insert overwrite older partitions after adding columns, added columns will > not be accessed. > ``` > create table src_table( > i int > ) > PARTITIONED BY (`date` string); > insert overwrite table src_table partition(`date`='20170905') values (3); > select * from src_table where `date` = '20170905'; > alter table src_table add columns (bi bigint); > insert overwrite table src_table partition(`date`='20170905') values (3, 5); > select * from src_table where `date` = '20170905'; > ``` > The result will be as follows: > ``` > 3, NULL, '20170905' > ``` > Obviously, it doesn't meet our expectation. The expected result should be: > ``` > 3, 5, '20170905' > ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17393) AMReporter needs to heartbeat every external 'AM'
[ https://issues.apache.org/jira/browse/HIVE-17393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17393: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to master. Thanks for the patch! > AMReporter needs to heartbeat every external 'AM' > > > Key: HIVE-17393 > URL: https://issues.apache.org/jira/browse/HIVE-17393 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Zhiyuan Yang >Assignee: Zhiyuan Yang > Fix For: 3.0.0 > > Attachments: HIVE-17393.1.patch, HIVE-17393.2.patch, > HIVE-17393.3.patch > > > AMReporter only remembers the first AM that submitted the query and heartbeats to it. > In case of an external client, there might be multiple 'AM's, and each of them > needs node heartbeats. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17450) rename TestTxnCommandsBase
[ https://issues.apache.org/jira/browse/HIVE-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-17450: -- Attachment: HIVE-17450.02.patch Errors should not be related, but running the tests again anyway > rename TestTxnCommandsBase > --- > > Key: HIVE-17450 > URL: https://issues.apache.org/jira/browse/HIVE-17450 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Peter Vary > Attachments: HIVE-17450.02.patch, HIVE-17450.patch > > > TestTxnCommandsBase is an abstract class, added in HIVE-17205; it matches the > maven test pattern...because of that there is a failing test in every test > output -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17450) rename TestTxnCommandsBase
[ https://issues.apache.org/jira/browse/HIVE-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155903#comment-16155903 ] Hive QA commented on HIVE-17450: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12885582/HIVE-17450.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11027 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6695/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6695/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6695/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12885582 - PreCommit-HIVE-Build > rename TestTxnCommandsBase > --- > > Key: HIVE-17450 > URL: https://issues.apache.org/jira/browse/HIVE-17450 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Peter Vary > Attachments: HIVE-17450.patch > > > TestTxnCommandsBase is an abstract class, added in HIVE-17205; it matches the > maven test pattern...because of that there is a failing test in every test > output -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-17465: --- Component/s: Statistics Physical Optimizer > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was 
sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-17465: -- Assignee: Vineet Garg > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message 
was sent by Atlassian JIRA (v6.4.14#64029)
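The progressive reduction the report asks for would follow from the standard independence assumption in cardinality estimation: each additional equality predicate multiplies the estimate by 1/NDV of its column. A minimal sketch (not Hive's actual estimator; the 73049-row total is from the example above, while the NDVs are illustrative assumptions):

```java
// Sketch of progressive row-count estimation under the independent-predicate
// assumption: each equality filter has selectivity 1/NDV(column), so stacking
// filters should keep shrinking the estimate instead of staying flat at 363.
public class FilterEstimate {
    public static long estimate(long totalRows, long... ndvs) {
        double rows = totalRows;
        for (long ndv : ndvs) {
            rows /= ndv;  // equality predicate: selectivity = 1/NDV
        }
        return Math.max(1, Math.round(rows));
    }
}
```

With an assumed NDV(d_year) of 201, {{estimate(73049, 201)}} reproduces the 363-row estimate for the first query; adding assumed NDVs of 12 for d_moy and 31 for d_dom then drops the estimate to roughly 30 and 1 rows, which is the progressive behavior the queries should show.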
[jira] [Comment Edited] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155867#comment-16155867 ] Chaozhong Yang edited comment on HIVE-17460 at 9/6/17 6:41 PM: --- [~wei.zheng] Yes, I have moved that code into alterPartitionSpecInMemory and submitted the patch again. was (Author: debugger87): [~wei.zheng] Yes, I have moved that code into alterPartitionSpecInMemory. > `insert overwrite` should support table schema evolution (e.g. add columns) > --- > > Key: HIVE-17460 > URL: https://issues.apache.org/jira/browse/HIVE-17460 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0, 2.2.0 >Reporter: Chaozhong Yang >Assignee: Chaozhong Yang > Fix For: 3.0.0 > > Attachments: HIVE-17460.2.patch, HIVE-17460.patch > > > In Hive, adding columns into original table is a common use case. However, if > we insert overwrite older partitions after adding columns, added columns will > not be accessed. > ``` > create table src_table( > i int > ) > PARTITIONED BY (`date` string); > insert overwrite table src_table partition(`date`='20170905') values (3); > select * from src_table where `date` = '20170905'; > alter table src_table add columns (bi bigint); > insert overwrite table src_table partition(`date`='20170905') values (3, 5); > select * from src_table where `date` = '20170905'; > ``` > The result will be as follows: > ``` > 3, NULL, '20170905' > ``` > Obviously, it doesn't meet our expectation. The expected result should be: > ``` > 3, 5, '20170905' > ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155867#comment-16155867 ] Chaozhong Yang commented on HIVE-17460: --- [~wei.zheng] Yes, I have moved that code into alterPartitionSpecInMemory. > `insert overwrite` should support table schema evolution (e.g. add columns) > --- > > Key: HIVE-17460 > URL: https://issues.apache.org/jira/browse/HIVE-17460 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0, 2.2.0 >Reporter: Chaozhong Yang >Assignee: Chaozhong Yang > Fix For: 3.0.0 > > Attachments: HIVE-17460.2.patch, HIVE-17460.patch > > > In Hive, adding columns into original table is a common use case. However, if > we insert overwrite older partitions after adding columns, added columns will > not be accessed. > ``` > create table src_table( > i int > ) > PARTITIONED BY (`date` string); > insert overwrite table src_table partition(`date`='20170905') values (3); > select * from src_table where `date` = '20170905'; > alter table src_table add columns (bi bigint); > insert overwrite table src_table partition(`date`='20170905') values (3, 5); > select * from src_table where `date` = '20170905'; > ``` > The result will be as follows: > ``` > 3, NULL, '20170905' > ``` > Obviously, it doesn't meet our expectation. The expected result should be: > ``` > 3, 5, '20170905' > ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17456) Set current database for external LLAP interface
[ https://issues.apache.org/jira/browse/HIVE-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155864#comment-16155864 ] Sergey Shelukhin commented on HIVE-17456: - +1 pending tests > Set current database for external LLAP interface > > > Key: HIVE-17456 > URL: https://issues.apache.org/jira/browse/HIVE-17456 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17456.1.patch, HIVE-17456.2.patch > > > Currently the query passed in to external LLAP client has the default DB as > the current database. > Allow user to specify a different current database. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17456) Set current database for external LLAP interface
[ https://issues.apache.org/jira/browse/HIVE-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155864#comment-16155864 ] Sergey Shelukhin edited comment on HIVE-17456 at 9/6/17 6:40 PM: - +1 pending HiveQA was (Author: sershe): +1 pending tests > Set current database for external LLAP interface > > > Key: HIVE-17456 > URL: https://issues.apache.org/jira/browse/HIVE-17456 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17456.1.patch, HIVE-17456.2.patch > > > Currently the query passed in to external LLAP client has the default DB as > the current database. > Allow user to specify a different current database. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17455) External LLAP client: connection to HS2 should be kept open until explicitly closed
[ https://issues.apache.org/jira/browse/HIVE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155861#comment-16155861 ] Sergey Shelukhin commented on HIVE-17455: - Hm... this seems very error prone, esp. if users don't even set the handle in the jobconf. Perhaps some method (init of some sort? or getSplits itself, since it's called explicitly) should return a closeable encapsulating the handle, so that it could be closed via normal means? > External LLAP client: connection to HS2 should be kept open until explicitly > closed > --- > > Key: HIVE-17455 > URL: https://issues.apache.org/jira/browse/HIVE-17455 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17455.1.patch, HIVE-17455.2.patch > > > In the case that a complex query (aggregation/join) is passed to external > LLAP client, the query result is first saved as a Hive temp table before > being read by LLAP to client. Currently the HS2 connection used to fetch the > LLAP splits is closed right after the splits are fetched, which means the > temp table is gone by the time LLAP tries to read it. > Try to keep the connection open so that the table is still around when LLAP > tries to read it. Add close methods which can be used to close the connection > when the client is done with the query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
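The closeable-handle pattern suggested in the comment could look roughly like this. All names here are hypothetical (the {{Connection}} interface stands in for the real HS2 connection type, and the splits are plain strings); this is a sketch of the pattern, not the actual LlapBaseInputFormat API.

```java
import java.util.List;

// Hypothetical sketch: getSplits() would return an object like this that owns
// the HS2 connection, so try-with-resources closes it "via normal means" when
// the client is done, and the temp table survives until that point.
public class SplitsHandle implements AutoCloseable {
    public interface Connection { void close(); }  // stand-in for the HS2 connection

    private final Connection hs2Connection;
    private final List<String> splits;  // placeholder for real split objects
    private boolean closed = false;

    public SplitsHandle(Connection hs2Connection, List<String> splits) {
        this.hs2Connection = hs2Connection;
        this.splits = splits;
    }

    public List<String> getSplits() { return splits; }

    public boolean isClosed() { return closed; }

    @Override
    public void close() {
        if (!closed) {
            hs2Connection.close();  // the temp table is dropped only here
            closed = true;
        }
    }
}
```

A caller would then write {{try (SplitsHandle h = ...) { process(h.getSplits()); }}}, avoiding the error-prone handle-in-jobconf bookkeeping the comment objects to.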
[jira] [Updated] (HIVE-17456) Set current database for external LLAP interface
[ https://issues.apache.org/jira/browse/HIVE-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17456: -- Attachment: HIVE-17456.2.patch Updated tests > Set current database for external LLAP interface > > > Key: HIVE-17456 > URL: https://issues.apache.org/jira/browse/HIVE-17456 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17456.1.patch, HIVE-17456.2.patch > > > Currently the query passed in to external LLAP client has the default DB as > the current database. > Allow user to specify a different current database. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaozhong Yang updated HIVE-17460: -- Attachment: HIVE-17460.2.patch > `insert overwrite` should support table schema evolution (e.g. add columns) > --- > > Key: HIVE-17460 > URL: https://issues.apache.org/jira/browse/HIVE-17460 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0, 2.2.0 >Reporter: Chaozhong Yang >Assignee: Chaozhong Yang > Fix For: 3.0.0 > > Attachments: HIVE-17460.2.patch, HIVE-17460.patch > > > In Hive, adding columns into original table is a common use case. However, if > we insert overwrite older partitions after adding columns, added columns will > not be accessed. > ``` > create table src_table( > i int > ) > PARTITIONED BY (`date` string); > insert overwrite table src_table partition(`date`='20170905') values (3); > select * from src_table where `date` = '20170905'; > alter table src_table add columns (bi bigint); > insert overwrite table src_table partition(`date`='20170905') values (3, 5); > select * from src_table where `date` = '20170905'; > ``` > The result will be as follows: > ``` > 3, NULL, '20170905' > ``` > Obviously, it doesn't meet our expectation. The expected result should be: > ``` > 3, 5, '20170905' > ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17460) `insert overwrite` should support table schema evolution (e.g. add columns)
[ https://issues.apache.org/jira/browse/HIVE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155845#comment-16155845 ] Wei Zheng commented on HIVE-17460: -- Some existing q.out files are wrong, but I noticed some other failures, e.g. autoColumnStats_5.q. I suggest you try moving the fix into alterPartitionSpecInMemory, under the "if (inheritTableSpecs)" block and have another test run. > `insert overwrite` should support table schema evolution (e.g. add columns) > --- > > Key: HIVE-17460 > URL: https://issues.apache.org/jira/browse/HIVE-17460 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0, 2.2.0 >Reporter: Chaozhong Yang >Assignee: Chaozhong Yang > Fix For: 3.0.0 > > Attachments: HIVE-17460.2.patch, HIVE-17460.patch > > > In Hive, adding columns into original table is a common use case. However, if > we insert overwrite older partitions after adding columns, added columns will > not be accessed. > ``` > create table src_table( > i int > ) > PARTITIONED BY (`date` string); > insert overwrite table src_table partition(`date`='20170905') values (3); > select * from src_table where `date` = '20170905'; > alter table src_table add columns (bi bigint); > insert overwrite table src_table partition(`date`='20170905') values (3, 5); > select * from src_table where `date` = '20170905'; > ``` > The result will be as follows: > ``` > 3, NULL, '20170905' > ``` > Obviously, it doesn't meet our expectation. The expected result should be: > ``` > 3, 5, '20170905' > ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17455) External LLAP client: connection to HS2 should be kept open until explicitly closed
[ https://issues.apache.org/jira/browse/HIVE-17455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17455: -- Attachment: HIVE-17455.2.patch updated patch to make map final. > External LLAP client: connection to HS2 should be kept open until explicitly > closed > --- > > Key: HIVE-17455 > URL: https://issues.apache.org/jira/browse/HIVE-17455 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17455.1.patch, HIVE-17455.2.patch > > > In the case that a complex query (aggregation/join) is passed to external > LLAP client, the query result is first saved as a Hive temp table before > being read by LLAP to client. Currently the HS2 connection used to fetch the > LLAP splits is closed right after the splits are fetched, which means the > temp table is gone by the time LLAP tries to read it. > Try to keep the connection open so that the table is still around when LLAP > tries to read it. Add close methods which can be used to close the connection > when the client is done with the query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17464) Fix to be able to disable max shuffle size DHJ config
[ https://issues.apache.org/jira/browse/HIVE-17464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17464: --- Attachment: HIVE-17464.patch > Fix to be able to disable max shuffle size DHJ config > - > > Key: HIVE-17464 > URL: https://issues.apache.org/jira/browse/HIVE-17464 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-17464.patch > > > Setting {{hive.auto.convert.join.shuffle.max.size}} to -1 does not work as > expected. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-17464) Fix to be able to disable max shuffle size DHJ config
[ https://issues.apache.org/jira/browse/HIVE-17464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-17464 started by Jesus Camacho Rodriguez. -- > Fix to be able to disable max shuffle size DHJ config > - > > Key: HIVE-17464 > URL: https://issues.apache.org/jira/browse/HIVE-17464 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Setting {{hive.auto.convert.join.shuffle.max.size}} to -1 does not work as > expected. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17464) Fix to be able to disable max shuffle size DHJ config
[ https://issues.apache.org/jira/browse/HIVE-17464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-17464: --- Status: Patch Available (was: In Progress) > Fix to be able to disable max shuffle size DHJ config > - > > Key: HIVE-17464 > URL: https://issues.apache.org/jira/browse/HIVE-17464 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > > Setting {{hive.auto.convert.join.shuffle.max.size}} to -1 does not work as > expected. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17463) ORC: include orc-shims in hive-exec.jar
[ https://issues.apache.org/jira/browse/HIVE-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155807#comment-16155807 ] Owen O'Malley commented on HIVE-17463: -- This is part of upgrading Hive trunk to use the upcoming ORC 1.5.0 release. > ORC: include orc-shims in hive-exec.jar > --- > > Key: HIVE-17463 > URL: https://issues.apache.org/jira/browse/HIVE-17463 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Minor > Attachments: HIVE-17463.1.patch > > > ORC-234 added a new shims module - this needs to be part of hive-exec shading > to use ORC-1.5.x branch in Hive. -- This message was sent by Atlassian JIRA (v6.4.14#64029)