[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Status: Patch Available (was: In Progress) > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.095.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes with which we vectorize AVG, in preparation for fully supporting AVG > GroupBy. In particular, add the PARTIAL2 and FINAL GroupBy modes that take in > the AVG struct as input. And add the COMPLETE mode that takes in the > original data and produces the full aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Status: In Progress (was: Patch Available) > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.095.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes with which we vectorize AVG, in preparation for fully supporting AVG > GroupBy. In particular, add the PARTIAL2 and FINAL GroupBy modes that take in > the AVG struct as input. And add the COMPLETE mode that takes in the > original data and produces the full aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16778) LLAP IO: better refcount management
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027213#comment-16027213 ] Hive QA commented on HIVE-16778: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870176/HIVE-16778.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[named_column_join] (batchId=72) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5455/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5455/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5455/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870176 - PreCommit-HIVE-Build > LLAP IO: better refcount management > --- > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16778.patch, HIVE-16778.patch > > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
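The "better refcount management" direction above can be sketched with a small, hypothetical refcounted buffer (not LLAP's actual buffer class): retain() refuses to resurrect an already-freed buffer, and release() detects over-release instead of silently leaking or double-freeing, so a dying thread's cleanup path stays safe.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical refcounted cache buffer (illustrative, not LLAP's real class).
public class RefCountedBuffer {
    private final AtomicInteger refCount = new AtomicInteger(1);

    // Returns false if the buffer was already fully released.
    public boolean retain() {
        while (true) {
            int c = refCount.get();
            if (c <= 0) {
                return false; // already freed; caller must not use the buffer
            }
            if (refCount.compareAndSet(c, c + 1)) {
                return true;
            }
        }
    }

    // Returns true when this call dropped the last reference.
    public boolean release() {
        int c = refCount.decrementAndGet();
        if (c == 0) {
            return true; // free the underlying memory here
        }
        if (c < 0) {
            throw new IllegalStateException("buffer over-released");
        }
        return false;
    }

    public static void main(String[] args) {
        RefCountedBuffer b = new RefCountedBuffer();
        System.out.println(b.retain());  // true: count 1 -> 2
        System.out.println(b.release()); // false: count 2 -> 1
        System.out.println(b.release()); // true: last reference dropped
        System.out.println(b.retain());  // false: buffer already freed
    }
}
```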
[jira] [Commented] (HIVE-16600) Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in multi_insert cases
[ https://issues.apache.org/jira/browse/HIVE-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027207#comment-16027207 ] liyunzhang_intel commented on HIVE-16600: - [~lirui]: actually the algorithm in HIVE-16600.9.patch is similar to yours. The safe way to judge an order by limit case is: 1. verify whether there is a limit between the current RS and the next RS/FS in the non multi-insert case; 2. verify whether there is a limit between the current RS and the jointOperator in the multi-insert case. Here jointOperator is the operator where the branches start. > Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel > order by in multi_insert cases > > > Key: HIVE-16600 > URL: https://issues.apache.org/jira/browse/HIVE-16600 > Project: Hive > Issue Type: Sub-task >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-16600.1.patch, HIVE-16600.2.patch, > HIVE-16600.3.patch, HIVE-16600.4.patch, HIVE-16600.5.patch, > HIVE-16600.6.patch, HIVE-16600.7.patch, HIVE-16600.8.patch, > HIVE-16600.9.patch, mr.explain, mr.explain.log.HIVE-16600 > > > multi_insert_gby.case.q > {code} > set hive.exec.reducers.bytes.per.reducer=256; > set hive.optimize.sampling.orderby=true; > drop table if exists e1; > drop table if exists e2; > create table e1 (key string, value string); > create table e2 (key string); > FROM (select key, cast(key as double) as keyD, value from src order by key) a > INSERT OVERWRITE TABLE e1 > SELECT key, value > INSERT OVERWRITE TABLE e2 > SELECT key; > select * from e1; > select * from e2; > {code} > the parallelism of Sort is 1 even when we enable parallel order > by ("hive.optimize.sampling.orderby" is set as "true"). 
This is not > reasonable because the parallelism should be calculated by > [Utilities.estimateReducers|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L170] > This is because SetSparkReducerParallelism#needSetParallelism returns false > when the [children size of RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L207] > is greater than 1. > In this case, the children size of {{RS[2]}} is two. > The logical plan of the case: > {code} > TS[0]-SEL[1]-RS[2]-SEL[3]-SEL[4]-FS[5] > -SEL[6]-FS[7] > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16285) Servlet for dynamically configuring log levels
[ https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027206#comment-16027206 ] Lefty Leverenz commented on HIVE-16285: --- Does this need to be documented in the wiki? Here's the logging section: * [Getting Started -- Hive Logging | https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-HiveLogging] > Servlet for dynamically configuring log levels > -- > > Key: HIVE-16285 > URL: https://issues.apache.org/jira/browse/HIVE-16285 > Project: Hive > Issue Type: Improvement > Components: Logging >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 3.0.0 > > Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, > HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, > HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch, HIVE-16285.7.patch > > > Many long running services like HS2, LLAP etc. will benefit from having an > endpoint to dynamically change log levels for various loggers. This will help > greatly with debuggability without requiring a restart of the service. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16760) Update errata.txt for HIVE-16743
[ https://issues.apache.org/jira/browse/HIVE-16760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027198#comment-16027198 ] Lefty Leverenz commented on HIVE-16760: --- [~thejas], shouldn't all changes to the code be tracked in JIRA? I agree that updating errata.txt is a minor change and doesn't need any review, but I'm uneasy about doing it without a JIRA ticket. Since I tend to catch many of these errors, I'd like to give good advice on how to do the updates. Should we have a discussion on the dev@hive mailing list? Or else we could open a JIRA ticket to document errata.txt in the wiki and we could have the discussion in the comments. > Update errata.txt for HIVE-16743 > > > Key: HIVE-16760 > URL: https://issues.apache.org/jira/browse/HIVE-16760 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-16760.patch > > > Refer to: > https://issues.apache.org/jira/browse/HIVE-16743?focusedCommentId=16024139=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16024139 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16600) Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in multi_insert cases
[ https://issues.apache.org/jira/browse/HIVE-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027194#comment-16027194 ] Rui Li commented on HIVE-16600: --- [~kellyzly], one example of non multi insert having branches is dynamic partition pruning. I agree such cases may also be eligible for parallel order by, but that needs further investigation. So let's limit the scope to multi insert here. Does that make sense? > Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel > order by in multi_insert cases > > > Key: HIVE-16600 > URL: https://issues.apache.org/jira/browse/HIVE-16600 > Project: Hive > Issue Type: Sub-task >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-16600.1.patch, HIVE-16600.2.patch, > HIVE-16600.3.patch, HIVE-16600.4.patch, HIVE-16600.5.patch, > HIVE-16600.6.patch, HIVE-16600.7.patch, HIVE-16600.8.patch, > HIVE-16600.9.patch, mr.explain, mr.explain.log.HIVE-16600 > > > multi_insert_gby.case.q > {code} > set hive.exec.reducers.bytes.per.reducer=256; > set hive.optimize.sampling.orderby=true; > drop table if exists e1; > drop table if exists e2; > create table e1 (key string, value string); > create table e2 (key string); > FROM (select key, cast(key as double) as keyD, value from src order by key) a > INSERT OVERWRITE TABLE e1 > SELECT key, value > INSERT OVERWRITE TABLE e2 > SELECT key; > select * from e1; > select * from e2; > {code} > the parallelism of Sort is 1 even we enable parallel order > by("hive.optimize.sampling.orderby" is set as "true"). 
This is not > reasonable because the parallelism should be calculated by > [Utilities.estimateReducers|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L170] > This is because SetSparkReducerParallelism#needSetParallelism returns false > when the [children size of RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L207] > is greater than 1. > In this case, the children size of {{RS[2]}} is two. > The logical plan of the case: > {code} > TS[0]-SEL[1]-RS[2]-SEL[3]-SEL[4]-FS[5] > -SEL[6]-FS[7] > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16743) BitSet set() is incorrectly used in TxnUtils.createValidCompactTxnList()
[ https://issues.apache.org/jira/browse/HIVE-16743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027192#comment-16027192 ] Lefty Leverenz commented on HIVE-16743: --- Thanks for updating errata.txt with HIVE-16760, [~wzheng]. > BitSet set() is incorrectly used in TxnUtils.createValidCompactTxnList() > > > Key: HIVE-16743 > URL: https://issues.apache.org/jira/browse/HIVE-16743 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Fix For: 3.0.0 > > Attachments: HIVE-16743.1.patch > > > The second line is problematic > {code} > BitSet bitSet = new BitSet(exceptions.length); > bitSet.set(0, bitSet.length()); // for ValidCompactorTxnList, everything > in exceptions are aborted > {code} > For example, exceptions' length is 2. We declare a BitSet object with initial > size of 2 via the first line above. But that's not the actual size of the > BitSet. So bitSet.length() will still return 0. > The intention of the second line above is to set all the bits to true. This > was not achieved because bitSet.set(0, bitSet.length()) is equivalent to > bitSet.set(0, 0). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
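The pitfall described in HIVE-16743 is easy to reproduce with plain java.util.BitSet; a minimal sketch of the buggy and fixed patterns (the method names here are illustrative, not from TxnUtils):

```java
import java.util.BitSet;

// Reproduces the bug described above: BitSet(n) only pre-sizes capacity,
// while length() is (highest set bit + 1), which is 0 on a fresh BitSet.
public class BitSetPitfall {
    // Buggy pattern: set(0, bitSet.length()) is set(0, 0), a no-op.
    static BitSet buggy(int size) {
        BitSet bitSet = new BitSet(size);
        bitSet.set(0, bitSet.length());
        return bitSet;
    }

    // Fixed pattern: pass the intended number of bits explicitly.
    static BitSet fixed(int size) {
        BitSet bitSet = new BitSet(size);
        bitSet.set(0, size);
        return bitSet;
    }

    public static void main(String[] args) {
        System.out.println(buggy(2).cardinality()); // 0: no bits were set
        System.out.println(fixed(2).cardinality()); // 2: bits 0 and 1 set
    }
}
```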
[jira] [Commented] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027191#comment-16027191 ] Hive QA commented on HIVE-16779: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870179/HIVE-16779.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10784 tests executed *Failed tests:* {noformat} TestThriftCLIServiceWithBinary - did not produce a TEST-*.xml file (likely timed out) (batchId=222) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5454/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5454/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5454/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870179 - PreCommit-HIVE-Build > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch > > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
[ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027188#comment-16027188 ] Lefty Leverenz commented on HIVE-11531: --- Yes please, [~dmarkovitz] -- thanks for offering. > Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise > - > > Key: HIVE-11531 > URL: https://issues.apache.org/jira/browse/HIVE-11531 > Project: Hive > Issue Type: Improvement > Components: CBO >Reporter: Sergey Shelukhin >Assignee: Hui Zheng > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-11531.02.patch, HIVE-11531.03.patch, > HIVE-11531.04.patch, HIVE-11531.05.patch, HIVE-11531.06.patch, > HIVE-11531.07.patch, HIVE-11531.patch, HIVE-11531.WIP.1.patch, > HIVE-11531.WIP.2.patch > > > For any UIs that involve pagination, it is useful to issue queries in the > form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be > paginated (which can be extremely large by itself). At present, ROW_NUMBER > can be used to achieve this effect, but optimizations for LIMIT such as TopN > in ReduceSink do not apply to ROW_NUMBER. We can add first class support for > "skip" to existing limit, or improve ROW_NUMBER for better performance -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used
[ https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027170#comment-16027170 ] Hive QA commented on HIVE-16777: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870164/HIVE-16777.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5453/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5453/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5453/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870164 - PreCommit-HIVE-Build > LLAP: Use separate tokens and UGI instances when an external client is used > --- > > Key: HIVE-16777 > URL: https://issues.apache.org/jira/browse/HIVE-16777 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16777.01.patch > > > Otherwise leads to errors since the token is shared, and there's different > nodes running Umbilical. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027155#comment-16027155 ] Hive QA commented on HIVE-16589: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870161/HIVE-16589.094.patch {color:green}SUCCESS:{color} +1 due to 27 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10787 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=53) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_12] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_15] (batchId=61) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_reduce] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_12] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_distinct_gby] (batchId=158) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_12] (batchId=104) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_15] (batchId=127) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress] (batchId=120) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5452/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5452/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5452/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed 
{noformat} This message is automatically generated. ATTACHMENT ID: 12870161 - PreCommit-HIVE-Build > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes with which we vectorize AVG, in preparation for fully supporting AVG > GroupBy. In particular, add the PARTIAL2 and FINAL GroupBy modes that take in > the AVG struct as input. And add the COMPLETE mode that takes in the > original data and produces the full aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16600) Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in multi_insert cases
[ https://issues.apache.org/jira/browse/HIVE-16600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027139#comment-16027139 ] liyunzhang_intel commented on HIVE-16600: - [~lirui]: the algorithm you provided seems OK except for one point: does "op has branches" mean op.getChildOperators().size() > 1? If there is more than 1 child, is parallel order by not enabled in the non multi-insert case? I don't know whether there is a non multi-insert order by case which contains an operator with more than 1 child. {code} RS rs; Operator op = rs; while (op != null) { if (op instanceof LIM) { return false; } if ((op instanceof RS && op != rs) || op instanceof FS) { return true; } if (op has branches) { return isMultiInsert; } op = op.child; } return true; {code} > Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel > order by in multi_insert cases > > > Key: HIVE-16600 > URL: https://issues.apache.org/jira/browse/HIVE-16600 > Project: Hive > Issue Type: Sub-task >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Attachments: HIVE-16600.1.patch, HIVE-16600.2.patch, > HIVE-16600.3.patch, HIVE-16600.4.patch, HIVE-16600.5.patch, > HIVE-16600.6.patch, HIVE-16600.7.patch, HIVE-16600.8.patch, > HIVE-16600.9.patch, mr.explain, mr.explain.log.HIVE-16600 > > > multi_insert_gby.case.q > {code} > set hive.exec.reducers.bytes.per.reducer=256; > set hive.optimize.sampling.orderby=true; > drop table if exists e1; > drop table if exists e2; > create table e1 (key string, value string); > create table e2 (key string); > FROM (select key, cast(key as double) as keyD, value from src order by key) a > INSERT OVERWRITE TABLE e1 > SELECT key, value > INSERT OVERWRITE TABLE e2 > SELECT key; > select * from e1; > select * from e2; > {code} > the parallelism of Sort is 1 even when we enable parallel order > by ("hive.optimize.sampling.orderby" is set as "true"). 
This is not > reasonable because the parallelism should be calculated by > [Utilities.estimateReducers|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L170] > This is because SetSparkReducerParallelism#needSetParallelism returns false > when the [children size of RS|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SetSparkReducerParallelism.java#L207] > is greater than 1. > In this case, the children size of {{RS[2]}} is two. > The logical plan of the case: > {code} > TS[0]-SEL[1]-RS[2]-SEL[3]-SEL[4]-FS[5] > -SEL[6]-FS[7] > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
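The pseudocode discussed in this thread can be made executable with minimal stand-ins for Hive's operator tree (the Op/RS/LIM/FS classes below are illustrative only, not the real Operator API):

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-ins for Hive's operator tree to exercise the walk above.
public class ParallelOrderByCheck {
    static class Op {
        final List<Op> children = new ArrayList<>();
        Op child() { return children.isEmpty() ? null : children.get(0); }
    }
    static class RS extends Op {}   // ReduceSink
    static class LIM extends Op {}  // Limit
    static class FS extends Op {}   // FileSink

    // Walk down from the ReduceSink: a LIMIT disables parallel order by,
    // reaching another RS or an FS allows it, and a branch point is only
    // considered safe in the multi-insert case.
    static boolean allowParallel(RS rs, boolean isMultiInsert) {
        Op op = rs;
        while (op != null) {
            if (op instanceof LIM) { return false; }
            if ((op instanceof RS && op != rs) || op instanceof FS) { return true; }
            if (op.children.size() > 1) { return isMultiInsert; }
            op = op.child();
        }
        return true;
    }

    public static void main(String[] args) {
        // RS followed by a node that branches into two FileSinks, as in the
        // multi-insert plan TS-SEL-RS with two SEL-FS branches.
        RS rs = new RS();
        Op branch = new Op();
        branch.children.add(new FS());
        branch.children.add(new FS());
        rs.children.add(branch);
        System.out.println(allowParallel(rs, true));  // true: multi-insert
        System.out.println(allowParallel(rs, false)); // false: branch without multi-insert
    }
}
```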
[jira] [Updated] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
[ https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16654: --- Status: Open (was: Patch Available) > Optimize a combination of avg(), sum(), count(distinct) etc > --- > > Key: HIVE-16654 > URL: https://issues.apache.org/jira/browse/HIVE-16654 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, > HIVE-16654.03.patch, HIVE-16654.04.patch > > > an example rewrite for q28 of tpcds is > {code} > (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD > from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as > CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from > store_sales where > ss_list_price is not null and ss_quantity between 0 and 5 > and (ss_list_price between 11 and 11+10 > or ss_coupon_amt between 460 and 460+1000 > or ss_wholesale_cost between 14 and 14+20) > group by ss_list_price) ss0) ss1) B1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
[ https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16654: --- Attachment: HIVE-16654.04.patch > Optimize a combination of avg(), sum(), count(distinct) etc > --- > > Key: HIVE-16654 > URL: https://issues.apache.org/jira/browse/HIVE-16654 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, > HIVE-16654.03.patch, HIVE-16654.04.patch > > > an example rewrite for q28 of tpcds is > {code} > (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD > from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as > CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from > store_sales where > ss_list_price is not null and ss_quantity between 0 and 5 > and (ss_list_price between 11 and 11+10 > or ss_coupon_amt between 460 and 460+1000 > or ss_wholesale_cost between 14 and 14+20) > group by ss_list_price) ss0) ss1) B1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
[ https://issues.apache.org/jira/browse/HIVE-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16654: --- Status: Patch Available (was: Open) > Optimize a combination of avg(), sum(), count(distinct) etc > --- > > Key: HIVE-16654 > URL: https://issues.apache.org/jira/browse/HIVE-16654 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16654.01.patch, HIVE-16654.02.patch, > HIVE-16654.03.patch, HIVE-16654.04.patch > > > an example rewrite for q28 of tpcds is > {code} > (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD > from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as > CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from > store_sales where > ss_list_price is not null and ss_quantity between 0 and 5 > and (ss_list_price between 11 and 11+10 > or ss_coupon_amt between 460 and 460+1000 > or ss_wholesale_cost between 14 and 14+20) > group by ss_list_price) ss0) ss1) B1 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-16779: -- Attachment: (was: HIVE-16779.2.patch) > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch > > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027115#comment-16027115 ] Thejas M Nair commented on HIVE-16779: -- +1 minor nit : LOG.error("Error shutting down RawStore",e ); can you add a space before "e" and remove space after e :) > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch > > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-16779: -- Attachment: HIVE-16779.2.patch > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch > > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027114#comment-16027114 ] Hive QA commented on HIVE-16771: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870160/HIVE-16771.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udtf_replicate_rows] (batchId=79) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5451/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5451/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5451/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870160 - PreCommit-HIVE-Build > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch, HIVE-16771.02.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. 
In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-16779: -- Attachment: HIVE-16779.2.patch Sounds good. Attach HIVE-16779.2.patch. > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16779.1.patch, HIVE-16779.2.patch > > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027109#comment-16027109 ] Thejas M Nair commented on HIVE-16779: -- Add it to finally block ? {code} finally { try { rawStore.shutdown(); } catch (Exception e){ LOG.error("Error shutting down RawStore",e ); } } {code} > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16779.1.patch > > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
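The shutdown-in-finally pattern Thejas suggests above can be sketched in isolation: `shutdown()` runs in a `finally` block, so the per-iteration store is released even when the refresh body throws. The sketch below is self-contained Java using a hypothetical `Store` interface standing in for Hive's `RawStore`; it illustrates the pattern, not the actual CachedStore code.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefreshSketch {
    // Tracks how many stores are open; a leak would leave this above zero.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for the metastore RawStore; not Hive's actual API.
    interface Store {
        void refreshCaches() throws Exception;
        void shutdown();
    }

    static Store newStore(boolean failing) {
        OPEN.incrementAndGet();
        return new Store() {
            public void refreshCaches() throws Exception {
                if (failing) throw new Exception("refresh failed");
            }
            public void shutdown() { OPEN.decrementAndGet(); }
        };
    }

    // One refresh iteration: shutdown() runs on both the success and error paths.
    static void refreshOnce(boolean failing) {
        Store store = newStore(failing);
        try {
            store.refreshCaches();
        } catch (Exception e) {
            System.err.println("refresh error: " + e.getMessage());
        } finally {
            try {
                store.shutdown();
            } catch (Exception e) {
                System.err.println("Error shutting down store: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        refreshOnce(false);
        refreshOnce(true);
        System.out.println("open stores after refreshes: " + OPEN.get()); // 0 if no leak
    }
}
```

The nested try/catch around `shutdown()` matches the quoted suggestion: a failure during cleanup is logged rather than masking the original exception.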
[jira] [Commented] (HIVE-16761) LLAP IO: SMB joins fail elevator
[ https://issues.apache.org/jira/browse/HIVE-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027103#comment-16027103 ] Sergey Shelukhin commented on HIVE-16761: - The call is in next {noformat} nextValue(batch.cols[i], rowInBatch, schema.get(i), getStructCol(value, i))) {noformat} Schema is created from the vrbCtx {noformat} schema = Lists.newArrayList(vrbCtx.getRowColumnTypeInfos()); {noformat} The ctx is the same one passed from the LlapReader... created via "LlapInputFormat.createFakeVrbCtx(mapWork);" for the non-vectorized map work case, as I assume is the case here. I suspect the problem is that the latter is incorrect for this case. {noformat} static VectorizedRowBatchCtx createFakeVrbCtx(MapWork mapWork) throws HiveException { // This is based on Vectorizer code, minus the validation. // Add all non-virtual columns from the TableScan operator. RowSchema rowSchema = findTsOp(mapWork).getSchema(); final List<String> colNames = new ArrayList<String>(rowSchema.getSignature().size()); final List<TypeInfo> colTypes = new ArrayList<TypeInfo>(rowSchema.getSignature().size()); for (ColumnInfo c : rowSchema.getSignature()) { String columnName = c.getInternalName(); if (VirtualColumn.VIRTUAL_COLUMN_NAMES.contains(columnName)) continue; colNames.add(columnName); colTypes.add(TypeInfoUtils.getTypeInfoFromTypeString(c.getTypeName())); } // Determine the partition columns using the first partition descriptor. // Note - like vectorizer, this assumes partition columns go after data columns. 
int partitionColumnCount = 0; Iterator<Path> paths = mapWork.getPathToAliases().keySet().iterator(); if (paths.hasNext()) { PartitionDesc partDesc = mapWork.getPathToPartitionInfo().get(paths.next()); if (partDesc != null) { LinkedHashMap<String, String> partSpec = partDesc.getPartSpec(); if (partSpec != null && partSpec.isEmpty()) { partitionColumnCount = partSpec.size(); } } } return new VectorizedRowBatchCtx(colNames.toArray(new String[colNames.size()]), colTypes.toArray(new TypeInfo[colTypes.size()]), null, partitionColumnCount, new String[0]); } {noformat} [~jdere] [~gopalv] does SMB join do something special wrt columns? Also, I see a bug right there with partition column count. I wonder if that could be related... > LLAP IO: SMB joins fail elevator > - > > Key: HIVE-16761 > URL: https://issues.apache.org/jira/browse/HIVE-16761 > Project: Hive > Issue Type: Bug >Reporter: Gopal V > > {code} > Caused by: java.io.IOException: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:153) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:78) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 26 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextString(BatchToRowReader.java:334) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextValue(BatchToRowReader.java:602) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:149) > ... 
28 more > {code} > {code} > set hive.enforce.sortmergebucketmapjoin=false; > set hive.optimize.bucketmapjoin=true; > set hive.optimize.bucketmapjoin.sortedmerge=true; > set hive.auto.convert.sortmerge.join=true; > set hive.auto.convert.join=true; > set hive.auto.convert.join.noconditionaltask.size=500; > select year,quarter,count(*) from transactions_raw_orc_200 a join > customer_accounts_orc_200 b on a.account_id=b.account_id group by > year,quarter; > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
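The partition-column bug Sergey flags in the comment above is the condition `partSpec != null && partSpec.isEmpty()`: the count is copied only when the map is *empty*, so `partitionColumnCount` is always 0 and partitioned inputs lose their partition columns. A minimal stand-alone reproduction of just that condition (plain `java.util` maps, no Hive types; the "fixed" variant is the presumably intended logic, not the committed patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartCountSketch {
    // Two partition columns, as in a table partitioned by (year, quarter).
    static Map<String, String> sampleSpec() {
        Map<String, String> spec = new LinkedHashMap<String, String>();
        spec.put("year", "2017");
        spec.put("quarter", "q1");
        return spec;
    }

    // Mirrors the condition quoted from createFakeVrbCtx: only an *empty*
    // partSpec ever assigns the count, so the result is always 0.
    static int buggyCount(Map<String, String> partSpec) {
        int partitionColumnCount = 0;
        if (partSpec != null && partSpec.isEmpty()) {
            partitionColumnCount = partSpec.size();
        }
        return partitionColumnCount;
    }

    // Presumably intended logic: count columns when partSpec is non-empty.
    static int fixedCount(Map<String, String> partSpec) {
        int partitionColumnCount = 0;
        if (partSpec != null && !partSpec.isEmpty()) {
            partitionColumnCount = partSpec.size();
        }
        return partitionColumnCount;
    }

    public static void main(String[] args) {
        System.out.println(buggyCount(sampleSpec())); // partition columns silently dropped
        System.out.println(fixedCount(sampleSpec()));
    }
}
```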
[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16778: Attachment: HIVE-16778.patch > LLAP IO: better refcount management > --- > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16778.patch, HIVE-16778.patch > > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16778) LLAP IO: better refcount management
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027079#comment-16027079 ] Sergey Shelukhin commented on HIVE-16778: - [~prasanth_j] can you take a look? https://reviews.apache.org/r/59615/ This improves error handling for refcounts, and also removes the per-column RG arrays (that are the legacy of high-level cache). > LLAP IO: better refcount management > --- > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16778.patch > > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
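The refcount hardening described here can be illustrated with a small defensive counter: an over-release throws instead of silently corrupting state, use-after-free fails fast, and readers release in `finally` so a dying task cannot strand a reference. This is a hypothetical sketch of the general technique, not LLAP's actual buffer classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical cache buffer; not LLAP's actual LlapDataBuffer.
public class RefCountSketch {
    final AtomicInteger refCount = new AtomicInteger();
    volatile boolean freed = false;

    int incRef() {
        if (freed) throw new IllegalStateException("incRef on freed buffer");
        return refCount.incrementAndGet();
    }

    // Returns true when this call dropped the last reference.
    boolean decRef() {
        int v = refCount.decrementAndGet();
        if (v < 0) throw new IllegalStateException("negative refcount: over-release");
        if (v == 0) freed = true;
        return v == 0;
    }

    // Release in finally: a reader that dies mid-read still drops its reference.
    static void readWith(RefCountSketch buf, Runnable body) {
        buf.incRef();
        try {
            body.run();
        } finally {
            buf.decRef();
        }
    }

    public static void main(String[] args) {
        RefCountSketch buf = new RefCountSketch();
        buf.incRef(); // the cache's own reference
        try {
            readWith(buf, () -> { throw new RuntimeException("task cancelled"); });
        } catch (RuntimeException e) {
            System.err.println("reader died: " + e.getMessage());
        }
        buf.decRef(); // cache eviction drops the last reference
        System.out.println("freed without leak: " + buf.freed);
    }
}
```

The `main` path mirrors the scenario in the issue description: a cancelled task's reader throws mid-read, yet the buffer still reaches refcount zero and is freed.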
[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16778: Status: Patch Available (was: Open) > LLAP IO: better refcount management > --- > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16778.patch > > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16778: Attachment: HIVE-16778.patch > LLAP IO: better refcount management > --- > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-16778.patch > > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16778: Summary: LLAP IO: better refcount management (was: LLAP IO: better refcount management I) > LLAP IO: better refcount management > --- > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage
[ https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027072#comment-16027072 ] Hive QA commented on HIVE-15665: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870156/HIVE-15665.02.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10793 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=237) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] (batchId=10) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_17] (batchId=82) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5450/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5450/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5450/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12870156 - PreCommit-HIVE-Build > LLAP: OrcFileMetadata objects in cache can impact heap usage > > > Key: HIVE-15665 > URL: https://issues.apache.org/jira/browse/HIVE-15665 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Rajesh Balamohan >Assignee: Sergey Shelukhin > Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, > HIVE-15665.patch > > > OrcFileMetadata internally has filestats, stripestats etc which are allocated > in heap. On large data sets, this could have an impact on the heap usage and > the memory usage by different executors in LLAP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak
[ https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Ma updated HIVE-16765: Attachment: HIVE-16765-branch-2.3.patch Upload patch for branch-2.3. > ParquetFileReader should be closed to avoid resource leak > - > > Key: HIVE-16765 > URL: https://issues.apache.org/jira/browse/HIVE-16765 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-16765.001.patch, HIVE-16765-branch-2.3.patch > > > ParquetFileReader should be closed to avoid resource leak -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-16779: -- Status: Patch Available (was: Open) > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16779.1.patch > > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used
[ https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027062#comment-16027062 ] Siddharth Seth commented on HIVE-16777: --- True. Multiple across hosts would need to be handled. The client needs to stop creating an umbilical per fragment. > LLAP: Use separate tokens and UGI instances when an external client is used > --- > > Key: HIVE-16777 > URL: https://issues.apache.org/jira/browse/HIVE-16777 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16777.01.patch > > > Otherwise leads to errors since the token is shared, and there's different > nodes running Umbilical. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
[ https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027051#comment-16027051 ] Daniel Dai commented on HIVE-16323: --- Unit tests pass. > HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204 > --- > > Key: HIVE-16323 > URL: https://issues.apache.org/jira/browse/HIVE-16323 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16323.1.patch, HIVE-16323.2.patch, PM_leak.png > > > Hive.loadDynamicPartitions creates threads with new embedded rawstore, but > never close them, thus we leak PersistenceManager one per such thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used
[ https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027050#comment-16027050 ] Sergey Shelukhin edited comment on HIVE-16777 at 5/27/17 12:08 AM: --- +1 as a quick fix... however 1) it would be better to add field to the protocol promising single AM (or not, as in case with external interface); relying on external flag itself is hacky since, potentially, other clients could use external record reader from a single "AM", allowing them to utilize the UGI pool. 2) also, the comment might not be correct w.r.t. it being temporary; even if Spark thing uses a single port per instance, it can still have multiple spark tasks for the same query as far as I understand. That would make this solution permanent :) was (Author: sershe): +1 as a quick fix... however 1) it would be better to add field to the protocol promising single AM (or not, as in case with external interface); relying on external flag itself is hacky. 2) also, the comment might not be correct w.r.t. it being temporary; even if Spark thing uses a single port per instance, it can still have multiple spark tasks for the same query as far as I understand. That would make this solution permanent :) > LLAP: Use separate tokens and UGI instances when an external client is used > --- > > Key: HIVE-16777 > URL: https://issues.apache.org/jira/browse/HIVE-16777 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16777.01.patch > > > Otherwise leads to errors since the token is shared, and there's different > nodes running Umbilical. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used
[ https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027050#comment-16027050 ] Sergey Shelukhin commented on HIVE-16777: - +1 as a quick fix... however 1) it would be better to add field to the protocol promising single AM (or not, as in case with external interface); relying on external flag itself is hacky. 2) also, the comment might not be correct w.r.t. it being temporary; even if Spark thing uses a single port per instance, it can still have multiple spark tasks for the same query as far as I understand. That would make this solution permanent :) > LLAP: Use separate tokens and UGI instances when an external client is used > --- > > Key: HIVE-16777 > URL: https://issues.apache.org/jira/browse/HIVE-16777 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16777.01.patch > > > Otherwise leads to errors since the token is shared, and there's different > nodes running Umbilical. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-16779: -- Attachment: HIVE-16779.1.patch > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16779.1.patch > > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
[ https://issues.apache.org/jira/browse/HIVE-16779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai reassigned HIVE-16779: - > CachedStore refresher leak PersistenceManager resources > --- > > Key: HIVE-16779 > URL: https://issues.apache.org/jira/browse/HIVE-16779 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Daniel Dai >Assignee: Daniel Dai > > See OOM when running CachedStore. We didn't shutdown rawstore in refresh > thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16778) LLAP IO: better refcount management I
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-16778: --- > LLAP IO: better refcount management I > - > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak
[ https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027045#comment-16027045 ] Pengcheng Xiong commented on HIVE-16765: sure. please run ptest before you commit. thanks! > ParquetFileReader should be closed to avoid resource leak > - > > Key: HIVE-16765 > URL: https://issues.apache.org/jira/browse/HIVE-16765 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-16765.001.patch > > > ParquetFileReader should be closed to avoid resource leak -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak
[ https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027040#comment-16027040 ] Ferdinand Xu commented on HIVE-16765: - Hi [~pxiong] can we make this committed to branch 2.3? It's important to the feature Parquet vectorization. Thank you! > ParquetFileReader should be closed to avoid resource leak > - > > Key: HIVE-16765 > URL: https://issues.apache.org/jira/browse/HIVE-16765 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-16765.001.patch > > > ParquetFileReader should be closed to avoid resource leak -- This message was sent by Atlassian JIRA (v6.3.15#6346)
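The fix being requested here — guaranteeing the reader is closed on every path — is what Java's try-with-resources provides for any `Closeable`. The sketch below uses a hypothetical `FakeFileReader` in place of Parquet's `ParquetFileReader` (the patch itself is not shown in this thread); the shape of the fix is the `try (...)` form.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

public class CloseSketch {
    // Counts live readers; a leak would leave this above zero.
    static final AtomicInteger OPEN_READERS = new AtomicInteger();

    // Stand-in for a reader holding file handles / native resources.
    static class FakeFileReader implements Closeable {
        FakeFileReader() { OPEN_READERS.incrementAndGet(); }
        int readFooter() { return 42; }
        @Override public void close() { OPEN_READERS.decrementAndGet(); }
    }

    // try-with-resources closes the reader on both the normal and error paths,
    // including the early return below.
    static int readFooterSafely() {
        try (FakeFileReader reader = new FakeFileReader()) {
            return reader.readFooter();
        }
    }

    public static void main(String[] args) {
        int footer = readFooterSafely();
        System.out.println(footer + ", open readers: " + OPEN_READERS.get());
    }
}
```

For pre-Java-7 code paths, the equivalent is an explicit `close()` in a `finally` block; try-with-resources is preferred because it also handles exceptions thrown by `close()` via suppressed exceptions.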
[jira] [Updated] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used
[ https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-16777: -- Target Version/s: 3.0.0 Status: Patch Available (was: Open) > LLAP: Use separate tokens and UGI instances when an external client is used > --- > > Key: HIVE-16777 > URL: https://issues.apache.org/jira/browse/HIVE-16777 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16777.01.patch > > > Otherwise leads to errors since the token is shared, and there's different > nodes running Umbilical. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used
[ https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-16777: -- Attachment: HIVE-16777.01.patch cc [~sershe] for review. > LLAP: Use separate tokens and UGI instances when an external client is used > --- > > Key: HIVE-16777 > URL: https://issues.apache.org/jira/browse/HIVE-16777 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-16777.01.patch > > > Otherwise leads to errors since the token is shared, and there's different > nodes running Umbilical. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak
[ https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-16765: Affects Version/s: 3.0.0 > ParquetFileReader should be closed to avoid resource leak > - > > Key: HIVE-16765 > URL: https://issues.apache.org/jira/browse/HIVE-16765 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-16765.001.patch > > > ParquetFileReader should be closed to avoid resource leak -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
[ https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027034#comment-16027034 ] Hive QA commented on HIVE-16323: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870134/HIVE-16323.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5448/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5448/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5448/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870134 - PreCommit-HIVE-Build > HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204 > --- > > Key: HIVE-16323 > URL: https://issues.apache.org/jira/browse/HIVE-16323 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16323.1.patch, HIVE-16323.2.patch, PM_leak.png > > > Hive.loadDynamicPartitions creates threads with new embedded rawstore, but > never close them, thus we leak PersistenceManager one per such thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16777) LLAP: Use separate tokens and UGI instances when an external client is used
[ https://issues.apache.org/jira/browse/HIVE-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth reassigned HIVE-16777: - > LLAP: Use separate tokens and UGI instances when an external client is used > --- > > Key: HIVE-16777 > URL: https://issues.apache.org/jira/browse/HIVE-16777 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Assignee: Siddharth Seth > > Otherwise leads to errors since the token is shared, and there's different > nodes running Umbilical. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16765) ParquetFileReader should be closed to avoid resource leak
[ https://issues.apache.org/jira/browse/HIVE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-16765: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to the upstream. Thx for the contribution. > ParquetFileReader should be closed to avoid resource leak > - > > Key: HIVE-16765 > URL: https://issues.apache.org/jira/browse/HIVE-16765 > Project: Hive > Issue Type: Sub-task >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-16765.001.patch > > > ParquetFileReader should be closed to avoid resource leak -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Status: Patch Available (was: In Progress) > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-16771: --- Attachment: HIVE-16771.02.patch Attached a second version of the patch based on my discussion with Naveen. This version passes the connection information needed to the method instead of the connection object. > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch, HIVE-16771.02.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269
[ https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HIVE-16549. -- Resolution: Fixed > Fix an incompatible change in PredicateLeafImpl from HIVE-15269 > --- > > Key: HIVE-16549 > URL: https://issues.apache.org/jira/browse/HIVE-16549 > Project: Hive > Issue Type: Bug > Components: storage-api >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > Attachments: HIVE-16549.patch > > > HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a > configuration object. The configuration object is only used for the new > LiteralDelegates. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Attachment: HIVE-16589.094.patch > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Attachment: (was: HIVE-16589.094.patch) > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026993#comment-16026993 ] Sergio Peña commented on HIVE-16771: I agree with [~ngangam] that passing a Connection object to the interface doesn't guarantee that the implementation will close the connection. But if this is a helper class, shouldn't we allow that contract? > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
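The ownership question debated above — whether a helper that receives a Connection may be trusted to close it — can be sketched with caller-side try-with-resources, which keeps the contract unambiguous no matter what the helper does. This is an illustrative sketch only; `FakeConnection` and `fetchSchemaVersion` are hypothetical stand-ins, not Hive APIs.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ConnectionOwnership {
    // Hypothetical stand-in for java.sql.Connection; records whether close() ran.
    static class FakeConnection implements AutoCloseable {
        final AtomicBoolean closed = new AtomicBoolean(false);
        String querySchemaVersion() { return "3.0.0"; }
        @Override public void close() { closed.set(true); }
    }

    // Hypothetical helper mirroring the discussion: it uses the connection
    // but deliberately does NOT close it — closing is the caller's job.
    static String fetchSchemaVersion(FakeConnection conn) {
        return conn.querySchemaVersion();
    }

    public static void main(String[] args) {
        FakeConnection conn = new FakeConnection();
        String version;
        // Caller owns the connection: try-with-resources guarantees close()
        // even if the helper forgets to close or throws midway.
        try (FakeConnection c = conn) {
            version = fetchSchemaVersion(c);
        }
        System.out.println(version + " closed=" + conn.closed.get());
    }
}
```

With this split, the helper interface never has to promise anything about connection lifetime — the leak concern raised in the review disappears at the call site.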
[jira] [Updated] (HIVE-16285) Servlet for dynamically configuring log levels
[ https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16285: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Test failures are unrelated. Committed to master. Thanks Gopal for the review! > Servlet for dynamically configuring log levels > -- > > Key: HIVE-16285 > URL: https://issues.apache.org/jira/browse/HIVE-16285 > Project: Hive > Issue Type: Improvement > Components: Logging >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 3.0.0 > > Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, > HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, > HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch, HIVE-16285.7.patch > > > Many long running services like HS2, LLAP etc. will benefit from having an > endpoint to dynamically change log levels for various loggers. This will help > greatly with debuggability without requiring a restart of the service. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16285) Servlet for dynamically configuring log levels
[ https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026954#comment-16026954 ] Hive QA commented on HIVE-16285: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870144/HIVE-16285.6.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10788 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5446/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5446/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5446/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870144 - PreCommit-HIVE-Build > Servlet for dynamically configuring log levels > -- > > Key: HIVE-16285 > URL: https://issues.apache.org/jira/browse/HIVE-16285 > Project: Hive > Issue Type: Improvement > Components: Logging >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, > HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, > HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch, HIVE-16285.7.patch > > > Many long running services like HS2, LLAP etc. will benefit from having an > endpoint to dynamically change log levels for various loggers. 
This will help > greatly with debuggability without requiring a restart of the service. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage
[ https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15665: Attachment: HIVE-15665.02.patch Fixing some small refcount issues > LLAP: OrcFileMetadata objects in cache can impact heap usage > > > Key: HIVE-15665 > URL: https://issues.apache.org/jira/browse/HIVE-15665 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Rajesh Balamohan >Assignee: Sergey Shelukhin > Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, > HIVE-15665.patch > > > OrcFileMetadata internally has filestats, stripestats etc which are allocated > in heap. On large data sets, this could have an impact on the heap usage and > the memory usage by different executors in LLAP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16285) Servlet for dynamically configuring log levels
[ https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16285: - Attachment: HIVE-16285.7.patch brought back unrelated changes. > Servlet for dynamically configuring log levels > -- > > Key: HIVE-16285 > URL: https://issues.apache.org/jira/browse/HIVE-16285 > Project: Hive > Issue Type: Improvement > Components: Logging >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, > HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, > HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch, HIVE-16285.7.patch > > > Many long running services like HS2, LLAP etc. will benefit from having an > endpoint to dynamically change log levels for various loggers. This will help > greatly with debuggability without requiring a restart of the service. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Attachment: HIVE-16589.094.patch > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Status: In Progress (was: Patch Available) > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16285) Servlet for dynamically configuring log levels
[ https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026909#comment-16026909 ] Gopal V commented on HIVE-16285: The patch removes several unrelated fields from classes - that is probably a bad idea to do in a functional patch which has nothing to do with those fields (isOuterJoin for instance). Other than that, LGTM - +1. > Servlet for dynamically configuring log levels > -- > > Key: HIVE-16285 > URL: https://issues.apache.org/jira/browse/HIVE-16285 > Project: Hive > Issue Type: Improvement > Components: Logging >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, > HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, > HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch > > > Many long running services like HS2, LLAP etc. will benefit from having an > endpoint to dynamically change log levels for various loggers. This will help > greatly with debuggability without requiring a restart of the service. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-14990) run all tests for MM tables and fix the issues that are found
[ https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-14990: - Attachment: HIVE-14990.19.patch Upload patch 19 for testing. The reason why there were ~4000 tests not being run previously is due to setup phase failure in q_test_init.sql. LOAD command failed with {code} FAILED: SemanticException [Error 10265]: This command is not allowed on an ACID table default.src with a non-ACID transaction manager. {code} Adding the txn manager settings and trying again. > run all tests for MM tables and fix the issues that are found > - > > Key: HIVE-14990 > URL: https://issues.apache.org/jira/browse/HIVE-14990 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Wei Zheng > Fix For: hive-14535 > > Attachments: HIVE-14990.01.patch, HIVE-14990.02.patch, > HIVE-14990.03.patch, HIVE-14990.04.patch, HIVE-14990.04.patch, > HIVE-14990.05.patch, HIVE-14990.05.patch, HIVE-14990.06.patch, > HIVE-14990.06.patch, HIVE-14990.07.patch, HIVE-14990.08.patch, > HIVE-14990.09.patch, HIVE-14990.10.patch, HIVE-14990.10.patch, > HIVE-14990.10.patch, HIVE-14990.12.patch, HIVE-14990.13.patch, > HIVE-14990.14.patch, HIVE-14990.15.patch, HIVE-14990.16.patch, > HIVE-14990.17.patch, HIVE-14990.18.patch, HIVE-14990.19.patch, > HIVE-14990.patch > > > I am running the tests with isMmTable returning true for most tables (except > ACID, temporary tables, views, etc.). > Many tests will fail because of various expected issues with such an > approach; however we can find issues in MM tables from other failures. > Expected failures > 1) All HCat tests (cannot write MM tables via the HCat writer) > 2) Almost all merge tests (alter .. concat is not supported). > 3) Tests that run dfs commands with specific paths (path changes). > 4) Truncate column (not supported). > 5) Describe formatted will have the new table fields in the output (before > merging MM with ACID). 
> 6) Many tests w/explain extended - diff in partition "base file name" (path > changes). > 7) TestTxnCommands - all the conversion tests, as they check for bucket count > using file lists (path changes). > 8) HBase metastore tests because methods are not implemented. > 9) Some load and ExIm tests that export a table and then rely on specific > path for load (path changes). > 10) Bucket map join/etc. - diffs; disabled the optimization for MM tables due > to how it accounts for buckets > 11) rand - different results due to different sequence of processing. > 12) many (not all i.e. not the ones with just one insert) tests that have > stats output, such as file count, for obvious reasons > 13) materialized views, not handled by design - the test check erroneously > makes them "mm", no easy way to tell them apart, I don't want to plumb more > stuff thru just for this test > I'm filing jiras for some test failures that are not obvious and need an > investigation later -- This message was sent by Atlassian JIRA (v6.3.15#6346)
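The SemanticException [Error 10265] quoted above is raised when DML runs against an ACID table without a transactional lock manager configured. A minimal sketch of the session settings a q_test_init.sql-style script would typically prepend (assuming standard Hive ACID configuration; the class and method names here are illustrative only):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TxnManagerSettings {
    // Settings commonly required before DML/LOAD on ACID tables; without a
    // transactional lock manager, Hive raises SemanticException [Error 10265].
    static Map<String, String> acidSessionSettings() {
        Map<String, String> s = new LinkedHashMap<>();
        s.put("hive.support.concurrency", "true");
        s.put("hive.txn.manager", "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager");
        return s;
    }

    // Render them as the SET statements an init script would prepend.
    static String asInitScriptPrefix() {
        StringBuilder sb = new StringBuilder();
        acidSessionSettings().forEach((k, v) ->
            sb.append("set ").append(k).append('=').append(v).append(";\n"));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(asInitScriptPrefix());
    }
}
```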
[jira] [Updated] (HIVE-16285) Servlet for dynamically configuring log levels
[ https://issues.apache.org/jira/browse/HIVE-16285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16285: - Attachment: HIVE-16285.6.patch Rebased. [~gopalv] can you please take a look? > Servlet for dynamically configuring log levels > -- > > Key: HIVE-16285 > URL: https://issues.apache.org/jira/browse/HIVE-16285 > Project: Hive > Issue Type: Improvement > Components: Logging >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-16285.1.patch, HIVE-16285.2.patch, > HIVE-16285.3.patch, HIVE-16285.4.patch, HIVE-16285.5.patch, > HIVE-16285.5.patch, HIVE-16285.6.patch, HIVE-16285.6.patch > > > Many long running services like HS2, LLAP etc. will benefit from having an > endpoint to dynamically change log levels for various loggers. This will help > greatly with debuggability without requiring a restart of the service. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16343) LLAP: Publish YARN's ProcFs based memory usage to metrics for monitoring
[ https://issues.apache.org/jira/browse/HIVE-16343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16343: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Will create a follow up in case of any perf issues. Committed to master. Thanks Sid for the reviews! > LLAP: Publish YARN's ProcFs based memory usage to metrics for monitoring > > > Key: HIVE-16343 > URL: https://issues.apache.org/jira/browse/HIVE-16343 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 3.0.0 > > Attachments: HIVE-16343.1.patch, HIVE-16343.2.patch > > > Publish MemInfo from ProcfsBasedProcessTree to llap metrics. This will be useful > for monitoring and also setting up triggers via JMC. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
[ https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026858#comment-16026858 ] Hive QA commented on HIVE-16323: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870134/HIVE-16323.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10788 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_11] (batchId=237) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5445/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5445/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5445/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870134 - PreCommit-HIVE-Build > HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204 > --- > > Key: HIVE-16323 > URL: https://issues.apache.org/jira/browse/HIVE-16323 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16323.1.patch, HIVE-16323.2.patch, PM_leak.png > > > Hive.loadDynamicPartitions creates threads with new embedded rawstore, but > never closes them; thus we leak one PersistenceManager per such thread. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026792#comment-16026792 ] Vihang Karajgaonkar commented on HIVE-16771: Thanks for the review, [~ngangam]. My comments are inline below. bq. 1) However, it seems a bit odd to have it take a boolean to determine if the query needs quotes or not. Can the implementation detect it without a whole lot of code duplication? The impl should be able to determine the DBTYPE just as easily. In order to fetch the schema version from the DB without providing the connection object, it will need to create its own connection. The information needed and the utility class {{HiveSchemaHelper}} are in the BeeLine module and cannot be used in the Metastore module. bq. 2) The connection and the statement are not closed. This will certainly cause a memory leak and potentially a connection leak to the DB. This is just a refactor of the existing code, so the connection leak possibility was pre-existing. Let me update the patch to fix it. bq. 3) Same with the need to have an active SQL connection passed in. But then is there a better means to do this? We can either pass the username, password, url, driver to the class to make its own connection or just pass the row information from the Version table. Let me know if you have any better ideas. bq. 4) Ideally, the HMS schema version should only be fetched from DB just once. This implementation fetches it every time. Are there scenarios where the value would be changed after initialization that makes it necessary every time? HiveSchemaTool calls it multiple times, e.g. before upgrade and after upgrade to validate if the schema is correct after the update. Not sure if we can cache this information due to this reason. 
> Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026793#comment-16026793 ] Vihang Karajgaonkar commented on HIVE-16771: [~spena] [~stakiar] Can you please take a look as well? > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
[ https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-16323: -- Attachment: HIVE-16323.2.patch Not sure why ptest is testing PM_leak.png. Reattaching the patch with a later timestamp. > HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204 > --- > > Key: HIVE-16323 > URL: https://issues.apache.org/jira/browse/HIVE-16323 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16323.1.patch, HIVE-16323.2.patch, PM_leak.png > > > Hive.loadDynamicPartitions creates threads with new embedded rawstore, but > never closes them; thus we leak one PersistenceManager per such thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
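The leak pattern described in HIVE-16323 — worker threads that open an embedded store and never release it — comes down to missing per-thread cleanup. A minimal sketch of the general fix, releasing each thread's resource in a finally block (`RawStoreStub` is a hypothetical stand-in, not the actual Hive RawStore API):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadResourceCleanup {
    // Counts stores opened but not yet shut down, standing in for
    // JDOPersistenceManagerFactory.pmCache growth.
    static final AtomicInteger LIVE = new AtomicInteger();

    // Hypothetical stand-in for an embedded RawStore.
    static class RawStoreStub {
        RawStoreStub() { LIVE.incrementAndGet(); }
        void loadPartition() { /* per-thread work */ }
        void shutdown() { LIVE.decrementAndGet(); }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> {
                RawStoreStub store = new RawStoreStub();
                try {
                    store.loadPartition();
                } finally {
                    // Without this finally block, each worker leaks one store,
                    // mirroring the pmCache leak described above.
                    store.shutdown();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("live=" + LIVE.get());
    }
}
```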
[jira] [Commented] (HIVE-16644) Hook Change Manager to Insert Overwrite
[ https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026761#comment-16026761 ] Hive QA commented on HIVE-16644: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870124/HIVE-16644.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 10790 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5444/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5444/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5444/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870124 - PreCommit-HIVE-Build > Hook Change Manager to Insert Overwrite > --- > > Key: HIVE-16644 > URL: https://issues.apache.org/jira/browse/HIVE-16644 > Project: Hive > Issue Type: Sub-task > Components: Hive, repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Attachments: HIVE-16644.01.patch > > > For insert overwrite Hive.replaceFiles is called to replace contents of > existing partitions/table. This should trigger move of old files into $CMROOT. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16745) Syntax error in 041-HIVE-16556.mysql.sql script
[ https://issues.apache.org/jira/browse/HIVE-16745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-16745: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Fix has been committed to master for 3.0.0. > Syntax error in 041-HIVE-16556.mysql.sql script > --- > > Key: HIVE-16745 > URL: https://issues.apache.org/jira/browse/HIVE-16745 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-16745.01.patch > > > 041-HIVE-16556.mysql.sql has a syntax error which was introduced with > HIVE-16711 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16767) Update people website with recent changes
[ https://issues.apache.org/jira/browse/HIVE-16767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026696#comment-16026696 ] Yongzhi Chen commented on HIVE-16767: - Mine is right. Thanks > Update people website with recent changes > - > > Key: HIVE-16767 > URL: https://issues.apache.org/jira/browse/HIVE-16767 > Project: Hive > Issue Type: Task > Components: Documentation >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-16767.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16644) Hook Change Manager to Insert Overwrite
[ https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-16644: Status: Patch Available (was: In Progress) > Hook Change Manager to Insert Overwrite > --- > > Key: HIVE-16644 > URL: https://issues.apache.org/jira/browse/HIVE-16644 > Project: Hive > Issue Type: Sub-task > Components: Hive, repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Attachments: HIVE-16644.01.patch > > > For insert overwrite Hive.replaceFiles is called to replace contents of > existing partitions/table. This should trigger move of old files into $CMROOT. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16644) Hook Change Manager to Insert Overwrite
[ https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-16644: Component/s: Hive > Hook Change Manager to Insert Overwrite > --- > > Key: HIVE-16644 > URL: https://issues.apache.org/jira/browse/HIVE-16644 > Project: Hive > Issue Type: Sub-task > Components: Hive, repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Attachments: HIVE-16644.01.patch > > > For insert overwrite Hive.replaceFiles is called to replace contents of > existing partitions/table. This should trigger move of old files into $CMROOT. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16644) Hook Change Manager to Insert Overwrite
[ https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-16644: Description: For insert overwrite Hive.replaceFiles is called to replace contents of existing partitions/table. This should trigger move of old files into $CMROOT. (was: For insert overwrite Hive.moveFile is called to replace contents of existing partitions. This should trigger move of old files into $CMROOT.) > Hook Change Manager to Insert Overwrite > --- > > Key: HIVE-16644 > URL: https://issues.apache.org/jira/browse/HIVE-16644 > Project: Hive > Issue Type: Sub-task > Components: repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Attachments: HIVE-16644.01.patch > > > For insert overwrite Hive.replaceFiles is called to replace contents of > existing partitions/table. This should trigger move of old files into $CMROOT. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16644) Hook Change Manager to Insert Overwrite
[ https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026678#comment-16026678 ] Sankar Hariappan edited comment on HIVE-16644 at 5/26/17 6:59 PM: -- Added 01.patch with below changes. - Added a metastore api cm_recycle to move files to CM path before trashing it. - The destination directory is recycled in Hive.replaceFiles before trashing it. - For insert overwrite case, oldPath and destf inputs to Hive.replaceFiles will be same and the trashing of the files in this directory happens in Hive.replaceFiles. So, no handling required in Hive.moveFile. Request [~anishek] to review the patch! cc [~thejas],[~sushanth] was (Author: sankarh): Added 01.patch with below changes. - Added a metastore api cm_recycle to move files to CM path before trashing it. - The destination directory is recycled in Hive.replaceFiles before trashing it. - For insert overwrite case, oldPath and destf inputs to Hive.replaceFiles will be same and the trashing of the files in this directory happens in Hive.replaceFiles. So, no handling required in Hive.moveFile. > Hook Change Manager to Insert Overwrite > --- > > Key: HIVE-16644 > URL: https://issues.apache.org/jira/browse/HIVE-16644 > Project: Hive > Issue Type: Sub-task > Components: repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Attachments: HIVE-16644.01.patch > > > For insert overwrite Hive.moveFile is called to replace contents of existing > partitions. This should trigger move of old files into $CMROOT. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16644) Hook Change Manager to Insert Overwrite
[ https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan updated HIVE-16644: Attachment: HIVE-16644.01.patch Added 01.patch with the below changes. - Added a metastore API cm_recycle to move files to the CM path before trashing them. - The destination directory is recycled in Hive.replaceFiles before trashing it. - For the insert overwrite case, the oldPath and destf inputs to Hive.replaceFiles will be the same and the trashing of the files in this directory happens in Hive.replaceFiles. So, no handling is required in Hive.moveFile. > Hook Change Manager to Insert Overwrite > --- > > Key: HIVE-16644 > URL: https://issues.apache.org/jira/browse/HIVE-16644 > Project: Hive > Issue Type: Sub-task > Components: repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Attachments: HIVE-16644.01.patch > > > For insert overwrite Hive.moveFile is called to replace contents of existing > partitions. This should trigger move of old files into $CMROOT. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
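The comments above describe a recycle-before-trash flow: before an insert overwrite replaces a directory, the old files are first moved under the CM root ($CMROOT) so replication can still find them. A minimal, self-contained sketch of that pattern, with hypothetical names — this is not Hive's actual cm_recycle API:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;

// Sketch of a change-manager "recycle" step: move every file in the
// directory being overwritten into a CM root before it gets replaced.
public class CmRecycleSketch {
    // Moves all regular files under 'dir' into 'cmRoot'; returns how many moved.
    public static int recycle(Path dir, Path cmRoot) {
        try {
            Files.createDirectories(cmRoot);
            int moved = 0;
            try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
                for (Path f : files) {
                    if (Files.isRegularFile(f)) {
                        Files.move(f, cmRoot.resolve(f.getFileName()),
                                   StandardCopyOption.REPLACE_EXISTING);
                        moved++;
                    }
                }
            }
            return moved;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Self-contained demo: one old file gets recycled before the overwrite.
    public static int demo() {
        try {
            Path dir = Files.createTempDirectory("partition");
            Path cmRoot = Files.createTempDirectory("cmroot");
            Files.writeString(dir.resolve("old_data.orc"), "old bytes");
            return recycle(dir, cmRoot);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("files recycled: " + demo());
    }
}
```

The patch itself does this through a metastore call against HDFS paths; the sketch only shows why the move must happen before the destination directory is trashed.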
[jira] [Commented] (HIVE-16764) Support numeric as same as decimal
[ https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026672#comment-16026672 ] Hive QA commented on HIVE-16764: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870103/HIVE-16764.02.patch {color:green}SUCCESS:{color} +1 due to 92 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10801 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=232) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5443/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5443/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5443/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870103 - PreCommit-HIVE-Build > Support numeric as same as decimal > -- > > Key: HIVE-16764 > URL: https://issues.apache.org/jira/browse/HIVE-16764 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch > > > for example numeric(12,2) -> decimal(12,2) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16644) Hook Change Manager to Insert Overwrite
[ https://issues.apache.org/jira/browse/HIVE-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026668#comment-16026668 ] ASF GitHub Bot commented on HIVE-16644: --- GitHub user sankarh opened a pull request: https://github.com/apache/hive/pull/190 HIVE-16644: Hook Change Manager to Insert Overwrite Change management for insert overwrite to a table or a partition. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sankarh/hive HIVE-16644 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/190.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #190 commit 0e7906c2662cceaf490e11698a77fa1cb8fd90cb Author: Sankar Hariappan Date: 2017-05-24T13:31:17Z HIVE-16644: Hook Change Manager to Insert Overwrite > Hook Change Manager to Insert Overwrite > --- > > Key: HIVE-16644 > URL: https://issues.apache.org/jira/browse/HIVE-16644 > Project: Hive > Issue Type: Sub-task > Components: repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > > For insert overwrite Hive.moveFile is called to replace contents of existing > partitions. This should trigger move of old files into $CMROOT. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16770) Concatinate is not working on Table/Partial Partition level
[ https://issues.apache.org/jira/browse/HIVE-16770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kallam Reddy updated HIVE-16770: Description: Not able to CONCATENATE at Table/Partial partition levels. I have table test partitioned on year, month and date. If I try to concatenate by providing corresponding year, month and date of partition it is working fine, but when I want to concatenate ORC files for all the sub partition corresponding to year and month, it is giving exception. hive> ALTER TABLE test PARTITION (year = '2017', month = '05') CONCATENATE; FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: Partition not found {year=2017, month=05} hive> ALTER TABLE test CONCATENATE; FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: source table test is partitioned but no partition desc found. I am expecting this to trigger concatenate in all available sub partitions. was: Not able to CONCATENATE at Table/Partial partition levels. I have table test partitioned on year, month and date. If I try to concatenate by providing corresponding year, month and date of partition it is working fine, but when I want to concatenate ORC files for all the sub partition corresponding to year and month, it is giving exception. hive> ALTER TABLE test PARTITION (year = '2017', month = '05') CONCATENATE; FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: Partition not found {year=2017, month=05} hive> ALTER TABLE otda_es_orc_p1 CONCATENATE; FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: source table otdadb.otda_es_orc_p1 is partitioned but no partition desc found. I am expecting this to trigger concatenate in all available sub partitions. 
> Concatinate is not working on Table/Partial Partition level > > > Key: HIVE-16770 > URL: https://issues.apache.org/jira/browse/HIVE-16770 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1 > Environment: centOS7 >Reporter: Kallam Reddy > > Not able to CONCATENATE at Table/Partial partition levels. I have table test > partitioned on year, month and date. If I try to concatenate by providing > corresponding year, month and date of partition it is working fine, but when > I want to concatenate ORC files for all the sub partition corresponding to > year and month, it is giving exception. > hive> ALTER TABLE test PARTITION (year = '2017', month = '05') CONCATENATE; > FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: > Partition not found {year=2017, month=05} > hive> ALTER TABLE test CONCATENATE; > FAILED: SemanticException org.apache.hadoop.hive.ql.parse.SemanticException: > source table test is partitioned but no partition desc found. > I am expecting this to trigger concatenate in all available sub partitions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16776) Strange cast behavior for table backed by druid
[ https://issues.apache.org/jira/browse/HIVE-16776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-16776: -- > Strange cast behavior for table backed by druid > --- > > Key: HIVE-16776 > URL: https://issues.apache.org/jira/browse/HIVE-16776 > Project: Hive > Issue Type: Bug > Components: Druid integration >Affects Versions: 3.0.0 >Reporter: slim bouguerra >Assignee: Jesus Camacho Rodriguez > > The following query > {code} > explain select SUBSTRING(`Calcs`.`str0`,CAST(`Calcs`.`int2` AS int), 3) from > `druid_tableau`.`calcs` `Calcs`; > OK > Plan not optimized by CBO. > {code} > fails the cbo with the following exception > {code} org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Wrong > arguments '3': No matching method for class > org.apache.hadoop.hive.ql.udf.UDFSubstr with (string, bigint, int). Po > ssible choices: _FUNC_(binary, int) _FUNC_(binary, int, int) _FUNC_(string, > int) _FUNC_(string, int, int) > at > org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1355) > ~[hive-exec-2.1.0.2.6.0.2-SNAPSHOT.jar:2.1.0.2.6.0.2-SNA > PSHOT]{code}. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16706) Bootstrap REPL DUMP shouldn't fail when a partition is dropped/renamed when dump in progress.
[ https://issues.apache.org/jira/browse/HIVE-16706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026635#comment-16026635 ] ASF GitHub Bot commented on HIVE-16706: --- Github user sankarh closed the pull request at: https://github.com/apache/hive/pull/187 > Bootstrap REPL DUMP shouldn't fail when a partition is dropped/renamed when > dump in progress. > - > > Key: HIVE-16706 > URL: https://issues.apache.org/jira/browse/HIVE-16706 > Project: Hive > Issue Type: Sub-task > Components: Hive, repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > Attachments: HIVE-16706.01.patch > > > Currently, bootstrap REPL DUMP gets the partitions in a batch and then > iterate through it. If any partition is dropped/renamed during iteration, it > may lead to failure/exception. In this case, the partition should be skipped > from dump and also need to ensure no failure of REPL DUMP and the subsequent > incremental dump should ensure the consistent state of the table. > This bug is related to HIVE-16684. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
[ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026629#comment-16026629 ] Dudu Markovitz commented on HIVE-11531: --- [~leftylev], It seems the offset feature is not documented. Would you like me to do it? > Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise > - > > Key: HIVE-11531 > URL: https://issues.apache.org/jira/browse/HIVE-11531 > Project: Hive > Issue Type: Improvement > Components: CBO >Reporter: Sergey Shelukhin >Assignee: Hui Zheng > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-11531.02.patch, HIVE-11531.03.patch, > HIVE-11531.04.patch, HIVE-11531.05.patch, HIVE-11531.06.patch, > HIVE-11531.07.patch, HIVE-11531.patch, HIVE-11531.WIP.1.patch, > HIVE-11531.WIP.2.patch > > > For any UIs that involve pagination, it is useful to issue queries in the > form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be > paginated (which can be extremely large by itself). At present, ROW_NUMBER > can be used to achieve this effect, but optimizations for LIMIT such as TopN > in ReduceSink do not apply to ROW_NUMBER. We can add first class support for > "skip" to existing limit, or improve ROW_NUMBER for better performance -- This message was sent by Atlassian JIRA (v6.3.15#6346)
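For context on the performance gap the issue above describes: the TopN optimization lets a plain LIMIT keep only the N best rows in a bounded heap while streaming, instead of fully sorting as a ROW_NUMBER-based rewrite must. A rough illustration of the bounded-heap idea — this is not Hive's actual ReduceSink code:

```java
import java.util.*;

// Illustrates the TopN idea behind LIMIT pushdown: keep at most N rows in a
// bounded min-heap while streaming, instead of sorting the whole input.
public class TopNSketch {
    // Returns the N largest values of 'input' in descending order.
    public static List<Integer> topN(List<Integer> input, int n) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap
        for (int v : input) {
            heap.offer(v);
            if (heap.size() > n) {
                heap.poll(); // evict the smallest; heap never exceeds N entries
            }
        }
        List<Integer> out = new ArrayList<>(heap);
        out.sort(Comparator.reverseOrder());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(topN(Arrays.asList(5, 1, 9, 3, 7), 3)); // [9, 7, 5]
    }
}
```

For pagination (LIMIT X,Y), the heap must hold X+Y rows and the first X are then skipped, which is why deep offsets stay expensive even with TopN.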
[jira] [Commented] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269
[ https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026581#comment-16026581 ] Prasanth Jayachandran commented on HIVE-16549: -- +1 on the patch > Fix an incompatible change in PredicateLeafImpl from HIVE-15269 > --- > > Key: HIVE-16549 > URL: https://issues.apache.org/jira/browse/HIVE-16549 > Project: Hive > Issue Type: Bug > Components: storage-api >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > Attachments: HIVE-16549.patch > > > HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a > configuration object. The configuration object is only used for the new > LiteralDelegates. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Reopened] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269
[ https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reopened HIVE-16549: -- > Fix an incompatible change in PredicateLeafImpl from HIVE-15269 > --- > > Key: HIVE-16549 > URL: https://issues.apache.org/jira/browse/HIVE-16549 > Project: Hive > Issue Type: Bug > Components: storage-api >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > Attachments: HIVE-16549.patch > > > HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a > configuration object. The configuration object is only used for the new > LiteralDelegates. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16764) Support numeric as same as decimal
[ https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026558#comment-16026558 ] Pengcheng Xiong commented on HIVE-16764: create_merge_compressed and subquery_scalar are shown in the previous failed tests. Cannot reproduce columnstats_part_coltype. query74 should be removed. query14 is flaky. [~ashutoshc], could you please review? Thanks. > Support numeric as same as decimal > -- > > Key: HIVE-16764 > URL: https://issues.apache.org/jira/browse/HIVE-16764 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch > > > for example numeric(12,2) -> decimal(12,2) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16764) Support numeric as same as decimal
[ https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16764: --- Status: Open (was: Patch Available) > Support numeric as same as decimal > -- > > Key: HIVE-16764 > URL: https://issues.apache.org/jira/browse/HIVE-16764 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch > > > for example numeric(12,2) -> decimal(12,2) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16764) Support numeric as same as decimal
[ https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16764: --- Attachment: HIVE-16764.02.patch > Support numeric as same as decimal > -- > > Key: HIVE-16764 > URL: https://issues.apache.org/jira/browse/HIVE-16764 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch > > > for example numeric(12,2) -> decimal(12,2) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16764) Support numeric as same as decimal
[ https://issues.apache.org/jira/browse/HIVE-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16764: --- Status: Patch Available (was: Open) > Support numeric as same as decimal > -- > > Key: HIVE-16764 > URL: https://issues.apache.org/jira/browse/HIVE-16764 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16764.01.patch, HIVE-16764.02.patch > > > for example numeric(12,2) -> decimal(12,2) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269
[ https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-16549: - Attachment: HIVE-16549.patch Here's the patch. > Fix an incompatible change in PredicateLeafImpl from HIVE-15269 > --- > > Key: HIVE-16549 > URL: https://issues.apache.org/jira/browse/HIVE-16549 > Project: Hive > Issue Type: Bug > Components: storage-api >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > Attachments: HIVE-16549.patch > > > HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a > configuration object. The configuration object is only used for the new > LiteralDelegates. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
[ https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026537#comment-16026537 ] Thejas M Nair edited comment on HIVE-16323 at 5/26/17 5:25 PM: --- Looks like this needs to be rebased. was (Author: thejas): +1 pending tests > HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204 > --- > > Key: HIVE-16323 > URL: https://issues.apache.org/jira/browse/HIVE-16323 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16323.1.patch, PM_leak.png > > > Hive.loadDynamicPartitions creates threads with new embedded rawstore, but > never close them, thus we leak PersistenceManager one per such thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
[ https://issues.apache.org/jira/browse/HIVE-16323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026537#comment-16026537 ] Thejas M Nair commented on HIVE-16323: -- +1 pending tests > HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204 > --- > > Key: HIVE-16323 > URL: https://issues.apache.org/jira/browse/HIVE-16323 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-16323.1.patch, PM_leak.png > > > Hive.loadDynamicPartitions creates threads with new embedded rawstore, but > never close them, thus we leak PersistenceManager one per such thread. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
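The leak described in HIVE-16323 above is the classic per-thread resource pattern: worker threads each open an embedded store, and nothing closes it when the task finishes, so one PersistenceManager accumulates per thread. A minimal sketch of the fix shape — close per task — using a hypothetical stand-in class, not Hive's actual RawStore:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the leak pattern from the report: each worker opens a
// per-thread store; without the close-on-exit, every opened store leaks.
public class PerThreadStoreSketch {
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for an embedded RawStore / PersistenceManager.
    static class Store implements AutoCloseable {
        Store() { OPEN.incrementAndGet(); }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    // Runs 'tasks' partition-load jobs, each on its own store, and closes
    // every store when its task finishes (the step the bug omitted).
    public static int runAndClose(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                try (Store store = new Store()) {
                    // ... load one dynamic partition using 'store' ...
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return OPEN.get(); // 0 when every store was closed
    }

    public static void main(String[] args) {
        System.out.println("stores still open: " + runAndClose(8));
    }
}
```

With the try-with-resources removed, the counter would end at 8 — the shape of the pmCache growth the screenshot in the ticket shows.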
[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026522#comment-16026522 ] Hive QA commented on HIVE-16771: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870089/HIVE-16771.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[load_dyn_part5] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5442/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5442/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5442/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870089 - PreCommit-HIVE-Build > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. 
In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269
[ https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026506#comment-16026506 ] Gunther Hagleitner commented on HIVE-16549: --- HIVE-14007 is a massive patch. This patch here doesn't seem to have any reviews or test runs or anything. Patch isn't even attached to the ticket. [~owen.omalley] why do the regular rules not apply to you? > Fix an incompatible change in PredicateLeafImpl from HIVE-15269 > --- > > Key: HIVE-16549 > URL: https://issues.apache.org/jira/browse/HIVE-16549 > Project: Hive > Issue Type: Bug > Components: storage-api >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > > HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a > configuration object. The configuration object is only used for the new > LiteralDelegates. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026487#comment-16026487 ] Naveen Gangam commented on HIVE-16771: -- I think it makes sense to have the getMetaStoreSchemaVersion() API on the {{IMetaStoreSchemaInfo}}. 1) However, it seems a bit odd to have it take a boolean to determine if the query needs quotes or not. Can the implementation detect it without a whole lot of code duplication? The impl should be able to determine the DBTYPE just as easily. 2) The connection and the statement are not closed. This will certainly cause a memory leak and potentially a connection leak to the DB. 3) Same with the need to have an active SQL connection passed in. But then is there a better means to do this? 4) Ideally, the HMS schema version should only be fetched from the DB just once. This implementation fetches it every time. Are there scenarios where the value would change after initialization that make it necessary every time? Thanks > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
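On review point 4 above, fetching once and caching afterwards is a small memoization wrapper around the lookup. A sketch of the shape, with a hypothetical fetcher standing in for the real database query — this is not the actual HiveSchemaTool code:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Sketch for review point 4: wrap the version lookup so the database is
// queried once and the cached value is returned on every later call.
public class CachedSchemaVersion {
    private final Supplier<String> fetcher; // e.g. runs the SELECT against the DB
    private String cached;

    CachedSchemaVersion(Supplier<String> fetcher) { this.fetcher = fetcher; }

    public synchronized String get() {
        if (cached == null) {
            cached = fetcher.get(); // only the first call hits the DB
        }
        return cached;
    }

    public static void main(String[] args) {
        AtomicInteger dbHits = new AtomicInteger();
        CachedSchemaVersion v = new CachedSchemaVersion(() -> {
            dbHits.incrementAndGet();   // stands in for the real JDBC query
            return "2.3.0";
        });
        v.get();
        v.get();
        System.out.println(v.get() + " fetched " + dbHits.get() + " time(s)");
    }
}
```

Point 2 (unclosed connection and statement) is the separate try-with-resources concern; caching as above also reduces how often those resources are opened at all.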
[jira] [Updated] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-16771: --- Status: Patch Available (was: Open) > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-16771: --- Attachment: HIVE-16771.01.patch > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026451#comment-16026451 ] Vihang Karajgaonkar commented on HIVE-16771: Hi [~ngangam] Can you please review? > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-16771.01.patch > > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16769) Possible hive service startup failure due to the existing file /tmp/stderr
[ https://issues.apache.org/jira/browse/HIVE-16769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026448#comment-16026448 ] Hive QA commented on HIVE-16769: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12870079/HIVE-16769.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10788 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5441/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5441/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5441/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12870079 - PreCommit-HIVE-Build > Possible hive service startup due to the existing file /tmp/stderr > -- > > Key: HIVE-16769 > URL: https://issues.apache.org/jira/browse/HIVE-16769 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-16769.1.patch > > > HIVE-12497 prints the ignoring errors from hadoop version, hbase mapredcp and > hadoop jars to /tmp/$USER/stderr. > In some cases $USER is not set, then the file becomes /tmp/stderr. 
If such > file preexists with different permission, it will cause the service startup > to fail. > I just tried the script without outputting to stderr file, I don't see such > error any more {{"ERROR StatusLogger No log4j2 configuration file found. > Using default configuration: logging only errors to the console."}}. > I think we can remove such redirect to avoid possible startup failure. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HIVE-16771) Schematool should use MetastoreSchemaInfo to get the metastore schema version from database
[ https://issues.apache.org/jira/browse/HIVE-16771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar reassigned HIVE-16771: -- > Schematool should use MetastoreSchemaInfo to get the metastore schema version > from database > --- > > Key: HIVE-16771 > URL: https://issues.apache.org/jira/browse/HIVE-16771 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > > HIVE-16723 gives the ability to have a custom MetastoreSchemaInfo > implementation to manage schema upgrades and initialization if needed. In > order to make HiveSchemaTool completely agnostic it should depend on > IMetastoreSchemaInfo implementation which is configured to get the metastore > schema version information from the database. It should also not assume the > scripts directory and hardcode it itself. It would rather ask > MetastoreSchemaInfo class to get the metastore scripts directory. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269
[ https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HIVE-16549. -- Resolution: Fixed > Fix an incompatible change in PredicateLeafImpl from HIVE-15269 > --- > > Key: HIVE-16549 > URL: https://issues.apache.org/jira/browse/HIVE-16549 > Project: Hive > Issue Type: Bug > Components: storage-api >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > > HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a > configuration object. The configuration object is only used for the new > LiteralDelegates. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16549) Fix an incompatible change in PredicateLeafImpl from HIVE-15269
[ https://issues.apache.org/jira/browse/HIVE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-16549: - Fix Version/s: 2.2.0 Component/s: storage-api This has already been fixed in HIVE-14007 for master and branch-2.3. I need a modified fix for branch-2.2. > Fix an incompatible change in PredicateLeafImpl from HIVE-15269 > --- > > Key: HIVE-16549 > URL: https://issues.apache.org/jira/browse/HIVE-16549 > Project: Hive > Issue Type: Bug > Components: storage-api >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 2.2.0 > > > HIVE-15269 added a parameter to the constructor for PredicateLeafImpl for a > configuration object. The configuration object is only used for the new > LiteralDelegates. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
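The incompatibility discussed in HIVE-16549 — a new parameter added to an existing public constructor — is typically repaired by keeping the old signature as a delegating overload, so callers compiled against the old API keep working. A hedged sketch of that shape, using an illustrative class rather than the real PredicateLeafImpl:

```java
// Sketch of restoring constructor compatibility: the old no-config
// signature is kept and delegates to the new one, so existing call
// sites keep compiling and linking.
public class LeafSketch {
    private final String column;
    private final Object config; // stands in for the configuration HIVE-15269 added

    // New constructor with the extra configuration parameter.
    public LeafSketch(String column, Object config) {
        this.column = column;
        this.config = config;
    }

    // Old constructor, retained for source and binary compatibility.
    public LeafSketch(String column) {
        this(column, null); // config is only needed by the new delegate logic
    }

    public boolean hasConfig() { return config != null; }
    public String column() { return column; }

    public static void main(String[] args) {
        LeafSketch legacy = new LeafSketch("col1");            // old call site still works
        LeafSketch modern = new LeafSketch("col1", new Object());
        System.out.println(legacy.hasConfig() + " " + modern.hasConfig()); // false true
    }
}
```

Since the configuration is only used by the new LiteralDelegates per the ticket, delegating with a null default leaves old behavior unchanged.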
[jira] [Commented] (HIVE-16767) Update people website with recent changes
[ https://issues.apache.org/jira/browse/HIVE-16767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026356#comment-16026356 ] Sergio Peña commented on HIVE-16767: Mine looks good. +1 > Update people website with recent changes > - > > Key: HIVE-16767 > URL: https://issues.apache.org/jira/browse/HIVE-16767 > Project: Hive > Issue Type: Task > Components: Documentation >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-16767.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)