[jira] [Commented] (HIVE-17822) Provide an option to skip shading of jars
[ https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208894#comment-16208894 ]

Lefty Leverenz commented on HIVE-17822:
----------------------------------------

Should this be documented in the wiki? (If so, it needs a TODOC3.0 label.)

> Provide an option to skip shading of jars
> -----------------------------------------
>
>                 Key: HIVE-17822
>                 URL: https://issues.apache.org/jira/browse/HIVE-17822
>             Project: Hive
>          Issue Type: Bug
>          Components: Build Infrastructure
>    Affects Versions: 3.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>             Fix For: 3.0.0
>
>         Attachments: HIVE-17822.1.patch
>
>
> The Maven shade plugin does not have an option to skip shading. Adding one
> under a build profile can help skip shading and reduce build times.
> Profiling the Maven build shows the druid and jdbc shade plugin executions
> to be the slowest (along with hive-exec). For devs not working on druid or
> jdbc, it would be good to have an option to skip shading via a profile. With
> this it becomes possible to get a sub-minute dev build.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
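The profile-based skip described above can be sketched roughly as follows. This is an illustrative guess at the mechanism (unbinding a shade execution by rebinding it to the `none` phase), not the actual HIVE-17822 patch; the execution id `build-exec-bundle` is hypothetical.

```xml
<!-- Hypothetical sketch, not the HIVE-17822 patch: a profile that unbinds
     a maven-shade-plugin execution so the shade goal never runs. -->
<profile>
  <id>skipShade</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <executions>
          <execution>
            <!-- must match the execution id declared in the module's pom -->
            <id>build-exec-bundle</id>
            <!-- "none" is not a real lifecycle phase, so the goal is unbound -->
            <phase>none</phase>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```

With such a profile in place, a dev build could then skip shading with something like `mvn clean install -DskipTests -PskipShade`.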
[jira] [Updated] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata
[ https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

anishek updated HIVE-17825:
---------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Patch committed to master. Thanks [~thejas] for the review.

> Socket not closed when trying to read files to copy over in replication from
> metadata
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-17825
>                 URL: https://issues.apache.org/jira/browse/HIVE-17825
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: anishek
>            Assignee: anishek
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.0.0
>
>         Attachments: HIVE-17825.0.patch
>
>
> For replication we create a _files file in HDFS which lists the source files
> to be copied over for a table/partition. _files is read in ReplCopyTask to
> determine which files should be copied. The file operations on _files are
> not handled correctly and we leave the files open, which leads to a lot of
> CLOSE_WAIT connections from HS2 on the replica cluster to the source data
> nodes.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
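The leak described above is the classic unclosed-stream pattern. A minimal, self-contained sketch of the fix idea, using plain `java.nio` instead of the actual HDFS/ReplCopyTask APIs (the `ManifestReader` class and method names are illustrative, not Hive's):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of reading a _files-style manifest. The key point is
// try-with-resources: the reader (and its underlying stream/socket) is closed
// whether or not an exception is thrown, so no CLOSE_WAIT connections pile up.
public class ManifestReader {
    public static List<String> readManifest(Path manifest) throws IOException {
        List<String> files = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(manifest)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    files.add(line); // one source file path per line
                }
            }
        } // reader closed here, even on error
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("repl_files", ".txt");
        Files.write(tmp, List.of("hdfs://src/t1/part-0", "hdfs://src1/t1/part-1"));
        System.out.println(readManifest(tmp));
        Files.deleteIfExists(tmp);
    }
}
```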
[jira] [Commented] (HIVE-17797) History of API changes for Hive Common
[ https://issues.apache.org/jira/browse/HIVE-17797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208816#comment-16208816 ]

Andrey Ponomarenko commented on HIVE-17797:
-------------------------------------------

Hi,

No, it doesn't. There is only static analysis of class/method signatures.

> History of API changes for Hive Common
> --------------------------------------
>
>                 Key: HIVE-17797
>                 URL: https://issues.apache.org/jira/browse/HIVE-17797
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Andrey Ponomarenko
>         Attachments: hive-common-1.png, hive-common-2.png
>
>
> Hi,
> I'd like to share a report on API changes and backward binary compatibility
> for the Hive Common library:
> https://abi-laboratory.pro/java/tracker/timeline/hive-common/
> The report is generated by the https://github.com/lvc/japi-tracker tool for
> jars found at http://central.maven.org/maven2/org/apache/hive/hive-common/
> according to https://wiki.eclipse.org/Evolving_Java-based_APIs_2.
> Feel free to request other Hive modules to be included in the tracker if you
> are interested.
> Also please let me know if the tool should skip some parts of the API
> (it checks all public API methods and classes by default).
> Thank you.
> !hive-common-2.png|API symbols timeline!
> !hive-common-1.png|API changes review!

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables
[ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208812#comment-16208812 ]

Hive QA commented on HIVE-12631:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892732/HIVE-12631.31.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11280 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7361/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7361/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7361/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892732 - PreCommit-HIVE-Build

> LLAP: support ORC ACID tables
> -----------------------------
>
>                 Key: HIVE-12631
>                 URL: https://issues.apache.org/jira/browse/HIVE-12631
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Teddy Choi
>         Attachments: HIVE-12631.1.patch, HIVE-12631.10.patch,
> HIVE-12631.10.patch, HIVE-12631.11.patch, HIVE-12631.11.patch,
> HIVE-12631.12.patch, HIVE-12631.13.patch, HIVE-12631.15.patch,
> HIVE-12631.16.patch, HIVE-12631.17.patch, HIVE-12631.18.patch,
> HIVE-12631.19.patch, HIVE-12631.2.patch, HIVE-12631.20.patch,
> HIVE-12631.21.patch, HIVE-12631.22.patch, HIVE-12631.23.patch,
> HIVE-12631.24.patch, HIVE-12631.25.patch, HIVE-12631.26.patch,
> HIVE-12631.27.patch, HIVE-12631.28.patch, HIVE-12631.29.patch,
> HIVE-12631.3.patch, HIVE-12631.30.patch, HIVE-12631.31.patch,
> HIVE-12631.4.patch, HIVE-12631.5.patch, HIVE-12631.6.patch,
> HIVE-12631.7.patch, HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and
> parallelization of reads and processing. This path does not support ACID.
> As far as I remember, ACID logic is embedded inside the ORC format; we need
> to refactor it to sit on top of some interface, if practical, or just port
> it to the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache
> is currently low-level (CB-level in ORC), so we could just use it to read
> bases and deltas (deltas should be cached with higher priority) and merge as
> usual. We could also cache the merged representation in the future.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
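The base-plus-delta merge mentioned in the issue description above can be illustrated with a toy sketch. Real ORC ACID merging keys rows by (originalTransaction, bucket, rowId); this sketch collapses that triple to a single long purely for illustration and is not the actual OrcRawRecordMerger logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy illustration of applying a delete delta to a cached base: any base row
// whose id appears in the delete delta is dropped from the merged result.
public class BaseDeltaMerge {
    public static List<Long> merge(List<Long> baseRowIds, Set<Long> deleteDelta) {
        List<Long> merged = new ArrayList<>();
        for (long rowId : baseRowIds) {
            if (!deleteDelta.contains(rowId)) {
                merged.add(rowId); // row survives: not deleted by any delta
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // base rows 1..4, delete delta removes 2 and 4 -> [1, 3]
        System.out.println(merge(List.of(1L, 2L, 3L, 4L), Set.of(2L, 4L)));
    }
}
```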
[jira] [Commented] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table definition to dir!
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208779#comment-16208779 ]

Hive QA commented on HIVE-12408:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892720/HIVE-12408.002.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11277 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=233)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat2] (batchId=81)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7360/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7360/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7360/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892720 - PreCommit-HIVE-Build

> SQLStdAuthorizer expects external table creator to be owner of directory,
> does not respect rwx group permission. Only one user could ever create an
> external table definition to dir!
> -------------------------------------------------------------------------
>
>                 Key: HIVE-12408
>                 URL: https://issues.apache.org/jira/browse/HIVE-12408
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization, Security, SQLStandardAuthorization
>    Affects Versions: 0.14.0
>         Environment: HDP 2.2 + Kerberos
>            Reporter: Hari Sekhon
>            Assignee: Akira Ajisaka
>            Priority: Critical
>         Attachments: HIVE-12408.001.patch, HIVE-12408.002.patch
>
>
> When trying to create an external table via beeline in Hive using the
> SQLStdAuthorizer, it expects the table creator to be the owner of the
> directory path and ignores the group rwx permission that is granted to the
> user.
> {code}Error: Error while compiling statement: FAILED:
> HiveAccessControlException Permission denied: Principal [name=hari,
> type=USER] does not have following privileges for operation CREATETABLE
> [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI,
> name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code}
> All it should be checking is read access to that directory.
> The directory owner requirement breaks the ability of more than one user to
> create external table definitions for a given location. For example, this is
> a flume landing directory with json data, and the /etl tree is owned by the
> flume user. Even chowning the tree to another user would still break access
> for other users who are able to read the directory in HDFS but would still
> be unable to create external tables on top of it.
> This looks like a remnant of the owner-only access model in SQLStdAuth and
> is a separate issue to HIVE-11864 / HIVE-12324.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
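The access rule the report argues for (read access via owner, group, or other bits, rather than requiring ownership) can be modeled with a small hypothetical check. This is not Hive's actual authorization code; the class, method, and POSIX-bit encoding are illustrative only.

```java
import java.util.Set;

// Illustrative model of a read-access check that honors group/other read
// permission instead of demanding directory ownership. perms is a POSIX
// permission word, e.g. 0750 for rwxr-x---.
public class ReadAccessCheck {
    public static boolean canRead(String user, Set<String> userGroups,
                                  String owner, String group, int perms) {
        if (user.equals(owner) && (perms & 0400) != 0) return true;        // owner r
        if (userGroups.contains(group) && (perms & 0040) != 0) return true; // group r
        return (perms & 0004) != 0;                                         // other r
    }

    public static void main(String[] args) {
        // /etl is flume:flume rwxr-x--- and hari is in the flume group:
        // group read should suffice to define an external table over it.
        System.out.println(canRead("hari", Set.of("flume"), "flume", "flume", 0750));
    }
}
```

Under this model, ownership is just one of three ways to gain read access, which is the behavior the report says SQLStdAuthorizer should have used.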
[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables
[ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Teddy Choi updated HIVE-12631:
------------------------------
    Attachment: HIVE-12631.31.patch

> LLAP: support ORC ACID tables
> -----------------------------
>
>                 Key: HIVE-12631
>                 URL: https://issues.apache.org/jira/browse/HIVE-12631
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Teddy Choi
>         Attachments: HIVE-12631.1.patch, HIVE-12631.10.patch,
> HIVE-12631.10.patch, HIVE-12631.11.patch, HIVE-12631.11.patch,
> HIVE-12631.12.patch, HIVE-12631.13.patch, HIVE-12631.15.patch,
> HIVE-12631.16.patch, HIVE-12631.17.patch, HIVE-12631.18.patch,
> HIVE-12631.19.patch, HIVE-12631.2.patch, HIVE-12631.20.patch,
> HIVE-12631.21.patch, HIVE-12631.22.patch, HIVE-12631.23.patch,
> HIVE-12631.24.patch, HIVE-12631.25.patch, HIVE-12631.26.patch,
> HIVE-12631.27.patch, HIVE-12631.28.patch, HIVE-12631.29.patch,
> HIVE-12631.3.patch, HIVE-12631.30.patch, HIVE-12631.31.patch,
> HIVE-12631.4.patch, HIVE-12631.5.patch, HIVE-12631.6.patch,
> HIVE-12631.7.patch, HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and
> parallelization of reads and processing. This path does not support ACID. As
> far as I remember ACID logic is embedded inside ORC format; we need to
> refactor it to be on top of some interface, if practical; or just port it to
> LLAP read path.
> Another consideration is how the logic will work with cache. The cache is
> currently low-level (CB-level in ORC), so we could just use it to read bases
> and deltas (deltas should be cached with higher priority) and merge as usual.
> We could also cache merged representation in future.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17434) Using "add jar " from viewFs always occurred hdfs mismatch error
[ https://issues.apache.org/jira/browse/HIVE-17434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bang Xiao updated HIVE-17434:
-----------------------------
    Status: Patch Available  (was: Open)

> Using "add jar " from viewFs always occurred hdfs mismatch error
> ----------------------------------------------------------------
>
>                 Key: HIVE-17434
>                 URL: https://issues.apache.org/jira/browse/HIVE-17434
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.1
>            Reporter: shenxianqiang
>            Assignee: Bang Xiao
>            Priority: Minor
>             Fix For: 1.2.1
>
>         Attachments: HIVE-17434-1.patch, HIVE-17434.patch
>
>
> add jar viewfs://nsX//lib/common.jar
> always occurs a filesystem mismatch error

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17434) Using "add jar " from viewFs always occurred hdfs mismatch error
[ https://issues.apache.org/jira/browse/HIVE-17434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bang Xiao updated HIVE-17434:
-----------------------------
    Status: Open  (was: Patch Available)

> Using "add jar " from viewFs always occurred hdfs mismatch error
> ----------------------------------------------------------------
>
>                 Key: HIVE-17434
>                 URL: https://issues.apache.org/jira/browse/HIVE-17434
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.1
>            Reporter: shenxianqiang
>            Assignee: Bang Xiao
>            Priority: Minor
>             Fix For: 1.2.1
>
>         Attachments: HIVE-17434-1.patch, HIVE-17434.patch
>
>
> add jar viewfs://nsX//lib/common.jar
> always occurs a filesystem mismatch error

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208740#comment-16208740 ]

Hive QA commented on HIVE-17458:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892718/HIVE-17458.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 11278 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testNonAcidToAcidVectorzied (batchId=272)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7359/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7359/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7359/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892718 - PreCommit-HIVE-Build

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---------------------------------------------------------------
>
>                 Key: HIVE-17458
>                 URL: https://issues.apache.org/jira/browse/HIVE-17458
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 2.2.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>         Attachments: HIVE-17458.01.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for 'original' files. This
> will likely look like a perf regression when converting a table from
> non-acid to acid, until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops
> will not vectorize until major compaction.
> There is no reason why this should be the case. Just like OrcRawRecordMerger,
> VectorizedOrcAcidRowBatchReader can look at the other files in the logical
> tranche/bucket and calculate the offset for the RowBatch of the split.
> (Presumably getRecordReader().getRowNumber() works the same in vector mode.)
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer
> it from the file path... which in particular simplifies
> OrcInputFormat.determineSplitStrategies()

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
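The offset calculation described in the issue above (each original file's starting row number is the sum of the row counts of the files that precede it in the logical bucket) can be sketched with a hypothetical helper; this is not the patch's actual code, and the method name is illustrative.

```java
// Hypothetical sketch of computing the starting ROW__ID row number for an
// 'original' file, given the row counts of all files in the logical bucket
// in order. File fileIndex starts where the preceding files end.
public class RowOffset {
    public static long offsetOf(long[] rowCounts, int fileIndex) {
        long offset = 0;
        for (int i = 0; i < fileIndex; i++) {
            offset += rowCounts[i]; // rows contributed by earlier files
        }
        return offset;
    }

    public static void main(String[] args) {
        long[] counts = {100, 250, 40}; // rows per original file in the bucket
        // the third file starts at row 100 + 250 = 350
        System.out.println(offsetOf(counts, 2));
    }
}
```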
[jira] [Commented] (HIVE-17473) implement workload management pools
[ https://issues.apache.org/jira/browse/HIVE-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208698#comment-16208698 ]

Hive QA commented on HIVE-17473:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892699/HIVE-17473.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11279 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204)
org.apache.hive.service.server.TestHS2HttpServer.org.apache.hive.service.server.TestHS2HttpServer (batchId=201)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7358/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7358/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7358/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892699 - PreCommit-HIVE-Build

> implement workload management pools
> -----------------------------------
>
>                 Key: HIVE-17473
>                 URL: https://issues.apache.org/jira/browse/HIVE-17473
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-17473.01.patch, HIVE-17473.03.patch,
> HIVE-17473.patch
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17827) refactor TezTask file management
[ https://issues.apache.org/jira/browse/HIVE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-17827:
------------------------------------
    Description: 
There are about 5 different duplicate and intersecting structures used in TezTask to manage the file list (additionalLr, inputOutputX xN (array and list and/or map?), resourceMap; plus a list in the actual session object); multiple methods named addExtraResourcesBlah, localizeResourcesBlah; and at least 2 places where TezClient is invoked to add resources to the AM, on the same path.
All this mess needs to be changed to have exactly one place that figures out what needs to be localized, and exactly one place (in TezSessionState, so we don't duplicate state and TezClient usage inside and outside of it) where we look at what's already there and localize the things that are not.
Meanwhile, signatures can also be cleaned up; e.g. since branch-1, jobConf is passed to updateSession and ignored, etc.

  was:
There are about 5 different duplicate and intersecting structures used in TezTask to manage the file list (additionalLr, inputOutputX xN (array and list and/or map?), resourceMap; plus a list in the actual session object); multiple methods named addExtraResourcesBlah, localizeResourcesBlah; at least 2 places where TezClient is invoked to add resources to AM, on the same path.
All this mess needs to be changed to have exactly one place that figures out what needs to be localized, and exactly one place (in TezSessionState so we don't duplicate state and TezClient usage inside and outside of it) where we look at what's already there and localize things that are not.

> refactor TezTask file management
> --------------------------------
>
>                 Key: HIVE-17827
>                 URL: https://issues.apache.org/jira/browse/HIVE-17827
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>
> There are about 5 different duplicate and intersecting structures used in
> TezTask to manage the file list (additionalLr, inputOutputX xN (array and
> list and/or map?), resourceMap; plus a list in the actual session object);
> multiple methods named addExtraResourcesBlah, localizeResourcesBlah; and at
> least 2 places where TezClient is invoked to add resources to the AM, on the
> same path.
> All this mess needs to be changed to have exactly one place that figures out
> what needs to be localized, and exactly one place (in TezSessionState, so we
> don't duplicate state and TezClient usage inside and outside of it) where we
> look at what's already there and localize the things that are not.
> Meanwhile, signatures can also be cleaned up; e.g. since branch-1, jobConf is
> passed to updateSession and ignored, etc.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17827) refactor TezTask file management
[ https://issues.apache.org/jira/browse/HIVE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-17827:
------------------------------------
    Description: 
There are about 5 different duplicate and intersecting structures used in TezTask so to manage the file list (additionalLr, inputOutputX xN (array and list and/or map?), resourceMap; plus a list in the actual session object); multiple methods named addExtraResourcesBlah, localizeResourcesBlah; at least 2 places where TezClient is invoked to add resources to AM, on the same path.
All this mess needs to be changed to have exactly one place that figures out what needs to be localized, and exactly one place (in TezSessionState so we don't duplicate state and TezClient usage inside and outside of it) where we look at what's already there and localize things that are not.

  was:
There are about 5 different duplicate and intersecting structures used in TezTask so to manage the file list (additionalLr, inputOutputX x2 (array and list and/or map?), resourceMap; plus a list in the actual session object; multiple methods called like addExtraResourcesBlah, localizeBlah; at least 2 places where TezClient is invoked to add resources to AM, on the same path.
All this mess needs to be changed to have exactly one place that figures out what needs to be localized, and exactly one place (in TezSessionState so we don't duplicate state and TezClient usage inside and outside of it) where we look at what's already there and localize things that are not.

> refactor TezTask file management
> --------------------------------
>
>                 Key: HIVE-17827
>                 URL: https://issues.apache.org/jira/browse/HIVE-17827
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>
> There are about 5 different duplicate and intersecting structures used in
> TezTask so to manage the file list (additionalLr, inputOutputX xN (array and
> list and/or map?), resourceMap; plus a list in the actual session object);
> multiple methods named addExtraResourcesBlah, localizeResourcesBlah; at least
> 2 places where TezClient is invoked to add resources to AM, on the same path.
> All this mess needs to be changed to have exactly one place that figures out
> what needs to be localized, and exactly one place (in TezSessionState so we
> don't duplicate state and TezClient usage inside and outside of it) where we
> look at what's already there and localize things that are not.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17827) refactor TezTask file management
[ https://issues.apache.org/jira/browse/HIVE-17827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-17827:
------------------------------------
    Description: 
There are about 5 different duplicate and intersecting structures used in TezTask to manage the file list (additionalLr, inputOutputX xN (array and list and/or map?), resourceMap; plus a list in the actual session object); multiple methods named addExtraResourcesBlah, localizeResourcesBlah; at least 2 places where TezClient is invoked to add resources to AM, on the same path.
All this mess needs to be changed to have exactly one place that figures out what needs to be localized, and exactly one place (in TezSessionState so we don't duplicate state and TezClient usage inside and outside of it) where we look at what's already there and localize things that are not.

  was:
There are about 5 different duplicate and intersecting structures used in TezTask so to manage the file list (additionalLr, inputOutputX xN (array and list and/or map?), resourceMap; plus a list in the actual session object); multiple methods named addExtraResourcesBlah, localizeResourcesBlah; at least 2 places where TezClient is invoked to add resources to AM, on the same path.
All this mess needs to be changed to have exactly one place that figures out what needs to be localized, and exactly one place (in TezSessionState so we don't duplicate state and TezClient usage inside and outside of it) where we look at what's already there and localize things that are not.

> refactor TezTask file management
> --------------------------------
>
>                 Key: HIVE-17827
>                 URL: https://issues.apache.org/jira/browse/HIVE-17827
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>
> There are about 5 different duplicate and intersecting structures used in
> TezTask to manage the file list (additionalLr, inputOutputX xN (array and
> list and/or map?), resourceMap; plus a list in the actual session object);
> multiple methods named addExtraResourcesBlah, localizeResourcesBlah; at least
> 2 places where TezClient is invoked to add resources to AM, on the same path.
> All this mess needs to be changed to have exactly one place that figures out
> what needs to be localized, and exactly one place (in TezSessionState so we
> don't duplicate state and TezClient usage inside and outside of it) where we
> look at what's already there and localize things that are not.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table definition to dir!
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka updated HIVE-12408:
---------------------------------
    Attachment: HIVE-12408.002.patch

Thanks [~thejas] for the comment! Agreed to make it more consistent. Updated the patch.

> SQLStdAuthorizer expects external table creator to be owner of directory,
> does not respect rwx group permission. Only one user could ever create an
> external table definition to dir!
> -------------------------------------------------------------------------
>
>                 Key: HIVE-12408
>                 URL: https://issues.apache.org/jira/browse/HIVE-12408
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization, Security, SQLStandardAuthorization
>    Affects Versions: 0.14.0
>         Environment: HDP 2.2 + Kerberos
>            Reporter: Hari Sekhon
>            Assignee: Akira Ajisaka
>            Priority: Critical
>         Attachments: HIVE-12408.001.patch, HIVE-12408.002.patch
>
>
> When trying to create an external table via beeline in Hive using the
> SQLStdAuthorizer, it expects the table creator to be the owner of the
> directory path and ignores the group rwx permission that is granted to the
> user.
> {code}Error: Error while compiling statement: FAILED:
> HiveAccessControlException Permission denied: Principal [name=hari,
> type=USER] does not have following privileges for operation CREATETABLE
> [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI,
> name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code}
> All it should be checking is read access to that directory.
> The directory owner requirement breaks the ability of more than one user to
> create external table definitions for a given location. For example, this is
> a flume landing directory with json data, and the /etl tree is owned by the
> flume user. Even chowning the tree to another user would still break access
> for other users who are able to read the directory in HDFS but would still
> be unable to create external tables on top of it.
> This looks like a remnant of the owner-only access model in SQLStdAuth and
> is a separate issue to HIVE-11864 / HIVE-12324.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458: -- Status: Patch Available (was: Open) > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
[ https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17458: -- Attachment: HIVE-17458.01.patch 01.patch is a prototype to enable _VectorizedOrcAcidRowBatchReader_ for "select" queries reading "original" files w/o any deletes > VectorizedOrcAcidRowBatchReader doesn't handle 'original' files > --- > > Key: HIVE-17458 > URL: https://issues.apache.org/jira/browse/HIVE-17458 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-17458.01.patch > > > VectorizedOrcAcidRowBatchReader will not be used for original files. This > will likely look like a perf regression when converting a table from non-acid > to acid until it runs through a major compaction. > With Load Data support, if large files are added via Load Data, the read ops > will not vectorize until major compaction. > There is no reason why this should be the case. Just like > OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other > files in the logical tranche/bucket and calculate the offset for the RowBatch > of the split. (Presumably getRecordReader().getRowNumber() works the same in > vector mode). > In this case we don't even need OrcSplit.isOriginal() - the reader can infer > it from file path... which in particular simplifies > OrcInputFormat.determineSplitStrategies() -- This message was sent by Atlassian JIRA (v6.4.14#64029)
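The offset computation the description refers to can be sketched as follows. This is an illustrative stand-in, not Hive's actual code: the file names, the map-based API, and the method names are all hypothetical. The point is only that a file's starting row number is the sum of the row counts of the files that precede it in the same logical bucket.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical sketch: within one logical bucket, "original" files have an
// implied order, and the synthetic row offset of a file's first row is the
// sum of the row counts of all earlier files in that bucket.
public class OriginalFileOffsets {

    /** files: insertion-ordered (fileName -> rowCount) for one bucket. */
    public static long offsetOf(LinkedHashMap<String, Long> files, String target) {
        long offset = 0;
        for (Map.Entry<String, Long> e : files.entrySet()) {
            if (e.getKey().equals(target)) {
                return offset; // rows of earlier files come before ours
            }
            offset += e.getValue();
        }
        throw new NoSuchElementException("no such file: " + target);
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Long> bucket = new LinkedHashMap<>();
        bucket.put("000000_0", 100L);
        bucket.put("000000_0_copy_1", 250L);
        bucket.put("000000_0_copy_2", 40L);
        // The third file starts at row 100 + 250 = 350 within the bucket.
        System.out.println(offsetOf(bucket, "000000_0_copy_2")); // prints 350
    }
}
```

Given such per-bucket offsets, a vectorized reader could assign consistent synthetic row ids to each RowBatch without consulting OrcSplit.isOriginal().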
[jira] [Commented] (HIVE-17425) Change MetastoreConf.ConfVars internal members to be private
[ https://issues.apache.org/jira/browse/HIVE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208670#comment-16208670 ] Alexander Kolbasov commented on HIVE-17425: --- [~alangates] Can you add a link to the reviewboard or pull request here? > Change MetastoreConf.ConfVars internal members to be private > > > Key: HIVE-17425 > URL: https://issues.apache.org/jira/browse/HIVE-17425 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17425.2.patch, HIVE-17425.patch > > > MetastoreConf's dual use of metastore keys and Hive keys is causing confusion > for developers. We should make the relevant members private and provide > getter methods with comments on when it is appropriate to use them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17802) Remove unnecessary calls to FileSystem.setOwner() from FileOutputCommitterContainer
[ https://issues.apache.org/jira/browse/HIVE-17802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208650#comment-16208650 ] Hive QA commented on HIVE-17802: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892701/HIVE-17802.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 10882 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=47) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=101) org.apache.hadoop.hive.cli.TestNegativeCliDriver.org.apache.hadoop.hive.cli.TestNegativeCliDriver (batchId=91) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229) 
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7357/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7357/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7357/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892701 - PreCommit-HIVE-Build > Remove unnecessary calls to FileSystem.setOwner() from > FileOutputCommitterContainer > --- > > Key: HIVE-17802 > URL: https://issues.apache.org/jira/browse/HIVE-17802 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17802.1.patch, HIVE-17802.2.patch > > > For large Pig/HCat queries that produce a large number of > partitions/directories/files, we have seen cases where the HDFS NameNode > groaned under the weight of {{FileSystem.setOwner()}} calls, originating from > the commit-step. This was the result of the following code in > FileOutputCommitterContainer: > {code:java} > private void applyGroupAndPerms(FileSystem fs, Path dir, FsPermission > permission, > List acls, String group, boolean recursive) > throws IOException { > ... 
> if (recursive) { > for (FileStatus fileStatus : fs.listStatus(dir)) { > if (fileStatus.isDir()) { > applyGroupAndPerms(fs, fileStatus.getPath(), permission, acls, > group, true); > } else { > fs.setPermission(fileStatus.getPath(), permission); > chown(fs, fileStatus.getPath(), group); > } > } > } > } > private void chown(FileSystem fs, Path file, String group) throws > IOException { > try { > fs.setOwner(file, null, group); > } catch (AccessControlException ignore) { > // Some users have wrong table group, ignore it. > LOG.warn("Failed to change group of partition directories/files: " + > file, ignore); > } > } > {code} > One call per file/directory is far too many. We have a patch that reduces the > namenode pressure. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
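The scale problem in the snippet above is easy to quantify: the recursion issues one setPermission() plus one setOwner() call per file and per directory. The arithmetic below is an illustrative sketch, not the actual HIVE-17802 patch; the directory-only variant leans on the fact that HDFS gives a new file the group of its parent directory, so group ownership can often be fixed per directory rather than per file.

```java
// Illustrative call-count arithmetic, not the actual HIVE-17802 patch.
// Each count stands in for a NameNode RPC.
public class ChownCallCounts {

    /** Current behavior: setPermission + setOwner on every file and every dir. */
    static long perFileCalls(long dirs, long filesPerDir) {
        return 2 * dirs + 2 * dirs * filesPerDir;
    }

    /** Directory-only variant: two calls per directory, files inherit group. */
    static long perDirCalls(long dirs) {
        return 2 * dirs;
    }

    public static void main(String[] args) {
        long dirs = 10_000, filesPerDir = 50;
        System.out.println("per-file: " + perFileCalls(dirs, filesPerDir)); // 1020000
        System.out.println("per-dir:  " + perDirCalls(dirs));               // 20000
    }
}
```

For a commit touching 10,000 partitions of 50 files each, that is roughly a 50x reduction in NameNode traffic.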
[jira] [Commented] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed
[ https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208636#comment-16208636 ] Andrew Sherman commented on HIVE-17826: --- [HIVE-17128] prevented file descriptor leaks by closing the log4j Appender used for Operation Logs. Sometimes logging is attempted even after the Operation is closed. The current appender, a RandomAccessFileAppender, will continue to try to write to its log file even after it has been stopped. To fix this, create a new class, HushableMutableRandomAccessAppender, based on log4j's RandomAccessFileAppender. https://github.com/apache/logging-log4j2/blob/master/log4j-core/src/main/java/org/apache/logging/log4j/core/appender/RandomAccessFileAppender.java Unfortunately that class is final and hard to extend by delegation, so the code is copied here. The only substantive change is that a HushableMutableRandomAccessAppender will no longer append after it has been stopped. {noformat} if (isStopped()) { // Don't try to log anything when appender is stopped return; } {noformat} Make log4j OperationLogging use the HushableMutableRandomAccessAppender. Add a test that writes to the appender after it has been stopped. Add a minimal LogEvent implementation for testing. > Error writing to RandomAccessFile after operation log is closed > --- > > Key: HIVE-17826 > URL: https://issues.apache.org/jira/browse/HIVE-17826 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > > We are seeing the error from HS2 process stdout. 
> {noformat} > 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > for appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing > Appender query-file-appender > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error > writing to RandomAccessFile > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114) > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103) > at > org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136) > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105) > at > org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89) > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:112) > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > 
org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:390) > at > org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:378) > at > org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:362) > at > org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79) > at > org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:385) > at > org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:103) > at > org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:43) > at > org.apache.logging.log4j.core.async.RingBufferLogEventHandler.
[jira] [Assigned] (HIVE-17826) Error writing to RandomAccessFile after operation log is closed
[ https://issues.apache.org/jira/browse/HIVE-17826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Sherman reassigned HIVE-17826: - > Error writing to RandomAccessFile after operation log is closed > --- > > Key: HIVE-17826 > URL: https://issues.apache.org/jira/browse/HIVE-17826 > Project: Hive > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Andrew Sherman > > We are seeing the error from HS2 process stdout. > {noformat} > 2017-09-07 10:17:23,933 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,934 AsyncLogger-1 ERROR Attempted to append to > non-started appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR Unable to write to stream > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > for appender query-file-appender > 2017-09-07 10:17:23,935 AsyncLogger-1 ERROR An exception occurred processing > Appender query-file-appender > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error > writing to RandomAccessFile > /var/log/hive/operation_logs/dd38df5b-3c09-48c9-ad64-a2eee093bea6/hive_20170907101723_1a6ad4b9-f662-4e7a-a495-06e3341308f9 > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:114) > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.write(RandomAccessFileManager.java:103) > at > org.apache.logging.log4j.core.appender.OutputStreamManager.write(OutputStreamManager.java:136) > at > org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:105) > at > org.apache.logging.log4j.core.appender.RandomAccessFileAppender.append(RandomAccessFileAppender.java:89) > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > 
org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:112) > at > org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:152) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:125) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:116) > at > org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84) > at > org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:390) > at > org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:378) > at > org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:362) > at > org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79) > at > org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:385) > at > org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:103) > at > org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:43) > at > org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:28) > at > com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:129) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.IOException: Stream Closed > at 
java.io.RandomAccessFile.writeBytes(Native Method) > at java.io.RandomAccessFile.write(RandomAccessFile.java:525) > at > org.apache.logging.log4j.core.appender.RandomAccessFileManager.flush(RandomAccessFileManager.java:111) > ... 25 more > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
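The guard described in the comment above — a stopped appender silently refusing further writes — can be illustrated with a minimal stand-in. This is plain Java, not the real log4j class; it only demonstrates the isStopped() idea.

```java
// Minimal stand-in for the proposed appender behavior: once stop() has been
// called, append() becomes a no-op instead of writing to a closed file.
public class HushableAppender {
    private volatile boolean stopped = false;
    private final StringBuilder sink = new StringBuilder();

    public void append(String event) {
        if (stopped) {
            return; // don't try to log anything when the appender is stopped
        }
        sink.append(event).append('\n');
    }

    public void stop() {
        stopped = true;
    }

    public String contents() {
        return sink.toString();
    }

    public static void main(String[] args) {
        HushableAppender a = new HushableAppender();
        a.append("query started");
        a.stop();
        a.append("late event");       // dropped silently, no exception
        System.out.print(a.contents()); // prints only "query started"
    }
}
```

The late append neither throws nor reaches the sink, which is exactly what eliminates the "Stream Closed" errors in the stack trace above.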
[jira] [Commented] (HIVE-17778) Add support for custom counters in trigger expression
[ https://issues.apache.org/jira/browse/HIVE-17778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208596#comment-16208596 ] Hive QA commented on HIVE-17778: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892685/HIVE-17778.3.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 50 failed/errored test(s), 11285 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=241) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_move] (batchId=244) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_merge_only] (batchId=244) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions_move_only] (batchId=244) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[dynpart_merge] (batchId=35) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=27) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part14] (batchId=86) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge3] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition2] (batchId=16) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_dynamic_partition3] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_move_tasks_share_dependencies] (batchId=53) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nullformatdir] (batchId=5) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge10] (batchId=62) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge1] (batchId=19) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_diff_fs] (batchId=1) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_transform] (batchId=74) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge10] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge1] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_merge_diff_fs] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[schemeAuthority2] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[schemeAuthority] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multi_insert] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_insert_overwrite_local_directory_1] (batchId=149) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge1] (batchId=171) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_diff_fs] (batchId=171) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[schemeAuthority2] (batchId=174) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[schemeAuthority] (batchId=172) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_uri_insert_local] (batchId=92) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_insert] (batchId=117) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[multi_insert_move_tasks_share_dependencies] (batchId=126) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_transform] (batchId=136) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.exec.TestFileSinkOperator.testDeleteDynamicPartitioning (batchId=274) org.apache.hadoop.hive.ql.exec.TestFileSinkOperator.testInsertDynamicPartitioning (batchId=274) org.apache.hadoop.hive.ql.exec.TestFileSinkOperator.testNonAcidDynamicPartitioning (batchId=274) org.
[jira] [Commented] (HIVE-17230) Timestamp format different in HiveCLI and Beeline
[ https://issues.apache.org/jira/browse/HIVE-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208572#comment-16208572 ] Aihua Xu commented on HIVE-17230: - Thanks [~kgyrtkirk] for looking into it. Seems like the standard is more about how to store the timestamp data http://troels.arvin.dk/db/rdbms/#data_types-date_and_time-timestamp, not about how to display the timestamp. So either will be fine if the result is correct. Hive follows more with Oracle and Postgres, but Oracle shows the trailing 0 and postgres removes the trailing 0. [~jdere] Do you have opinion on this? > Timestamp format different in HiveCLI and Beeline > - > > Key: HIVE-17230 > URL: https://issues.apache.org/jira/browse/HIVE-17230 > Project: Hive > Issue Type: Bug > Components: Beeline, CLI >Reporter: Peter Vary >Assignee: Marta Kuczora > Attachments: HIVE-17230.1.patch, HIVE-17230.2.patch, > HIVE-17230.3.patch > > > The issue can be reproduced with the following commands: > {code} > create table timestamp_test(t timestamp); > insert into table timestamp_test values('2000-01-01 01:00:00'); > select * from timestamp_test; > {code} > The timestamp is displayed without nanoseconds in HiveCLI: > {code} > 2000-01-01 01:00:00 > {code} > When the exact same timestamp is displayed in BeeLine it displays: > {code} > 2000-01-01 01:00:00.0 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
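The trailing zero that Beeline shows comes straight from the JDK: java.sql.Timestamp.toString() always renders at least one fractional digit, so a client that prints the Timestamp object directly will show the ".0", while a client that formats the value itself may omit it. A minimal demonstration:

```java
import java.sql.Timestamp;

// Shows the JDK behavior behind the HiveCLI/Beeline difference:
// Timestamp.toString() keeps at least one fractional-seconds digit.
public class TimestampFormat {
    public static void main(String[] args) {
        Timestamp t = Timestamp.valueOf("2000-01-01 01:00:00");
        System.out.println(t); // prints 2000-01-01 01:00:00.0
    }
}
```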
[jira] [Updated] (HIVE-17802) Remove unnecessary calls to FileSystem.setOwner() from FileOutputCommitterContainer
[ https://issues.apache.org/jira/browse/HIVE-17802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17802: Attachment: HIVE-17802.2.patch > Remove unnecessary calls to FileSystem.setOwner() from > FileOutputCommitterContainer > --- > > Key: HIVE-17802 > URL: https://issues.apache.org/jira/browse/HIVE-17802 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17802.1.patch, HIVE-17802.2.patch > > > For large Pig/HCat queries that produce a large number of > partitions/directories/files, we have seen cases where the HDFS NameNode > groaned under the weight of {{FileSystem.setOwner()}} calls, originating from > the commit-step. This was the result of the following code in > FileOutputCommitterContainer: > {code:java} > private void applyGroupAndPerms(FileSystem fs, Path dir, FsPermission > permission, > List acls, String group, boolean recursive) > throws IOException { > ... > if (recursive) { > for (FileStatus fileStatus : fs.listStatus(dir)) { > if (fileStatus.isDir()) { > applyGroupAndPerms(fs, fileStatus.getPath(), permission, acls, > group, true); > } else { > fs.setPermission(fileStatus.getPath(), permission); > chown(fs, fileStatus.getPath(), group); > } > } > } > } > private void chown(FileSystem fs, Path file, String group) throws > IOException { > try { > fs.setOwner(file, null, group); > } catch (AccessControlException ignore) { > // Some users have wrong table group, ignore it. > LOG.warn("Failed to change group of partition directories/files: " + > file, ignore); > } > } > {code} > One call per file/directory is far too many. We have a patch that reduces the > namenode pressure. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17473) implement workload management pools
[ https://issues.apache.org/jira/browse/HIVE-17473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17473: Attachment: HIVE-17473.03.patch Rebased the patch > implement workload management pools > --- > > Key: HIVE-17473 > URL: https://issues.apache.org/jira/browse/HIVE-17473 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17473.01.patch, HIVE-17473.03.patch, > HIVE-17473.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17431) change configuration handling in TezSessionState
[ https://issues.apache.org/jira/browse/HIVE-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208535#comment-16208535 ] Siddharth Seth commented on HIVE-17431: --- So, new sessions use new configs (and there are enough checks in place to reset these sessions / launch new ones if sufficient context changes), and when there's not sufficient context change, then local resources are added for the DAG? Makes sense > change configuration handling in TezSessionState > > > Key: HIVE-17431 > URL: https://issues.apache.org/jira/browse/HIVE-17431 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17431.patch > > > The configuration is only set when opening the session; that seems > unnecessary - it could be set in the ctor and made final. E.g. when updating > the session and localizing new resources we may theoretically open the > session with a new config, but we don't update the config and only update the > files if the session is already open, which seems to imply that it's ok to > not update the config. > In most cases, the session is opened only once or reopened without intending > to change the config (e.g. if it times out). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-12408) SQLStdAuthorizer expects external table creator to be owner of directory, does not respect rwx group permission. Only one user could ever create an external table definition to dir!
[ https://issues.apache.org/jira/browse/HIVE-12408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208510#comment-16208510 ] Thejas M Nair commented on HIVE-12408: -- Should all use of OWNER_INS_SEL_DEL_NOGRANT_AR be changed to not ask for ownership? That would be more consistent. > SQLStdAuthorizer expects external table creator to be owner of directory, > does not respect rwx group permission. Only one user could ever create an > external table definition to dir! > - > > Key: HIVE-12408 > URL: https://issues.apache.org/jira/browse/HIVE-12408 > Project: Hive > Issue Type: Bug > Components: Authorization, Security, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Assignee: Akira Ajisaka >Priority: Critical > Attachments: HIVE-12408.001.patch > > > When trying to create an external table via beeline in Hive using the > SQLStdAuthorizer it expects the table creator to be the owner of the > directory path and ignores the group rwx permission that is granted to the > user. > {code}Error: Error while compiling statement: FAILED: > HiveAccessControlException Permission denied: Principal [name=hari, > type=USER] does not have following privileges for operation CREATETABLE > [[INSERT, DELETE, OBJECT OWNERSHIP] on Object [type=DFS_URI, > name=/etl/path/to/hdfs/dir]] (state=42000,code=4){code} > All it should be checking is read access to that directory. > The directory owner requirement breaks the ability of more than one user to > create external table definitions to a given location. For example this is a > flume landing directory with json data, and the /etl tree is owned by the > flume user. Even chowning the tree to another user would still break access > to other users who are able to read the directory in hdfs but would still be > unable to create external tables on top of it. 
> This looks like a remnant of the owner only access model in SQLStdAuth and is > a separate issue to HIVE-11864 / HIVE-12324. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params
[ https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208507#comment-16208507 ] Thejas M Nair commented on HIVE-8937: - +1 > fix description of hive.security.authorization.sqlstd.confwhitelist.* params > > > Key: HIVE-8937 > URL: https://issues.apache.org/jira/browse/HIVE-8937 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 0.14.0 >Reporter: Thejas M Nair >Assignee: Akira Ajisaka > Attachments: HIVE-8937.001.patch, HIVE-8937.002.patch > > > hive.security.authorization.sqlstd.confwhitelist.* param description in > HiveConf is incorrect. The expected value is a regex, not comma separated > regexes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17792) Enable Bucket Map Join when there are extra keys other than bucketed columns
[ https://issues.apache.org/jira/browse/HIVE-17792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17792: -- Component/s: Tez Query Planning > Enable Bucket Map Join when there are extra keys other than bucketed columns > > > Key: HIVE-17792 > URL: https://issues.apache.org/jira/browse/HIVE-17792 > Project: Hive > Issue Type: Bug > Components: Query Planning, Tez >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Fix For: 3.0.0 > > Attachments: HIVE-17792.1.patch, HIVE-17792.2.patch, > HIVE-17792.3.patch, HIVE-17792.4.patch, HIVE-17792.5.patch > > > Currently this won't go through Bucket Map Join (BMJ) > CREATE TABLE tab_part (key int, value string) PARTITIONED BY(ds STRING) > CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE; > CREATE TABLE tab(key int, value string) PARTITIONED BY(ds STRING) STORED AS > TEXTFILE; > select a.key, a.value, b.value > from tab a join tab_part b on a.key = b.key and a.value = b.value; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17792) Enable Bucket Map Join when there are extra keys other than bucketed columns
[ https://issues.apache.org/jira/browse/HIVE-17792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17792: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master > Enable Bucket Map Join when there are extra keys other than bucketed columns > > > Key: HIVE-17792 > URL: https://issues.apache.org/jira/browse/HIVE-17792 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Fix For: 3.0.0 > > Attachments: HIVE-17792.1.patch, HIVE-17792.2.patch, > HIVE-17792.3.patch, HIVE-17792.4.patch, HIVE-17792.5.patch > > > Currently this won't go through Bucket Map Join (BMJ) > CREATE TABLE tab_part (key int, value string) PARTITIONED BY(ds STRING) > CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE; > CREATE TABLE tab(key int, value string) PARTITIONED BY(ds STRING) STORED AS > TEXTFILE; > select a.key, a.value, b.value > from tab a join tab_part b on a.key = b.key and a.value = b.value; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17804) Vectorization: Bug erroneously causes match for 1st row in batch (SelectStringColLikeStringScalar)
[ https://issues.apache.org/jira/browse/HIVE-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17804: Resolution: Fixed Status: Resolved (was: Patch Available) > Vectorization: Bug erroneously causes match for 1st row in batch > (SelectStringColLikeStringScalar) > -- > > Key: HIVE-17804 > URL: https://issues.apache.org/jira/browse/HIVE-17804 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17804.01.patch, HIVE-17804.02.patch > > > Code setting output value to LongColumnVector.NULL_VALUE for null candidate > sets the 0th entry instead of the i'th. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17765) expose Hive keywords
[ https://issues.apache.org/jira/browse/HIVE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208437#comment-16208437 ] Thejas M Nair commented on HIVE-17765: -- * In TestJdbcDriver2.java, can you change the single-line for loop to the following to make it more readable? {code} .. boolean found = false; for(String keyword : keywords) { if( "limit".equals(keyword) ){ found = true; } } ... {code} * service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java . Can you make that an unmodifiable set? (java.util.Collections.unmodifiableSet). * You can also initialize the above in the constructor using "new HashSet<> (Arrays.asList("ABSOLUTE", ... ) " > expose Hive keywords > - > > Key: HIVE-17765 > URL: https://issues.apache.org/jira/browse/HIVE-17765 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Thejas M Nair > Attachments: HIVE-17765.01.patch, HIVE-17765.02.patch, > HIVE-17765.nogen.patch, HIVE-17765.patch > > > This could be useful e.g. for BI tools (via ODBC/JDBC drivers) to decide on > SQL capabilities of Hive -- This message was sent by Atlassian JIRA (v6.4.14#64029)
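The review suggestions above (a readable multi-line loop, an unmodifiable keyword set, constructor initialization) can be combined into one minimal sketch. The class name, keyword values, and lookup method here are illustrative only, not the actual TestJdbcDriver2 or HiveSessionImpl code:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of the three review suggestions; all names are illustrative.
public class KeywordSetSketch {
    // Built once in the constructor and wrapped so callers cannot
    // mutate it (java.util.Collections.unmodifiableSet).
    private final Set<String> keywords;

    public KeywordSetSketch() {
        this.keywords = Collections.unmodifiableSet(
                new HashSet<>(Arrays.asList("ABSOLUTE", "LIMIT", "SELECT")));
    }

    // The single-line loop expanded into the readable form suggested.
    public boolean contains(String word) {
        boolean found = false;
        for (String keyword : keywords) {
            if (keyword.equalsIgnoreCase(word)) {
                found = true;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        KeywordSetSketch s = new KeywordSetSketch();
        System.out.println(s.contains("limit"));
    }
}
```

Any attempt to call `keywords.add(...)` on the wrapped set would throw UnsupportedOperationException, which is the point of the unmodifiable wrapper.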
[jira] [Commented] (HIVE-17804) Vectorization: Bug erroneously causes match for 1st row in batch (SelectStringColLikeStringScalar)
[ https://issues.apache.org/jira/browse/HIVE-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208436#comment-16208436 ] Matt McCline commented on HIVE-17804: - Test failures are unrelated. > Vectorization: Bug erroneously causes match for 1st row in batch > (SelectStringColLikeStringScalar) > -- > > Key: HIVE-17804 > URL: https://issues.apache.org/jira/browse/HIVE-17804 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17804.01.patch, HIVE-17804.02.patch > > > Code setting output value to LongColumnVector.NULL_VALUE for null candidate > sets the 0th entry instead of the i'th. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
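The off-by-index pattern described in the issue can be shown with a simplified sketch. A plain long[] stands in for the Hive column vector here; this is not the actual SelectStringColLikeStringScalar code:

```java
// Simplified illustration of the bug pattern: writing the null marker to
// entry 0 instead of entry i, so only the first row of the batch is marked.
public class NullEntrySketch {
    static final long NULL_VALUE = 1L; // stand-in for LongColumnVector.NULL_VALUE

    // Buggy form: row i is the null candidate, but entry 0 is overwritten.
    static void markNullBuggy(long[] outputVector, int i) {
        outputVector[0] = NULL_VALUE;
    }

    // Fixed form: the i'th entry is written for the i'th row.
    static void markNullFixed(long[] outputVector, int i) {
        outputVector[i] = NULL_VALUE;
    }

    public static void main(String[] args) {
        long[] buggy = new long[4];
        long[] fixed = new long[4];
        markNullBuggy(buggy, 2);
        markNullFixed(fixed, 2);
        // The buggy version leaves row 2 untouched; the fix marks it.
        System.out.println(buggy[2] + " " + fixed[2]);
    }
}
```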
[jira] [Commented] (HIVE-17425) Change MetastoreConf.ConfVars internal members to be private
[ https://issues.apache.org/jira/browse/HIVE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208435#comment-16208435 ] Hive QA commented on HIVE-17425: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892656/HIVE-17425.2.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11276 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7355/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/7355/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7355/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892656 - PreCommit-HIVE-Build > Change MetastoreConf.ConfVars internal members to be private > > > Key: HIVE-17425 > URL: https://issues.apache.org/jira/browse/HIVE-17425 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17425.2.patch, HIVE-17425.patch > > > MetastoreConf's dual use of metastore keys and Hive keys is causing confusion > for developers. We should make the relevant members private and provide > getter methods with comments on when it is appropriate to use them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
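The private-members-plus-getters proposal in the description might look roughly like this sketch; the field names, getter comments, and the example key are hypothetical, not MetastoreConf's actual API:

```java
// Rough sketch of private members with documented getters; all names
// here are hypothetical, not the real MetastoreConf.ConfVars API.
public class ConfVarSketch {
    private final String metastoreKey;
    private final String hiveKey;

    public ConfVarSketch(String metastoreKey, String hiveKey) {
        this.metastoreKey = metastoreKey;
        this.hiveKey = hiveKey;
    }

    // Use for configuration read or written with metastore-style keys.
    public String getMetastoreKey() {
        return metastoreKey;
    }

    // Use only where backwards compatibility with Hive-style keys is needed.
    public String getHiveKey() {
        return hiveKey;
    }

    public static void main(String[] args) {
        ConfVarSketch v = new ConfVarSketch(
                "metastore.example.key", "hive.metastore.example.key");
        System.out.println(v.getMetastoreKey());
    }
}
```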
[jira] [Assigned] (HIVE-17765) expose Hive keywords
[ https://issues.apache.org/jira/browse/HIVE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair reassigned HIVE-17765: Assignee: Thejas M Nair (was: Sergey Shelukhin) > expose Hive keywords > - > > Key: HIVE-17765 > URL: https://issues.apache.org/jira/browse/HIVE-17765 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Thejas M Nair > Attachments: HIVE-17765.01.patch, HIVE-17765.02.patch, > HIVE-17765.nogen.patch, HIVE-17765.patch > > > This could be useful e.g. for BI tools (via ODBC/JDBC drivers) to decide on > SQL capabilities of Hive -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17778) Add support for custom counters in trigger expression
[ https://issues.apache.org/jira/browse/HIVE-17778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17778: - Attachment: HIVE-17778.3.patch > Add support for custom counters in trigger expression > - > > Key: HIVE-17778 > URL: https://issues.apache.org/jira/browse/HIVE-17778 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17778.1.patch, HIVE-17778.2.patch, > HIVE-17778.3.patch > > > HIVE-17508 only supports limited counters. This ticket is to extend it to > support custom counters (counters that are not supported by execution engine > will be dropped). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17778) Add support for custom counters in trigger expression
[ https://issues.apache.org/jira/browse/HIVE-17778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17778: - Status: Patch Available (was: Open) > Add support for custom counters in trigger expression > - > > Key: HIVE-17778 > URL: https://issues.apache.org/jira/browse/HIVE-17778 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17778.1.patch, HIVE-17778.2.patch, > HIVE-17778.3.patch > > > HIVE-17508 only supports limited counters. This ticket is to extend it to > support custom counters (counters that are not supported by execution engine > will be dropped). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17804) Vectorization: Bug erroneously causes match for 1st row in batch (SelectStringColLikeStringScalar)
[ https://issues.apache.org/jira/browse/HIVE-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208260#comment-16208260 ] Hive QA commented on HIVE-17804: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892654/HIVE-17804.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11277 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] (batchId=10) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7354/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/7354/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7354/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892654 - PreCommit-HIVE-Build > Vectorization: Bug erroneously causes match for 1st row in batch > (SelectStringColLikeStringScalar) > -- > > Key: HIVE-17804 > URL: https://issues.apache.org/jira/browse/HIVE-17804 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17804.01.patch, HIVE-17804.02.patch > > > Code setting output value to LongColumnVector.NULL_VALUE for null candidate > sets the 0th entry instead of the i'th. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17431) change configuration handling in TezSessionState
[ https://issues.apache.org/jira/browse/HIVE-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208230#comment-16208230 ] Sergey Shelukhin edited comment on HIVE-17431 at 10/17/17 7:53 PM: --- W.r.t. the logical problems (the other stuff I will address in due course) - looking at code in branch-2, before all the fancy changes, I see many paths where the user config would be applied; however, those are all for a new, or failed and reopened, session. The main pool use path goes like this: TezTask calls getSession with its unique config. That calls canWork..., that only checks doAs and queue name; then calls the private getSession/2. That again checks doAs and queue, also validates the config; however, if doAs and queue check out, it returns the session from the pool without ever using conf again. Then, the only way conf interacts with the session is if it's not open (shouldn't happen with pool sessions on main path), or to add resources. Adding resources passes the conf from TezTask around, and doesn't interact with session's own config. Only resources are changed based on the external (to the session) conf object. So, on this path the user config is never applied (beyond what's sent with the DAG, if anything). So it seems like it's inconsistent between special cases (custom queue, doAs, reopen, pool session not open - the new session will be opened with the new conf from TezTask) and the normal case where the session will not get the new config, it will just come from the pool with whatever config it had. When the files are missing it looks like they are added without reopen (first in TezTask::updateSession, and then eventually via calling addAppMasterLocalFiles in submit). Given that the normal case appears to work, I think the whole new config propagation is unnecessary (beyond doAs, queue, and file list verification). Does this make sense? 
cc [~hagleitn] [~vikram.dixit] [~sseth] I'm not sure who's more familiar with this logic :) was (Author: sershe): W.r.t. the logical problems (the other stuff I will address in due course) - looking at code in branch-2, before all the fancy changes, I see many paths where the user config would be applied; however, those are all for a new, or failed and reopened, session. The main pool use path goes like this: TezTask calls getSession with its unique config. That calls canWork..., that only checks doAs and queue name; then calls the private getSession/2. That again checks doAs and queue, also validates the config; however, if doAs and queue check out, it returns the session from the pool without ever using conf again. Then, the only way conf interacts with the session is if it's not open (shouldn't happen with pool sessions on main path), or to add resources. Adding resources passes the conf that's passed in from TezTask around, and doesn't interact with session config it seems. Only resources are changed. So, on this path the user config is never applied (beyond what's sent with the DAG, if anything). So it seems like it's inconsistent between special cases (custom queue, doAs, reopen, pool session not open - the new session will be opened with the new conf) and the normal case where the session will not get the new conf. When the files are missing it looks like they are added without reopen (first in in TezTask::updateSession, and then eventually via calling addAppMasterLocalFiles in submit). Given that the normal case appears to work, I think the whole new config propagation is unnecessary (beyond doAs, queue, and file list verification). Does this make sense? 
cc [~hagleitn] [~vikram.dixit] [~sseth] I'm not sure who's more familiar with this logic :) > change configuration handling in TezSessionState > > > Key: HIVE-17431 > URL: https://issues.apache.org/jira/browse/HIVE-17431 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17431.patch > > > The configuration is only set when opening the session; that seems > unnecessary - it could be set in the ctor and made final. E.g. when updating > the session and localizing new resources we may theoretically open the > session with a new config, but we don't update the config and only update the > files if the session is already open, which seems to imply that it's ok to > not update the config. > In most cases, the session is opened only once or reopened without intending > to change the config (e.g. if it times out). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17431) change configuration handling in TezSessionState
[ https://issues.apache.org/jira/browse/HIVE-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208230#comment-16208230 ] Sergey Shelukhin commented on HIVE-17431: - W.r.t. the logical problems (the other stuff I will address in due course) - looking at code in branch-2, before all the fancy changes, I see many paths where the user config would be applied; however, those are all for a new, or failed and reopened, session. The main pool use path goes like this: TezTask calls getSession with its unique config. That calls canWork..., that only checks doAs and queue name; then calls the private getSession/2. That again checks doAs and queue, also validates the config; however, if doAs and queue check out, it returns the session from the pool without ever using conf again. Then, the only way conf interacts with the session is if it's not open (shouldn't happen with pool sessions on main path), or to add resources. Adding resources passes the conf that's passed in from TezTask around, and doesn't interact with session config it seems. Only resources are changed. So, on this path the user config is never applied (beyond what's sent with the DAG, if anything). So it seems like it's inconsistent between special cases (custom queue, doAs, reopen, pool session not open - the new session will be opened with the new conf) and the normal case where the session will not get the new conf. When the files are missing it looks like they are added without reopen (first in TezTask::updateSession, and then eventually via calling addAppMasterLocalFiles in submit). Given that the normal case appears to work, I think the whole new config propagation is unnecessary (beyond doAs, queue, and file list verification). Does this make sense? 
cc [~hagleitn] [~vikram.dixit] [~sseth] I'm not sure who's more familiar with this logic :) > change configuration handling in TezSessionState > > > Key: HIVE-17431 > URL: https://issues.apache.org/jira/browse/HIVE-17431 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17431.patch > > > The configuration is only set when opening the session; that seems > unnecessary - it could be set in the ctor and made final. E.g. when updating > the session and localizing new resources we may theoretically open the > session with a new config, but we don't update the config and only update the > files if the session is already open, which seems to imply that it's ok to > not update the config. > In most cases, the session is opened only once or reopened without intending > to change the config (e.g. if it times out). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17812) Move remaining classes that HiveMetaStore depends on
[ https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208183#comment-16208183 ] Hive QA commented on HIVE-17812: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892650/HIVE-17812.3.patch {color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11276 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7353/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7353/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7353/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing 
org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 12 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892650 - PreCommit-HIVE-Build > Move remaining classes that HiveMetaStore depends on > - > > Key: HIVE-17812 > URL: https://issues.apache.org/jira/browse/HIVE-17812 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Labels: pull-request-available > Attachments: HIVE-17812.2.patch, HIVE-17812.3.patch, HIVE-17812.patch > > > There are several remaining pieces that need moved before we can move > HiveMetaStore itself. These include NotificationListener and > implementations, Events, AlterHandler, and a few other miscellaneous pieces. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17765) expose Hive keywords
[ https://issues.apache.org/jira/browse/HIVE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208170#comment-16208170 ] Sergey Shelukhin commented on HIVE-17765: - [~thejas] can you take a look at the updated patch? Failures are unrelated. > expose Hive keywords > - > > Key: HIVE-17765 > URL: https://issues.apache.org/jira/browse/HIVE-17765 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17765.01.patch, HIVE-17765.02.patch, > HIVE-17765.nogen.patch, HIVE-17765.patch > > > This could be useful e.g. for BI tools (via ODBC/JDBC drivers) to decide on > SQL capabilities of Hive -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17822) Provide an option to skip shading of jars
[ https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17822: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master. Thanks for the review! > Provide an option to skip shading of jars > - > > Key: HIVE-17822 > URL: https://issues.apache.org/jira/browse/HIVE-17822 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 3.0.0 > > Attachments: HIVE-17822.1.patch > > > Maven shade plugin does not have option to skip. Adding it under a profile > can help with skip shade reducing build times. > Maven build profile shows druid and jdbc shade plugin to be slowest (also > hive-exec). For devs not working on druid or jdbc, it will be good to have an > option to skip shading via a profile. With this it will be possible to get a > subminute dev build. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-10924) add support for MERGE statement
[ https://issues.apache.org/jira/browse/HIVE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman resolved HIVE-10924. --- Resolution: Fixed Fix Version/s: 2.3.0 > add support for MERGE statement > --- > > Key: HIVE-10924 > URL: https://issues.apache.org/jira/browse/HIVE-10924 > Project: Hive > Issue Type: New Feature > Components: Query Planning, Query Processor, Transactions >Affects Versions: 1.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 2.3.0 > > > add support for > https://issues.apache.org/jira/browse/HIVE-10924?jql=project%20%3D%20HIVE%20AND%20issuetype%20%3D%20%22New%20Feature%22%20AND%20text%20~%20merge# > MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15033) Ensure there is only 1 StatsTask in the query plan
[ https://issues.apache.org/jira/browse/HIVE-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-15033: -- Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-10924) > Ensure there is only 1 StatsTask in the query plan > -- > > Key: HIVE-15033 > URL: https://issues.apache.org/jira/browse/HIVE-15033 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > currently there is 1 per WHEN clause -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-15898) add Type2 SCD merge tests
[ https://issues.apache.org/jira/browse/HIVE-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-15898: -- Issue Type: Test (was: Sub-task) Parent: (was: HIVE-10924) > add Type2 SCD merge tests > - > > Key: HIVE-15898 > URL: https://issues.apache.org/jira/browse/HIVE-15898 > Project: Hive > Issue Type: Test > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-15898.01.patch, HIVE-15898.02.patch, > HIVE-15898.03.patch, HIVE-15898.04.patch, HIVE-15898.05.patch, > HIVE-15898.06.patch, HIVE-15898.07.patch, HIVE-15898.08.patch, > HIVE-15898.09.patch, HIVE-15898.10.patch, HIVE-15898.11.patch, > HIVE-15898.12.patch, HIVE-15898.13.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17061) Add Support for Column List in Insert Clause
[ https://issues.apache.org/jira/browse/HIVE-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17061: -- Issue Type: Improvement (was: Sub-task) Parent: (was: HIVE-10924) > Add Support for Column List in Insert Clause > > > Key: HIVE-17061 > URL: https://issues.apache.org/jira/browse/HIVE-17061 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Shawn Weeks >Priority: Minor > > Include support for a list of columns in the insert clause of the merge > statement. This helps when you may not know or care about the order of > columns in the target table or if you don't want to have to insert values > into all of the columns. > {code} > MERGE INTO target > USING source ON b = y > WHEN MATCHED AND c + 1 + z > 0 > THEN UPDATE SET a = 1, c = z > WHEN NOT MATCHED AND z IS NULL > THEN INSERT(a,b) VALUES(z, 7) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16722) Converting bucketed non-acid table to acid should perform validation
[ https://issues.apache.org/jira/browse/HIVE-16722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208110#comment-16208110 ] Eugene Koifman commented on HIVE-16722: --- no related failures [~alangates] could you review please > Converting bucketed non-acid table to acid should perform validation > > > Key: HIVE-16722 > URL: https://issues.apache.org/jira/browse/HIVE-16722 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-16722.01.patch, HIVE-16722.02.patch, > HIVE-16722.03.patch, HIVE-16722.WIP.patch > > > Converting a non-acid table to acid only performs metadata validation (in > _TransactionalValidationListener_). > The data read code path only understands certain directory layouts and file > names and ignores (generally) files that don't match the expected format. > In Hive, directory layout and bucket file naming (especially older releases) > is poorly enforced. > Need to add a validation step on > {noformat} > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > to > scan the file system and report any possible data loss scenarios. > Currently Acid understands bucket file names like "0_0" and (with > HIVE-16177) "0_0_copy1" etc at the root of the partition. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
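A validation step of the kind the issue describes might start from a filename check along these lines. The regex is an assumption based on the names quoted in the issue ("0_0", "0_0_copy1"), not the pattern Hive actually uses:

```java
import java.util.regex.Pattern;

// Sketch of a bucket-file-name check for the proposed validation step.
// The pattern below is an assumption, not Hive's actual naming rule.
public class BucketNameSketch {
    private static final Pattern BUCKET_FILE =
            Pattern.compile("\\d+_\\d+(_copy_?\\d+)?");

    static boolean looksLikeBucketFile(String name) {
        return BUCKET_FILE.matcher(name).matches();
    }

    public static void main(String[] args) {
        // A conforming bucket file name and a name the reader would ignore.
        System.out.println(looksLikeBucketFile("000000_0"));
        System.out.println(looksLikeBucketFile("some_random_file.txt"));
    }
}
```

Files that fail such a check are exactly the ones the acid reader would silently skip, which is the data-loss scenario the validation is meant to surface.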
[jira] [Commented] (HIVE-17823) Fix subquery Qtest of Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208061#comment-16208061 ] Vineet Garg commented on HIVE-17823: +1 > Fix subquery Qtest of Hive on Spark > --- > > Key: HIVE-17823 > URL: https://issues.apache.org/jira/browse/HIVE-17823 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Dapeng Sun >Assignee: Dapeng Sun > Attachments: HIVE-17823.001.patch > > > The JIRA is targeted to fix the Qtest files failures of HoS due to HIVE-17726 > introduced subquery fix. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17813) hive.exec.move.files.from.source.dir does not work with partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-17813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17813: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master > hive.exec.move.files.from.source.dir does not work with partitioned tables > -- > > Key: HIVE-17813 > URL: https://issues.apache.org/jira/browse/HIVE-17813 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere > Fix For: 3.0.0 > > Attachments: HIVE-17813.1.patch > > > Setting hive.exec.move.files.from.source.dir=true causes data to not be moved > properly during inserts to partitioned tables. > Looks like the file path checking in Utilties.moveSpecifiedFiles() needs to > recursively check into directories. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
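The recursive descent the fix calls for can be sketched as follows; this is an illustration of the idea (descend into partition subdirectories rather than checking only the top level), not the actual Utilities.moveSpecifiedFiles() code:

```java
import java.io.File;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Sketch: collect files by recursing into subdirectories, as a
// partitioned-table layout (e.g. ds=2017 partition dirs) requires.
public class RecursiveListSketch {
    static void collectFiles(File dir, List<File> out) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            if (child.isDirectory()) {
                collectFiles(child, out); // recurse into partition dirs
            } else {
                out.add(child);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny partitioned layout: root/ds=2017/000000_0
        File root = Files.createTempDirectory("tbl").toFile();
        File part = new File(root, "ds=2017");
        part.mkdir();
        new File(part, "000000_0").createNewFile();

        List<File> files = new ArrayList<>();
        collectFiles(root, files);
        System.out.println(files.size());
    }
}
```

A top-level-only listing of `root` would see just the `ds=2017` directory and miss the bucket file inside it, which matches the reported symptom.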
[jira] [Commented] (HIVE-16722) Converting bucketed non-acid table to acid should perform validation
[ https://issues.apache.org/jira/browse/HIVE-16722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208040#comment-16208040 ] Hive QA commented on HIVE-16722: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892643/HIVE-16722.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11275 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7352/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7352/console Test logs: 
http://104.198.109.242/logs/PreCommit-HIVE-Build-7352/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892643 - PreCommit-HIVE-Build > Converting bucketed non-acid table to acid should perform validation > > > Key: HIVE-16722 > URL: https://issues.apache.org/jira/browse/HIVE-16722 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-16722.01.patch, HIVE-16722.02.patch, > HIVE-16722.03.patch, HIVE-16722.WIP.patch > > > Converting a non-acid table to acid only performs metadata validation (in > _TransactionalValidationListener_). > The data read code path only understands certain directory layouts and file > names and ignores (generally) files that don't match the expected format. > In Hive, directory layout and bucket file naming (especially in older releases) > is poorly enforced. > Need to add a validation step on > {noformat} > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > to > scan the file system and report any possible data loss scenarios. > Currently Acid understands bucket file names like "0_0" and (with > HIVE-16177) "0_0_copy1" etc. at the root of the partition. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17425) Change MetastoreConf.ConfVars internal members to be private
[ https://issues.apache.org/jira/browse/HIVE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17425: -- Status: Patch Available (was: Open) > Change MetastoreConf.ConfVars internal members to be private > > > Key: HIVE-17425 > URL: https://issues.apache.org/jira/browse/HIVE-17425 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17425.2.patch, HIVE-17425.patch > > > MetastoreConf's dual use of metastore keys and Hive keys is causing confusion > for developers. We should make the relevant members private and provide > getter methods with comments on when it is appropriate to use them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17425) Change MetastoreConf.ConfVars internal members to be private
[ https://issues.apache.org/jira/browse/HIVE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17425: -- Attachment: HIVE-17425.2.patch Rebased version of the patch to bring it up to date. > Change MetastoreConf.ConfVars internal members to be private > > > Key: HIVE-17425 > URL: https://issues.apache.org/jira/browse/HIVE-17425 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17425.2.patch, HIVE-17425.patch > > > MetastoreConf's dual use of metastore keys and Hive keys is causing confusion > for developers. We should make the relevant members private and provide > getter methods with comments on when it is appropriate to use them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17425) Change MetastoreConf.ConfVars internal members to be private
[ https://issues.apache.org/jira/browse/HIVE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17425: -- Status: Open (was: Patch Available) > Change MetastoreConf.ConfVars internal members to be private > > > Key: HIVE-17425 > URL: https://issues.apache.org/jira/browse/HIVE-17425 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17425.patch > > > MetastoreConf's dual use of metastore keys and Hive keys is causing confusion > for developers. We should make the relevant members private and provide > getter methods with comments on when it is appropriate to use them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17804) Vectorization: Bug erroneously causes match for 1st row in batch (SelectStringColLikeStringScalar)
[ https://issues.apache.org/jira/browse/HIVE-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17804: Status: Patch Available (was: In Progress) > Vectorization: Bug erroneously causes match for 1st row in batch > (SelectStringColLikeStringScalar) > -- > > Key: HIVE-17804 > URL: https://issues.apache.org/jira/browse/HIVE-17804 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17804.01.patch, HIVE-17804.02.patch > > > Code setting output value to LongColumnVector.NULL_VALUE for null candidate > sets the 0th entry instead of the i'th. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17804) Vectorization: Bug erroneously causes match for 1st row in batch (SelectStringColLikeStringScalar)
[ https://issues.apache.org/jira/browse/HIVE-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17804: Attachment: HIVE-17804.02.patch > Vectorization: Bug erroneously causes match for 1st row in batch > (SelectStringColLikeStringScalar) > -- > > Key: HIVE-17804 > URL: https://issues.apache.org/jira/browse/HIVE-17804 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17804.01.patch, HIVE-17804.02.patch > > > Code setting output value to LongColumnVector.NULL_VALUE for null candidate > sets the 0th entry instead of the i'th. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17804) Vectorization: Bug erroneously causes match for 1st row in batch (SelectStringColLikeStringScalar)
[ https://issues.apache.org/jira/browse/HIVE-17804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17804: Status: In Progress (was: Patch Available) > Vectorization: Bug erroneously causes match for 1st row in batch > (SelectStringColLikeStringScalar) > -- > > Key: HIVE-17804 > URL: https://issues.apache.org/jira/browse/HIVE-17804 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17804.01.patch > > > Code setting output value to LongColumnVector.NULL_VALUE for null candidate > sets the 0th entry instead of the i'th. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
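The one-character nature of the HIVE-17804 bug described above can be shown with a stand-alone sketch; `NULL_MARKER` stands in for `LongColumnVector.NULL_VALUE`, and both methods are illustrative rather than the vectorized operator code itself.

```java
// Illustrates the off-by-index bug: writing the null marker into slot 0
// instead of slot i leaves row i's old value in place, which can make the
// first row of the batch erroneously look like a match. NULL_MARKER is a
// stand-in for LongColumnVector.NULL_VALUE; this is a sketch, not the
// SelectStringColLikeStringScalar operator.
public class NullMarkerBug {
    public static final long NULL_MARKER = 1L;

    // Buggy variant: always clobbers entry 0, whatever i is.
    public static void setNullBuggy(long[] output, int i) {
        output[0] = NULL_MARKER;
    }

    // Fixed variant: marks the i'th entry, as intended.
    public static void setNullFixed(long[] output, int i) {
        output[i] = NULL_MARKER;
    }
}
```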
[jira] [Commented] (HIVE-17425) Change MetastoreConf.ConfVars internal members to be private
[ https://issues.apache.org/jira/browse/HIVE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207998#comment-16207998 ] Alan Gates commented on HIVE-17425: --- Except for MetastoreConf tests, this set of calls in Metrics is the only place it reaches under and uses conf.get rather than MetastoreConf.getVar. The reason it does this is to avoid getting the default value (as noted in the comments). In this case I don't want all the magic around checking various options for MetastoreConf and HiveConf and defaults. It seems better to have this one exception rather than add methods to MetastoreConf that could do this automatically but that would confuse other developers as to which MetastoreConf method they should be using. > Change MetastoreConf.ConfVars internal members to be private > > > Key: HIVE-17425 > URL: https://issues.apache.org/jira/browse/HIVE-17425 > Project: Hive > Issue Type: Task > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17425.patch > > > MetastoreConf's dual use of metastore keys and Hive keys is causing confusion > for developers. We should make the relevant members private and provide > getter methods with comments on when it is appropriate to use them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17802) Remove unnecessary calls to FileSystem.setOwner() from FileOutputCommitterContainer
[ https://issues.apache.org/jira/browse/HIVE-17802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207980#comment-16207980 ] Mithun Radhakrishnan commented on HIVE-17802: - Back to the drawing board. :/ I'll check {{TestHCatMultiOutputFormat}} and {{TestHCatOutputFormat}}. > Remove unnecessary calls to FileSystem.setOwner() from > FileOutputCommitterContainer > --- > > Key: HIVE-17802 > URL: https://issues.apache.org/jira/browse/HIVE-17802 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Chris Drome > Attachments: HIVE-17802.1.patch > > > For large Pig/HCat queries that produce a large number of > partitions/directories/files, we have seen cases where the HDFS NameNode > groaned under the weight of {{FileSystem.setOwner()}} calls, originating from > the commit-step. This was the result of the following code in > FileOutputCommitterContainer: > {code:java} > private void applyGroupAndPerms(FileSystem fs, Path dir, FsPermission > permission, > List acls, String group, boolean recursive) > throws IOException { > ... > if (recursive) { > for (FileStatus fileStatus : fs.listStatus(dir)) { > if (fileStatus.isDir()) { > applyGroupAndPerms(fs, fileStatus.getPath(), permission, acls, > group, true); > } else { > fs.setPermission(fileStatus.getPath(), permission); > chown(fs, fileStatus.getPath(), group); > } > } > } > } > private void chown(FileSystem fs, Path file, String group) throws > IOException { > try { > fs.setOwner(file, null, group); > } catch (AccessControlException ignore) { > // Some users have wrong table group, ignore it. > LOG.warn("Failed to change group of partition directories/files: " + > file, ignore); > } > } > {code} > One call per file/directory is far too many. We have a patch that reduces the > namenode pressure. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
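To make "one call per file/directory is far too many" concrete, the arithmetic below compares the per-file scheme in the quoted code (one `setPermission` plus one `setOwner` per file) against a hypothetical scheme that only touches directories. The per-directory alternative is an assumption for illustration, not the attached HIVE-17802 patch.

```java
// Back-of-the-envelope comparison of NameNode RPC load. The recursive
// per-file scheme issues setPermission + setOwner once per file; a
// hypothetical scheme that only touches directories issues two calls per
// directory. This models the scaling argument, not the actual patch.
public class RpcLoad {
    public static long perFileCalls(long dirs, long filesPerDir) {
        return 2L * dirs * filesPerDir; // setPermission + setOwner per file
    }

    public static long perDirectoryCalls(long dirs) {
        return 2L * dirs; // setPermission + setOwner per directory
    }
}
```

For a query producing 1,000 partitions with 100 files each, that is 200,000 RPCs versus 2,000, which is the kind of pressure difference the issue describes.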
[jira] [Updated] (HIVE-17812) Move remaining classes that HiveMetaStore depends on
[ https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17812: -- Status: Patch Available (was: Open) > Move remaining classes that HiveMetaStore depends on > - > > Key: HIVE-17812 > URL: https://issues.apache.org/jira/browse/HIVE-17812 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Labels: pull-request-available > Attachments: HIVE-17812.2.patch, HIVE-17812.3.patch, HIVE-17812.patch > > > There are several remaining pieces that need to be moved before we can move > HiveMetaStore itself. These include NotificationListener and its > implementations, Events, AlterHandler, and a few other miscellaneous pieces. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17812) Move remaining classes that HiveMetaStore depends on
[ https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17812: -- Attachment: HIVE-17812.3.patch Rebased version of the patch. > Move remaining classes that HiveMetaStore depends on > - > > Key: HIVE-17812 > URL: https://issues.apache.org/jira/browse/HIVE-17812 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Labels: pull-request-available > Attachments: HIVE-17812.2.patch, HIVE-17812.3.patch, HIVE-17812.patch > > > There are several remaining pieces that need to be moved before we can move > HiveMetaStore itself. These include NotificationListener and its > implementations, Events, AlterHandler, and a few other miscellaneous pieces. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17812) Move remaining classes that HiveMetaStore depends on
[ https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17812: -- Status: Open (was: Patch Available) > Move remaining classes that HiveMetaStore depends on > - > > Key: HIVE-17812 > URL: https://issues.apache.org/jira/browse/HIVE-17812 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Labels: pull-request-available > Attachments: HIVE-17812.2.patch, HIVE-17812.patch > > > There are several remaining pieces that need to be moved before we can move > HiveMetaStore itself. These include NotificationListener and its > implementations, Events, AlterHandler, and a few other miscellaneous pieces. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17506) Fix standalone-metastore pom.xml to not depend on hive's main pom
[ https://issues.apache.org/jira/browse/HIVE-17506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17506: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) > Fix standalone-metastore pom.xml to not depend on hive's main pom > - > > Key: HIVE-17506 > URL: https://issues.apache.org/jira/browse/HIVE-17506 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 3.0.0 > > Attachments: HIVE-17506.2.patch, HIVE-17506.3.patch, HIVE-17506.patch > > > In order to be separately releasable the standalone metastore needs to have > its own pom rather than inherit from Hive's. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17506) Fix standalone-metastore pom.xml to not depend on hive's main pom
[ https://issues.apache.org/jira/browse/HIVE-17506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17506: -- Attachment: HIVE-17506.3.patch Final version of the patch that I checked in. > Fix standalone-metastore pom.xml to not depend on hive's main pom > - > > Key: HIVE-17506 > URL: https://issues.apache.org/jira/browse/HIVE-17506 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 3.0.0 > > Attachments: HIVE-17506.2.patch, HIVE-17506.3.patch, HIVE-17506.patch > > > In order to be separately releasable the standalone metastore needs to have > its own pom rather than inherit from Hive's. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17506) Fix standalone-metastore pom.xml to not depend on hive's main pom
[ https://issues.apache.org/jira/browse/HIVE-17506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207924#comment-16207924 ] ASF GitHub Bot commented on HIVE-17506: --- Github user asfgit closed the pull request at: https://github.com/apache/hive/pull/247 > Fix standalone-metastore pom.xml to not depend on hive's main pom > - > > Key: HIVE-17506 > URL: https://issues.apache.org/jira/browse/HIVE-17506 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 3.0.0 > > Attachments: HIVE-17506.2.patch, HIVE-17506.3.patch, HIVE-17506.patch > > > In order to be separately releasable the standalone metastore needs to have > its own pom rather than inherit from Hive's. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16722) Converting bucketed non-acid table to acid should perform validation
[ https://issues.apache.org/jira/browse/HIVE-16722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16722: -- Attachment: HIVE-16722.03.patch > Converting bucketed non-acid table to acid should perform validation > > > Key: HIVE-16722 > URL: https://issues.apache.org/jira/browse/HIVE-16722 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-16722.01.patch, HIVE-16722.02.patch, > HIVE-16722.03.patch, HIVE-16722.WIP.patch > > > Converting a non-acid table to acid only performs metadata validation (in > _TransactionalValidationListener_). > The data read code path only understands certain directory layouts and file > names and ignores (generally) files that don't match the expected format. > In Hive, directory layout and bucket file naming (especially in older releases) > is poorly enforced. > Need to add a validation step on > {noformat} > alter table T SET TBLPROPERTIES ('transactional'='true') > {noformat} > to > scan the file system and report any possible data loss scenarios. > Currently Acid understands bucket file names like "0_0" and (with > HIVE-16177) "0_0_copy1" etc. at the root of the partition. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16669) Fine tune Compaction to take advantage of Acid 2.0
[ https://issues.apache.org/jira/browse/HIVE-16669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16669: -- Attachment: HIVE-16669.wip.patch HIVE-16669.wip.patch has tests and some notes > Fine tune Compaction to take advantage of Acid 2.0 > -- > > Key: HIVE-16669 > URL: https://issues.apache.org/jira/browse/HIVE-16669 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16669.wip.patch > > > * There is little point using 2.0 vectorized reader since there is no > operator pipeline in compaction > * If minor compaction just concats delete_delta files together, then the 2 > stage compaction should always ensure that we have a limited number of Orc > readers to do the merging and current OrcRawRecordMerger should be fine > * ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17800) input_part6.q wants to test partition pruning, but tests expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-17800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207864#comment-16207864 ] Barna Zsombor Klara commented on HIVE-17800: +1 > input_part6.q wants to test partition pruning, but tests expression evaluation > -- > > Key: HIVE-17800 > URL: https://issues.apache.org/jira/browse/HIVE-17800 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-17800.patch > > > input_part6.q looks like this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = 2008-04-08 LIMIT 10; > {code} > The intended test most probably is this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = "2008-04-08" LIMIT 10; > {code} > Currently we evaluate 2008-4-8 to 1996: > {code} > predicate: (UDFToDouble(ds) = 1996.0) (type: boolean) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
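The 1996.0 in the quoted plan falls directly out of integer arithmetic: without quotes, 2008-04-08 is not a date literal but the expression 2008 minus 4 minus 8. A one-line check, with plain Java standing in for Hive's expression evaluation:

```java
// Without quotes, Hive parses 2008-04-08 as the arithmetic expression
// 2008 - 4 - 8 rather than a date literal, which is where the constant
// 1996.0 in the plan's predicate (UDFToDouble(ds) = 1996.0) comes from.
public class UnquotedDate {
    public static int evaluate() {
        return 2008 - 4 - 8;
    }
}
```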
[jira] [Commented] (HIVE-17800) input_part6.q wants to test partition pruning, but tests expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-17800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207854#comment-16207854 ] Peter Vary commented on HIVE-17800: --- Oh, and the failures are not related :( > input_part6.q wants to test partition pruning, but tests expression evaluation > -- > > Key: HIVE-17800 > URL: https://issues.apache.org/jira/browse/HIVE-17800 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-17800.patch > > > input_part6.q looks like this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = 2008-04-08 LIMIT 10; > {code} > The intended test most probably is this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = "2008-04-08" LIMIT 10; > {code} > Currently we evaluate 2008-4-8 to 1996: > {code} > predicate: (UDFToDouble(ds) = 1996.0) (type: boolean) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17800) input_part6.q wants to test partition pruning, but tests expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-17800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207852#comment-16207852 ] Peter Vary commented on HIVE-17800: --- [~zsombor.klara] or [~kgyrtkirk]: could you review please? Thanks, Peter > input_part6.q wants to test partition pruning, but tests expression evaluation > -- > > Key: HIVE-17800 > URL: https://issues.apache.org/jira/browse/HIVE-17800 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-17800.patch > > > input_part6.q looks like this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = 2008-04-08 LIMIT 10; > {code} > The intended test most probably is this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = "2008-04-08" LIMIT 10; > {code} > Currently we evaluate 2008-4-8 to 1996: > {code} > predicate: (UDFToDouble(ds) = 1996.0) (type: boolean) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17800) input_part6.q wants to test partition pruning, but tests expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-17800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207848#comment-16207848 ] Hive QA commented on HIVE-17800: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892633/HIVE-17800.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 11275 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=101) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/7351/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7351/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7351/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892633 - PreCommit-HIVE-Build > input_part6.q wants to test partition pruning, but tests expression evaluation > -- > > Key: HIVE-17800 > URL: https://issues.apache.org/jira/browse/HIVE-17800 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-17800.patch > > > input_part6.q looks like this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = 2008-04-08 LIMIT 10; > {code} > The intended test most probably is this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = "2008-04-08" LIMIT 10; > {code} > Currently we evaluate 2008-4-8 to 1996: > {code} > predicate: (UDFToDouble(ds) = 1996.0) (type: boolean) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17822) Provide an option to skip shading of jars
[ https://issues.apache.org/jira/browse/HIVE-17822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207813#comment-16207813 ] Ashutosh Chauhan commented on HIVE-17822: - +1 > Provide an option to skip shading of jars > - > > Key: HIVE-17822 > URL: https://issues.apache.org/jira/browse/HIVE-17822 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17822.1.patch > > > Maven shade plugin does not have option to skip. Adding it under a profile > can help with skip shade reducing build times. > Maven build profile shows druid and jdbc shade plugin to be slowest (also > hive-exec). For devs not working on druid or jdbc, it will be good to have an > option to skip shading via a profile. With this it will be possible to get a > subminute dev build. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17821) TxnHandler.enqueueLockWithRetry() should not write TXN_COMPONENTS if partName=null and table is partitioned
[ https://issues.apache.org/jira/browse/HIVE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207811#comment-16207811 ] Eugene Koifman commented on HIVE-17821: --- Thanks Lefty > TxnHandler.enqueueLockWithRetry() should not write TXN_COMPONENTS if > partName=null and table is partitioned > --- > > Key: HIVE-17821 > URL: https://issues.apache.org/jira/browse/HIVE-17821 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Minor > > LM may acquire read locks on the table when writing a partition. > There is no need to make an entry for the table if we know it's partitioned > since any I/U/D must affect a partition (or set of). > Pass isPartitioned() in LockComponent/LockRequest or look up in TxnHandler -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17821) TxnHandler.enqueueLockWithRetry() should not write TXN_COMPONENTS if partName=null and table is partitioned
[ https://issues.apache.org/jira/browse/HIVE-17821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17821: -- Summary: TxnHandler.enqueueLockWithRetry() should not write TXN_COMPONENTS if partName=null and table is partitioned (was: TxnHandler.enqueueLockWithRetry() should now write TXN_COMPONENTS if partName=null and table is partitioned) > TxnHandler.enqueueLockWithRetry() should not write TXN_COMPONENTS if > partName=null and table is partitioned > --- > > Key: HIVE-17821 > URL: https://issues.apache.org/jira/browse/HIVE-17821 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Minor > > LM may acquire read locks on the table when writing a partition. > There is no need to make an entry for the table if we know it's partitioned > since any I/U/D must affect a partition (or set of). > Pass isPartitioned() in LockComponent/LockRequest or look up in TxnHandler -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16748) Integreate YETUS to Pre-Commit
[ https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207759#comment-16207759 ] Hive QA commented on HIVE-16748: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892602/dummytest.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/Precommit-HIVE-Build/104/testReport Console output: https://builds.apache.org/job/Precommit-HIVE-Build/104/console Test logs: http://35.199.162.129/logsPrecommit-HIVE-Build-104/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-10-17 15:12:02.615 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + MAVEN_OPTS='-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/Precommit-HIVE-Build-104/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-10-17 15:12:02.618 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git clean -f -d Removing standalone-metastore/src/gen/org/ + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-10-17 15:12:03.273 + rm -rf ../yetus + mkdir ../yetus + cp -R accumulo-handler beeline bin binary-package-licenses checkstyle cli common conf contrib data datanucleus.log dev-support docs druid-handler errata.txt findbugs hbase-handler hcatalog hplsql itests jdbc jdbc-handler lib LICENSE llap-client llap-common llap-ext-client llap-server llap-tez metastore NOTICE packaging pom.xml ql README.md RELEASE_NOTES.txt serde service service-rpc shims spark-client standalone-metastore storage-api target testutils vector-code-gen ../yetus/ + mkdir /data/hiveptest/logs/Precommit-HIVE-Build-104/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p1 patching file testutils/ptest2/src/main/java/org/apache/hive/ptest/api/server/ExecutionController.java + [[ maven == \m\a\v\e\n ]] + rm -rf /data/hiveptest/working/maven/org/apache/hive + mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven protoc-jar: protoc 
version: 250, detected platform: linux/amd64 protoc-jar: executing: [/tmp/protoc3009861101668651199.exe, -I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore, --java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources, /data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto] ANTLR Parser Generator Version 3.5.2 Output file /data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources/antlr3/org/apache/hadoop/hive/metastore/parser/FilterParser.java does not exist: must build /data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g org/apache/hadoop/hive/metastore/parser/Filter.g DataNucleus Enhancer (version 4.1.17) for API "JDO" DataNucleus Enhancer : Classpath >> /usr/share/maven/boot/plexus-classworlds-2.x.jar ENHANCED (Persistable) : org.apache.hadoop.hive.metastor
[jira] [Updated] (HIVE-17800) input_part6.q wants to test partition pruning, but tests expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-17800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-17800: -- Status: Patch Available (was: Open) > input_part6.q wants to test partition pruning, but tests expression evaluation > -- > > Key: HIVE-17800 > URL: https://issues.apache.org/jira/browse/HIVE-17800 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-17800.patch > > > input_part6.q looks like this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = 2008-04-08 LIMIT 10; > {code} > The intended test most probably is this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = "2008-04-08" LIMIT 10; > {code} > Currently we evaluate 2008-4-8 to 1996: > {code} > predicate: (UDFToDouble(ds) = 1996.0) (type: boolean) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17800) input_part6.q wants to test partition pruning, but tests expression evaluation
[ https://issues.apache.org/jira/browse/HIVE-17800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-17800: -- Attachment: HIVE-17800.patch > input_part6.q wants to test partition pruning, but tests expression evaluation > -- > > Key: HIVE-17800 > URL: https://issues.apache.org/jira/browse/HIVE-17800 > Project: Hive > Issue Type: Bug >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-17800.patch > > > input_part6.q looks like this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = 2008-04-08 LIMIT 10; > {code} > The intended test most probably is this: > {code} > EXPLAIN > SELECT x.* FROM SRCPART x WHERE x.ds = "2008-04-08" LIMIT 10; > {code} > Currently we evaluate 2008-4-8 to 1996: > {code} > predicate: (UDFToDouble(ds) = 1996.0) (type: boolean) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
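The surprising 1996.0 in the HIVE-17800 predicate follows directly from the unquoted literal being parsed as integer subtraction rather than a date string. A minimal sketch in plain Python (not Hive code) of the same arithmetic:

```python
# An unquoted value like 2008-04-08 is read as 2008 minus 4 minus 8,
# not as a date. (Leading zeros are dropped here only because 04/08 are
# not valid Python integer literals; the resulting value is the same.)
unquoted = 2008 - 4 - 8
print(unquoted)  # -> 1996, the constant the predicate compares against

# Quoting the literal keeps it a string, which is what the test intends.
quoted = "2008-04-08"
print(quoted == "2008-04-08")  # -> True
```

This is why the fixed query quotes the partition value: the string comparison hits the partition pruner, while the numeric comparison forces expression evaluation with UDFToDouble.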
[jira] [Commented] (HIVE-16748) Integreate YETUS to Pre-Commit
[ https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207737#comment-16207737 ] Hive QA commented on HIVE-16748: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892602/dummytest.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/Precommit-HIVE-Build/103/testReport Console output: https://builds.apache.org/job/Precommit-HIVE-Build/103/console Test logs: http://35.199.162.129/logsPrecommit-HIVE-Build-103/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-10-17 14:46:39.851 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + MAVEN_OPTS='-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/Precommit-HIVE-Build-103/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-10-17 14:46:39.854 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-10-17 14:46:40.329 + rm -rf ../yetus + mkdir ../yetus + cp -R accumulo-handler beeline bin binary-package-licenses checkstyle cli common conf contrib data dev-support docs druid-handler errata.txt findbugs hbase-handler hcatalog hplsql itests jdbc jdbc-handler lib LICENSE llap-client llap-common llap-ext-client llap-server llap-tez metastore NOTICE packaging pom.xml ql README.md RELEASE_NOTES.txt serde service service-rpc shims spark-client standalone-metastore storage-api testutils vector-code-gen ../yetus/ + mkdir /data/hiveptest/logs/Precommit-HIVE-Build-103/yetus + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p1 patching file testutils/ptest2/src/main/java/org/apache/hive/ptest/api/server/ExecutionController.java + [[ maven == \m\a\v\e\n ]] + rm -rf /data/hiveptest/working/maven/org/apache/hive + mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven [ERROR] Failed to execute goal on project hive-standalone-metastore: Could not resolve 
dependencies for project org.apache.hive:hive-standalone-metastore:jar:3.0.0-SNAPSHOT: Failure to transfer org.apache.hadoop:hadoop-distcp:jar:2.8.1 from http://www.datanucleus.org/downloads/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of datanucleus has elapsed or updates are forced. Original error: Could not transfer artifact org.apache.hadoop:hadoop-distcp:jar:2.8.1 from/to datanucleus (http://www.datanucleus.org/downloads/maven2): Connect to localhost:3128 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused (Connection refused) -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the comma
[jira] [Commented] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207728#comment-16207728 ] Hive QA commented on HIVE-17433: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892567/HIVE-17433.04.patch {color:green}SUCCESS:{color} +1 due to 20 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 211 failed/errored test(s), 11154 tests executed *Failed tests:* {noformat} TestConstantVectorExpression - did not produce a TEST-*.xml file (likely timed out) (batchId=277) TestVectorDateExpressions - did not produce a TEST-*.xml file (likely timed out) (batchId=275) TestVectorFilterExpressions - did not produce a TEST-*.xml file (likely timed out) (batchId=275) TestVectorGenericDateExpressions - did not produce a TEST-*.xml file (likely timed out) (batchId=275) TestVectorLogicalExpressions - did not produce a TEST-*.xml file (likely timed out) (batchId=275) TestVectorTimestampExpressions - did not produce a TEST-*.xml file (likely timed out) (batchId=276) TestVectorTypeCasts - did not produce a TEST-*.xml file (likely timed out) (batchId=275) TestVectorizationContext - did not produce a TEST-*.xml file (likely timed out) (batchId=275) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mergejoin] (batchId=58) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_no_row_serde] (batchId=67) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_reference_windowed] (batchId=31) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_casts] (batchId=80) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_partitioned] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_vector_nohybridgrace] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=160) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_distinct_gby] (batchId=162) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=171) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_inner_join] (batchId=173) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join0] (batchId=173) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join1] (batchId=172) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join2] (batchId=171) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=100) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vector_non_string_partition] (batchId=100) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_div0] (batchId=101) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit] (batchId=100) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=127) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_cast_constant] (batchId=106) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_char_4] (batchId=141) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct] (batchId=113) 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_data_types] (batchId=136) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_decimal_aggregate] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_decimal_mapjoin] (batchId=126) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_distinct_2] (batchId=124) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_elt] (batchId=117) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_groupby_3] (batchId=130) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_left_outer_join] (batchId=112) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=137) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_orderby_5] (batchId=120) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_string_concat] (batchId=116) org.apache.hadoop.hive.cli.TestSparkCliDriver.
[jira] [Commented] (HIVE-16748) Integreate YETUS to Pre-Commit
[ https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207701#comment-16207701 ] Hive QA commented on HIVE-16748: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892602/dummytest.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/Precommit-HIVE-Build/102/testReport Console output: https://builds.apache.org/job/Precommit-HIVE-Build/102/console Test logs: http://35.199.162.129/logsPrecommit-HIVE-Build-102/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-10-17 14:24:42.984 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + MAVEN_OPTS='-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/Precommit-HIVE-Build-102/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-10-17 14:24:42.986 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-10-17 14:24:43.494 + mkdir ../yetus mkdir: cannot create directory ?../yetus?: File exists + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12892602 - Precommit-HIVE-Build > Integreate YETUS to Pre-Commit > -- > > Key: HIVE-16748 > URL: https://issues.apache.org/jira/browse/HIVE-16748 > Project: Hive > Issue Type: Sub-task >Reporter: Peter Vary >Assignee: Adam Szita > Attachments: dummytest.patch > > > After HIVE-15051, we should automate the yetus run for the Pre-Commit tests, > so the results are added in comments like > https://issues.apache.org/jira/browse/YARN-6363?focusedCommentId=15937570&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15937570 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16748) Integreate YETUS to Pre-Commit
[ https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207681#comment-16207681 ] Hive QA commented on HIVE-16748: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892602/dummytest.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/Precommit-HIVE-Build/101/testReport Console output: https://builds.apache.org/job/Precommit-HIVE-Build/101/console Test logs: http://35.199.162.129/logsPrecommit-HIVE-Build-101/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-10-17 14:02:40.134 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + MAVEN_OPTS='-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/Precommit-HIVE-Build-101/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source ]] + git clone https://github.com/apache/hive.git apache-github-source-source Cloning into 'apache-github-source-source'... 
+ date '+%Y-%m-%d %T.%3N' 2017-10-17 14:03:09.467 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-10-17 14:03:10.486 + mkdir ../yetus + cp -f . ../yetus cp: omitting directory ?.? + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12892602 - Precommit-HIVE-Build > Integreate YETUS to Pre-Commit > -- > > Key: HIVE-16748 > URL: https://issues.apache.org/jira/browse/HIVE-16748 > Project: Hive > Issue Type: Sub-task >Reporter: Peter Vary >Assignee: Adam Szita > Attachments: dummytest.patch > > > After HIVE-15051, we should automate the yetus run for the Pre-Commit tests, > so the results are added in comments like > https://issues.apache.org/jira/browse/YARN-6363?focusedCommentId=15937570&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15937570 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Issue Comment Deleted] (HIVE-14867) "serialization.last.column.takes.rest" does not work for MultiDelimitSerDe
[ https://issues.apache.org/jira/browse/HIVE-14867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Weeks updated HIVE-14867: --- Comment: was deleted (was: Ran across this issue troubleshooting for a customer. This essentially makes this serde useless as it's always going to throw garbage in the last column. Is there a reason we can't just add multi character field delimiters to other text serde and deprecate this one as it doesn't appear to be getting maintained.) > "serialization.last.column.takes.rest" does not work for MultiDelimitSerDe > -- > > Key: HIVE-14867 > URL: https://issues.apache.org/jira/browse/HIVE-14867 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 1.3.0 >Reporter: Niklaus Xiao >Assignee: Niklaus Xiao > > Create table with MultiDelimitSerde: > {code} > CREATE TABLE foo (a string, b string) ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH > SERDEPROPERTIES > ("field.delim"="|@|","collection.delim"=":","mapkey.delim"="@") stored as > textfile; > {code} > load data into table: > {code} > 1|@|Lily|@|HW|@|abc > 2|@|Lucy|@|LX|@|123 > 3|@|Lilei|@|XX|@|3434 > {code} > select data from this table: > {code} > select * from foo; > +-++--+ > | foo.a | foo.b | > +-++--+ > | 1 | Lily^AHW^Aabc| > | 2 | Lucy^ALX^A123| > | 3 | Lilei^AXX^A3434 | > +-++--+ > 3 rows selected (0.905 seconds) > {code} > You can see the last column takes all the data, and replace the delimiter to > default ^A. > lastColumnTakesRestString should be false by default: > {code} > String lastColumnTakesRestString = tbl > .getProperty(serdeConstants.SERIALIZATION_LAST_COLUMN_TAKES_REST); > lastColumnTakesRest = (lastColumnTakesRestString != null && > lastColumnTakesRestString > .equalsIgnoreCase("true")); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
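The behaviour described in HIVE-14867 can be illustrated without Hive. The following sketch (plain Python, not the SerDe implementation) splits a row on the multi-character delimiter and contrasts the expected two-column result with the reported "last column takes the rest" output, where the remainder is re-joined with the default control-A (^A) separator:

```python
row = "1|@|Lily|@|HW|@|abc"
fields = row.split("|@|")      # ['1', 'Lily', 'HW', 'abc']

# Expected behaviour for a two-column table (a string, b string):
a, b = fields[0], fields[1]    # b == 'Lily'

# Reported buggy behaviour: column b swallows the remaining fields,
# joined with the default \x01 delimiter (displayed as ^A).
b_buggy = "\x01".join(fields[1:])
print(b_buggy.replace("\x01", "^A"))  # -> Lily^AHW^Aabc
```

The printed value matches the "Lily^AHW^Aabc" rows in the quoted SELECT output, i.e. the symptom of lastColumnTakesRest effectively being on when the property is unset.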
[jira] [Commented] (HIVE-16748) Integreate YETUS to Pre-Commit
[ https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207647#comment-16207647 ] Hive QA commented on HIVE-16748: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892602/dummytest.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/Precommit-HIVE-Build/100/testReport Console output: https://builds.apache.org/job/Precommit-HIVE-Build/100/console Test logs: http://35.199.162.129/logsPrecommit-HIVE-Build-100/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/master-working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-10-17 13:33:56.995 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + MAVEN_OPTS='-Xmx1g -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hiveptest/master-working/ + tee /data/hiveptest/logs/Precommit-HIVE-Build-100/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source ]] + git clone https://github.com/apache/hive.git apache-github-source-source Cloning into 'apache-github-source-source'... 
+ date '+%Y-%m-%d %T.%3N' 2017-10-17 13:34:23.075 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 7d6a511 HIVE-17508: Implement global execution triggers based on counters (Prasanth Jayachandran reviewed by Sergey Shelukhin) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-10-17 13:34:23.885 + patchCommandPath=/data/hiveptest/master-working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/master-working/scratch/build.patch + [[ -f /data/hiveptest/master-working/scratch/build.patch ]] + chmod +x /data/hiveptest/master-working/scratch/smart-apply-patch.sh + /data/hiveptest/master-working/scratch/smart-apply-patch.sh /data/hiveptest/master-working/scratch/build.patch Going to apply patch with: patch -p1 patching file testutils/ptest2/src/main/java/org/apache/hive/ptest/api/server/ExecutionController.java + [[ maven == \m\a\v\e\n ]] + mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/master-working/maven protoc-jar: protoc version: 250, detected platform: linux/amd64 protoc-jar: executing: [/tmp/protoc2185686001454303443.exe, -I/data/hiveptest/master-working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore, --java_out=/data/hiveptest/master-working/apache-github-source-source/standalone-metastore/target/generated-sources, /data/hiveptest/master-working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto] ANTLR Parser Generator Version 3.5.2 Output file 
/data/hiveptest/master-working/apache-github-source-source/standalone-metastore/target/generated-sources/antlr3/org/apache/hadoop/hive/metastore/parser/FilterParser.java does not exist: must build /data/hiveptest/master-working/apache-github-source-source/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g org/apache/hadoop/hive/metastore/parser/Filter.g [ERROR] Failed to execute goal on project hive-shims-0.23: Could not resolve dependencies for project org.apache.hive.shims:hive-shims-0.23:jar:3.0.0-SNAPSHOT: The following artifacts could not be resolved: org.javassist:javassist:jar:3.18.1-GA, org.apache.hadoop:hadoop-yarn-server-tests:jar:tests:2.8.1: Failure to transfer org.javassist:javassist:jar:3.18.1-GA from http://www.datanucleus.org/downloads/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of datanucleus has elapsed or updates are forced. Original error: Could not transfer artifact org.javassist:javassist:jar:3.18.1-GA from/to datanucleus (http://www.datanucleus.org/dow
[jira] [Commented] (HIVE-17817) Stabilize crossproduct warning message output order
[ https://issues.apache.org/jira/browse/HIVE-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207643#comment-16207643 ] Hive QA commented on HIVE-17817: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892566/HIVE-17817.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11275 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7349/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7349/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7349/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing 
org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892566 - PreCommit-HIVE-Build > Stabilize crossproduct warning message output order > --- > > Key: HIVE-17817 > URL: https://issues.apache.org/jira/browse/HIVE-17817 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17817.01.patch > > > {{CrossProductCheck}} warning printout sometimes happens in reverse order; > which reduces people's confidence in the test's reliability. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
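One common way to make warning output of the kind HIVE-17817 describes deterministic (a generic sketch, not the actual patch) is to collect the messages and sort them before emitting, so the .q.out comparison no longer depends on traversal order:

```python
# Warnings may be gathered in nondeterministic order, e.g. from a
# hash-keyed map of operators.
warnings = [
    "Warning: MASKED is a cross product",
    "Warning: Map Join MAPJOIN[13][bigTable=?] in task "
    "'Stage-3:MAPRED' is a cross product",
]

# Emitting them in sorted order yields a stable printout regardless of
# the order in which the checks ran.
for w in sorted(warnings):
    print(w)
```

Any total order works; sorting is simply the cheapest way to decouple the printed order from internal iteration order.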
[jira] [Updated] (HIVE-16748) Integreate YETUS to Pre-Commit
[ https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita updated HIVE-16748: -- Attachment: dummytest.patch > Integreate YETUS to Pre-Commit > -- > > Key: HIVE-16748 > URL: https://issues.apache.org/jira/browse/HIVE-16748 > Project: Hive > Issue Type: Sub-task >Reporter: Peter Vary >Assignee: Adam Szita > Attachments: dummytest.patch > > > After HIVE-15051, we should automate the yetus run for the Pre-Commit tests, > so the results are added in comments like > https://issues.apache.org/jira/browse/YARN-6363?focusedCommentId=15937570&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15937570 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-16748) Integreate YETUS to Pre-Commit
[ https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Szita reassigned HIVE-16748: - Assignee: Adam Szita (was: Peter Vary) > Integreate YETUS to Pre-Commit > -- > > Key: HIVE-16748 > URL: https://issues.apache.org/jira/browse/HIVE-16748 > Project: Hive > Issue Type: Sub-task >Reporter: Peter Vary >Assignee: Adam Szita > > After HIVE-15051, we should automate the yetus run for the Pre-Commit tests, > so the results are added in comments like > https://issues.apache.org/jira/browse/YARN-6363?focusedCommentId=15937570&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15937570 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-14867) "serialization.last.column.takes.rest" does not work for MultiDelimitSerDe
[ https://issues.apache.org/jira/browse/HIVE-14867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207575#comment-16207575 ] Shawn Weeks commented on HIVE-14867: Ran across this issue troubleshooting for a customer. This essentially makes this serde useless as it's always going to throw garbage in the last column. Is there a reason we can't just add multi character field delimiters to other text serde and deprecate this one as it doesn't appear to be getting maintained. > "serialization.last.column.takes.rest" does not work for MultiDelimitSerDe > -- > > Key: HIVE-14867 > URL: https://issues.apache.org/jira/browse/HIVE-14867 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 1.3.0 >Reporter: Niklaus Xiao >Assignee: Niklaus Xiao > > Create table with MultiDelimitSerde: > {code} > CREATE TABLE foo (a string, b string) ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH > SERDEPROPERTIES > ("field.delim"="|@|","collection.delim"=":","mapkey.delim"="@") stored as > textfile; > {code} > load data into table: > {code} > 1|@|Lily|@|HW|@|abc > 2|@|Lucy|@|LX|@|123 > 3|@|Lilei|@|XX|@|3434 > {code} > select data from this table: > {code} > select * from foo; > +-++--+ > | foo.a | foo.b | > +-++--+ > | 1 | Lily^AHW^Aabc| > | 2 | Lucy^ALX^A123| > | 3 | Lilei^AXX^A3434 | > +-++--+ > 3 rows selected (0.905 seconds) > {code} > You can see the last column takes all the data, and replace the delimiter to > default ^A. > lastColumnTakesRestString should be false by default: > {code} > String lastColumnTakesRestString = tbl > .getProperty(serdeConstants.SERIALIZATION_LAST_COLUMN_TAKES_REST); > lastColumnTakesRest = (lastColumnTakesRestString != null && > lastColumnTakesRestString > .equalsIgnoreCase("true")); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17798) When replacing the src table names in BeeLine testing, the table names shouldn't be changed to lower case
[ https://issues.apache.org/jira/browse/HIVE-17798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207569#comment-16207569 ] Marta Kuczora commented on HIVE-17798: -- Thanks a lot [~pvary] for committing the patch. > When replacing the src table names in BeeLine testing, the table names > shouldn't be changed to lower case > - > > Key: HIVE-17798 > URL: https://issues.apache.org/jira/browse/HIVE-17798 > Project: Hive > Issue Type: Bug > Components: Testing Infrastructure >Affects Versions: 3.0.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-17798.1.patch > > > When running the q tests with BeeLine, the names of the src tables are changed > in all queries to have the database name as prefix, like src -> default.src, > srcpart -> default.srcpart. > This renaming mechanism changes the table names to lower case. For example, > the query "SELECT * FROM SRC" will be "SELECT * FROM src" after the rewrite. > This causes failures during the comparison of the out files. > Change the QFile.replaceTableNames method to keep the upper-case letters > unchanged. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17787) Apply more filters on the BeeLine test output files (follow-up on HIVE-17569)
[ https://issues.apache.org/jira/browse/HIVE-17787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207570#comment-16207570 ] Marta Kuczora commented on HIVE-17787: -- Thanks a lot [~pvary] for committing the patch. > Apply more filters on the BeeLine test output files (follow-up on HIVE-17569) > - > > Key: HIVE-17787 > URL: https://issues.apache.org/jira/browse/HIVE-17787 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Affects Versions: 3.0.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-17787.1.patch, HIVE-17787.2.patch > > > When running the q tests with BeeLine, some known differences came up which > should be filtered out if the "test.beeline.compare.portable" parameter is > set to true. > The result of the following commands can be different when running them via > BeeLine than in the golden out file: > - DESCRIBE > - SHOW TABLES > - SHOW FORMATTED TABLES > - SHOW DATABASES TABLES > Also the join warnings and the mapreduce jobtracker address can be different, > so it would make sense to filter them out. > For example: > {noformat} > Warning: Map Join MAPJOIN[13][bigTable=?] in task 'Stage-3:MAPRED' is a cross > product > Warning: MASKED is a cross product > {noformat} > {noformat} > mapreduce.jobtracker.address=local > mapreduce.jobtracker.address=MASKED > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17569) Compare filtered output files in BeeLine tests
[ https://issues.apache.org/jira/browse/HIVE-17569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207567#comment-16207567 ] Marta Kuczora commented on HIVE-17569: -- Documented the "test.beeline.compare.portable" parameter on the wiki page: https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-BeelineQueryUnitTest > Compare filtered output files in BeeLine tests > -- > > Key: HIVE-17569 > URL: https://issues.apache.org/jira/browse/HIVE-17569 > Project: Hive > Issue Type: Improvement > Components: Testing Infrastructure >Affects Versions: 3.0.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora > Fix For: 3.0.0 > > Attachments: HIVE-17569.1.patch, HIVE-17569.2.patch > > > When running the BeeLine tests against different configurations, the output of > certain commands, like explain, can be different. Also, the output of the > describe extended/formatted commands when running them via BeeLine differs > from the golden out files. To be able to reuse these out > files, we should have an option to filter out these commands. > The idea is to introduce a new property "test.beeline.compare.portable" with > the default value false; if this property is set to true, these commands > will be filtered out from the out files before the diff. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207563#comment-16207563 ] Hive QA commented on HIVE-15104: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892549/HIVE-15104.9.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 11276 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7348/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7348/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7348/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase 
Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892549 - PreCommit-HIVE-Build > Hive on Spark generate more shuffle data than hive on mr > > > Key: HIVE-15104 > URL: https://issues.apache.org/jira/browse/HIVE-15104 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.2.1 >Reporter: wangwenli >Assignee: Rui Li > Attachments: HIVE-15104.1.patch, HIVE-15104.2.patch, > HIVE-15104.3.patch, HIVE-15104.4.patch, HIVE-15104.5.patch, > HIVE-15104.6.patch, HIVE-15104.7.patch, HIVE-15104.8.patch, > HIVE-15104.9.patch, TPC-H 100G.xlsx > > > The same SQL, running on the Spark and MR engines, will generate different sizes > of shuffle data. > I think it is because Hive on MR serializes only part of the HiveKey, but Hive > on Spark, which uses Kryo, serializes the full HiveKey object. > What is your opinion? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
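The reporter's hypothesis can be illustrated with a toy sketch: a Writable-style path writes only the bytes needed to reconstruct the key, while a whole-object serializer (as a field-walking serializer like Kryo does by default) also writes auxiliary fields. `ToyKey` is entirely hypothetical and is not Hive's HiveKey; the byte counts below are for this sketch only, not actual Kryo output:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class KeySizeDemo {

    // A toy stand-in for a shuffle key: payload bytes plus a cached hash
    // code that is cheap to recompute and not needed on the wire.
    static class ToyKey {
        final byte[] bytes;
        final int cachedHash;
        ToyKey(byte[] bytes, int cachedHash) {
            this.bytes = bytes;
            this.cachedHash = cachedHash;
        }
    }

    // Writable-style serialization: only the length prefix and payload
    // are written.
    static int writableSize(ToyKey k) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(k.bytes.length);
            out.write(k.bytes);
            out.flush();
            return bos.size();
        } catch (IOException e) {
            throw new AssertionError(e);
        }
    }

    // Whole-object serialization: every field goes on the wire, including
    // the cached hash, so each record carries extra bytes.
    static int fullObjectSize(ToyKey k) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeInt(k.bytes.length);
            out.write(k.bytes);
            out.writeInt(k.cachedHash);
            out.flush();
            return bos.size();
        } catch (IOException e) {
            throw new AssertionError(e);
        }
    }
}
```

A few extra bytes per key add up across millions of shuffled records, which is the effect the reporter observed.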
[jira] [Commented] (HIVE-17230) Timestamp format different in HiveCLI and Beeline
[ https://issues.apache.org/jira/browse/HIVE-17230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207557#comment-16207557 ] Marta Kuczora commented on HIVE-17230: -- Thanks a lot [~kgyrtkirk] for the investigation you did. You are right that the behavior of psql seems to be the most user-friendly. However, in this case we would have to implement the same custom logic in BeeLine as in Hive CLI instead of just relying on the string representation of the timestamps. This adds a bit more complexity and extra logic to the BeeLine code. If we want the same timestamp format in both BeeLine and Hive CLI, we can either change the BeeLine behavior or the CLI behavior, and I think both solutions have advantages and disadvantages. Changing the behavior of the Hive CLI to display the timestamp like BeeLine does (with the trailing .0s) affects many q.out files. Changing the behavior of BeeLine (removing the trailing 0s) is a smaller change, but in return we have to maintain the same formatting logic as in Hive CLI. [~aihuaxu], what are your thoughts about this? > Timestamp format different in HiveCLI and Beeline > - > > Key: HIVE-17230 > URL: https://issues.apache.org/jira/browse/HIVE-17230 > Project: Hive > Issue Type: Bug > Components: Beeline, CLI >Reporter: Peter Vary >Assignee: Marta Kuczora > Attachments: HIVE-17230.1.patch, HIVE-17230.2.patch, > HIVE-17230.3.patch > > > The issue can be reproduced with the following commands: > {code} > create table timestamp_test(t timestamp); > insert into table timestamp_test values('2000-01-01 01:00:00'); > select * from timestamp_test; > {code} > The timestamp is displayed without nanoseconds in HiveCLI: > {code} > 2000-01-01 01:00:00 > {code} > When the exact same timestamp is displayed in BeeLine it displays: > {code} > 2000-01-01 01:00:00.0 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
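The ".0" discrepancy comes straight from `java.sql.Timestamp.toString()`, which always emits at least one fractional digit. A minimal sketch of a CLI-style formatter follows; this is an illustration of the idea, not Hive CLI's actual formatting code:

```java
import java.sql.Timestamp;

public class TimestampDisplay {

    // java.sql.Timestamp.toString() always emits at least one fractional
    // digit, which is why BeeLine shows "2000-01-01 01:00:00.0" for a
    // timestamp with zero nanoseconds. A CLI-style formatter would drop a
    // bare ".0" suffix; real fractional digits are kept.
    static String cliStyle(Timestamp t) {
        String s = t.toString();
        return s.endsWith(".0") ? s.substring(0, s.length() - 2) : s;
    }
}
```

This is the "maintain the same formatting logic in BeeLine" option discussed above: small to write, but it must be kept in sync with whatever the CLI does.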
[jira] [Commented] (HIVE-17825) Socket not closed when trying to read files to copy over in replication from metadata
[ https://issues.apache.org/jira/browse/HIVE-17825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207522#comment-16207522 ] anishek commented on HIVE-17825: * org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] : ran fine on the local machine; the other tests have been failing since previous builds. > Socket not closed when trying to read files to copy over in replication from > metadata > - > > Key: HIVE-17825 > URL: https://issues.apache.org/jira/browse/HIVE-17825 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek >Priority: Critical > Labels: pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-17825.0.patch > > > For replication we create a _files file in HDFS which lists the source files to be > copied over for a table/partition. _files is read in ReplCopyTask to determine > which files to copy. The file operations w.r.t. _files are not correct > and we leave the files open, which leads to a lot of CLOSE_WAIT > connections to the source DataNodes from HS2 on the replica cluster. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
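The general fix pattern for this class of leak is try-with-resources. The sketch below is not the actual patch; it uses a local file (via `java.nio`) to stand in for the HDFS stream the real ReplCopyTask opens on _files, and the class name is hypothetical:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class FileListReader {

    // try-with-resources guarantees the reader (and the stream behind it)
    // is closed even on early return or exception, so no half-closed
    // connections are left behind.
    static List<String> readFileList(Path files) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(files)) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        } // reader is closed here, on all paths
        return result;
    }
}
```

With an HDFS input stream, an unclosed reader keeps the TCP connection to the DataNode half-open, which is exactly what shows up as CLOSE_WAIT on the HS2 host.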
[jira] [Commented] (HIVE-17823) Fix subquery Qtest of Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207500#comment-16207500 ] Hive QA commented on HIVE-17823: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892539/HIVE-17823.001.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11275 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7347/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7347/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7347/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12892539 - PreCommit-HIVE-Build > Fix subquery Qtest of Hive on Spark > --- > > Key: HIVE-17823 > URL: https://issues.apache.org/jira/browse/HIVE-17823 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Dapeng Sun >Assignee: Dapeng Sun > Attachments: HIVE-17823.001.patch > > > This JIRA is targeted at fixing the Hive on Spark qtest failures caused by the > subquery fix introduced in HIVE-17726. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17433: Status: Patch Available (was: In Progress) > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch > > > Provide partial support for Decimal64 within Hive. By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more) but only a small > number have been identified as performance hotspots. > Those are small-precision decimals with precision <= 18 that fit within a > 64-bit long, which we are calling Decimal64. Just as we optimize row-mode > execution engine hotspots by selectively adding new vectorization code, we > can treat the current decimal as the full-featured one and add additional > Decimal64 optimizations where query benchmarks really show they help. > This change creates a Decimal64ColumnVector. > This change currently detects small decimals for the vectorized text > input format and uses some new Decimal64 vectorized classes for comparison, > addition, and later perhaps a few GroupBy aggregations like sum, avg, min, > max. > The patch also supports a new annotation that can mark a > VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64). So, > in separate work, other formats such as ORC, PARQUET, etc. can be done in > later JIRAs so they participate in the Decimal64 performance optimization. > The idea is that when you annotate your input format with: > @VectorizedInputFormatSupports(supports = {DECIMAL_64}) > the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of > DecimalColumnVector.
Upon an input format seeing Decimal64ColumnVector being > used, the input format can fill that column vector with decimal64 longs > instead of HiveDecimalWritable objects of DecimalColumnVector. > There will be a Hive environment variable > hive.vectorized.input.format.supports.enabled that has a string list of > supported features. The default will start as "decimal_64". It can be > turned off to allow for performance comparisons and testing. > The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY > key, value > Will have a vectorized explain plan looking like: > ... > Filter Operator > Filter Vectorization: > className: VectorFilterOperator > native: true > predicateExpression: > FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: > Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, > outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean > predicate: ((key - 100) < 200) (type: boolean) > ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17433: Attachment: HIVE-17433.04.patch > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch > > > Provide partial support for Decimal64 within Hive. By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more) but only a small > number have been identified as performance hotspots. > Those are small-precision decimals with precision <= 18 that fit within a > 64-bit long, which we are calling Decimal64. Just as we optimize row-mode > execution engine hotspots by selectively adding new vectorization code, we > can treat the current decimal as the full-featured one and add additional > Decimal64 optimizations where query benchmarks really show they help. > This change creates a Decimal64ColumnVector. > This change currently detects small decimals for the vectorized text > input format and uses some new Decimal64 vectorized classes for comparison, > addition, and later perhaps a few GroupBy aggregations like sum, avg, min, > max. > The patch also supports a new annotation that can mark a > VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64). So, > in separate work, other formats such as ORC, PARQUET, etc. can be done in > later JIRAs so they participate in the Decimal64 performance optimization. > The idea is that when you annotate your input format with: > @VectorizedInputFormatSupports(supports = {DECIMAL_64}) > the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of > DecimalColumnVector.
Upon an input format seeing Decimal64ColumnVector being > used, the input format can fill that column vector with decimal64 longs > instead of HiveDecimalWritable objects of DecimalColumnVector. > There will be a Hive environment variable > hive.vectorized.input.format.supports.enabled that has a string list of > supported features. The default will start as "decimal_64". It can be > turned off to allow for performance comparisons and testing. > The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY > key, value > Will have a vectorized explain plan looking like: > ... > Filter Operator > Filter Vectorization: > className: VectorFilterOperator > native: true > predicateExpression: > FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: > Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, > outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean > predicate: ((key - 100) < 200) (type: boolean) > ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine
[ https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-17433: Status: In Progress (was: Patch Available) > Vectorization: Support Decimal64 in Hive Query Engine > - > > Key: HIVE-17433 > URL: https://issues.apache.org/jira/browse/HIVE-17433 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17433.03.patch > > > Provide partial support for Decimal64 within Hive. By partial I mean that > our current decimal has a large surface area of features (rounding, multiply, > divide, remainder, power, big precision, and many more) but only a small > number have been identified as performance hotspots. > Those are small-precision decimals with precision <= 18 that fit within a > 64-bit long, which we are calling Decimal64. Just as we optimize row-mode > execution engine hotspots by selectively adding new vectorization code, we > can treat the current decimal as the full-featured one and add additional > Decimal64 optimizations where query benchmarks really show they help. > This change creates a Decimal64ColumnVector. > This change currently detects small decimals for the vectorized text > input format and uses some new Decimal64 vectorized classes for comparison, > addition, and later perhaps a few GroupBy aggregations like sum, avg, min, > max. > The patch also supports a new annotation that can mark a > VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64). So, > in separate work, other formats such as ORC, PARQUET, etc. can be done in > later JIRAs so they participate in the Decimal64 performance optimization. > The idea is that when you annotate your input format with: > @VectorizedInputFormatSupports(supports = {DECIMAL_64}) > the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of > DecimalColumnVector.
Upon an input format seeing Decimal64ColumnVector being > used, the input format can fill that column vector with decimal64 longs > instead of HiveDecimalWritable objects of DecimalColumnVector. > There will be a Hive environment variable > hive.vectorized.input.format.supports.enabled that has a string list of > supported features. The default will start as "decimal_64". It can be > turned off to allow for performance comparisons and testing. > The query SELECT * FROM DECIMAL_6_1_txt where key - 100BD < 200BD ORDER BY > key, value > Will have a vectorized explain plan looking like: > ... > Filter Operator > Filter Vectorization: > className: VectorFilterOperator > native: true > predicateExpression: > FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: > Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, > outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean > predicate: ((key - 100) < 200) (type: boolean) > ... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17817) Stabilize crossproduct warning message output order
[ https://issues.apache.org/jira/browse/HIVE-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-17817: Attachment: HIVE-17817.01.patch #1) use LinkedHashMap to stabilize iterator order in TezWork - this should make the chosen topological order depend only on the planning steps, which should be the same between executions. > Stabilize crossproduct warning message output order > --- > > Key: HIVE-17817 > URL: https://issues.apache.org/jira/browse/HIVE-17817 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17817.01.patch > > > {{CrossProductCheck}} warning printout sometimes happens in reverse order; > which reduces people's confidence in the test's reliability. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
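The LinkedHashMap approach mentioned in the comment can be sketched in isolation: LinkedHashMap iterates in insertion order, so two executions that insert the same work vertices in the same planning order also iterate (and therefore emit warnings) in the same order, while a plain HashMap makes no such guarantee. The vertex names below are illustrative, not taken from Hive's TezWork:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IterationOrder {

    // Insert vertices in "planning order"; LinkedHashMap preserves this
    // order for iteration, making any order-dependent output stable
    // across runs.
    static Map<String, Integer> planOrder() {
        Map<String, Integer> work = new LinkedHashMap<>();
        work.put("Map 1", 1);
        work.put("Reducer 2", 2);
        work.put("Reducer 3", 3);
        return work;
    }

    static String joinKeys(Map<String, Integer> work) {
        return String.join(",", work.keySet());
    }
}
```

With a HashMap the key order could differ between JVM runs (it depends on hash buckets, not insertion), which is one way the warning printout could come out reversed.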
[jira] [Updated] (HIVE-17817) Stabilize crossproduct warning message output order
[ https://issues.apache.org/jira/browse/HIVE-17817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-17817: Status: Patch Available (was: Open) > Stabilize crossproduct warning message output order > --- > > Key: HIVE-17817 > URL: https://issues.apache.org/jira/browse/HIVE-17817 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17817.01.patch > > > {{CrossProductCheck}} warning printout sometimes happens in reverse order; > which reduces people's confidence in the test's reliability. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-8937) fix description of hive.security.authorization.sqlstd.confwhitelist.* params
[ https://issues.apache.org/jira/browse/HIVE-8937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207239#comment-16207239 ] Hive QA commented on HIVE-8937: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892538/HIVE-8937.002.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11242 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=110) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_notin] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_scalar] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_select] (batchId=119) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_views] (batchId=108) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query16] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=243) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query94] (batchId=243) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query16] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query94] (batchId=241) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7346/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/7346/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7346/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12892538 - PreCommit-HIVE-Build > fix description of hive.security.authorization.sqlstd.confwhitelist.* params > > > Key: HIVE-8937 > URL: https://issues.apache.org/jira/browse/HIVE-8937 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 0.14.0 >Reporter: Thejas M Nair >Assignee: Akira Ajisaka > Attachments: HIVE-8937.001.patch, HIVE-8937.002.patch > > > hive.security.authorization.sqlstd.confwhitelist.* param description in > HiveConf is incorrect. The expected value is a regex, not comma separated > regexes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17508) Implement global execution triggers based on counters
[ https://issues.apache.org/jira/browse/HIVE-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17508: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Test failures are unrelated to this patch. Committed to master. Thanks for the reviews! > Implement global execution triggers based on counters > - > > Key: HIVE-17508 > URL: https://issues.apache.org/jira/browse/HIVE-17508 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 3.0.0 > > Attachments: HIVE-17508.1.patch, HIVE-17508.10.patch, > HIVE-17508.11.patch, HIVE-17508.12.patch, HIVE-17508.13.patch, > HIVE-17508.2.patch, HIVE-17508.3.patch, HIVE-17508.3.patch, > HIVE-17508.4.patch, HIVE-17508.5.patch, HIVE-17508.6.patch, > HIVE-17508.7.patch, HIVE-17508.8.patch, HIVE-17508.9.patch, > HIVE-17508.WIP.2.patch, HIVE-17508.WIP.patch > > > Workload management can define Triggers that are bound to a resource plan. > Each trigger can have a trigger expression and an action associated with it. > Trigger expressions are evaluated at runtime after a configurable check > interval, based on which actions like killing a query, moving a query to a > different pool, etc. will get invoked. A simple execution trigger could be > something like > {code} > CREATE TRIGGER slow_query IN global > WHEN execution_time_ms > 1 > MOVE TO slow_queue > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)