[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
     [ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-16793:
-------------------------------
    Status: Patch Available  (was: Open)

> Scalar sub-query: sq_count_check not required if gby keys are constant
> ----------------------------------------------------------------------
>
>                 Key: HIVE-16793
>                 URL: https://issues.apache.org/jira/browse/HIVE-16793
>             Project: Hive
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Vineet Garg
>         Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch
>
> This query has an sq_count_check, though it is useless on a constant key.
> {code}
> hive> explain select * from part where p_size > (select max(p_size) from part where p_type = '1' group by p_type);
> Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product
> Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
>
> Vertex dependency in root stage
> Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE)
> Reducer 3 <- Map 2 (SIMPLE_EDGE)
> Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE)
> Reducer 6 <- Map 5 (SIMPLE_EDGE)
>
> Stage-0
>   Fetch Operator
>     limit:-1
>     Stage-1
>       Map 1 vectorized, llap
>       File Output Operator [FS_64]
>         Select Operator [SEL_63] (rows= width=621)
>           Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>           Filter Operator [FIL_62] (rows= width=625)
>             predicate:(_col5 > _col10)
>             Map Join Operator [MAPJOIN_61] (rows=2 width=625)
>               Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"]
>             <-Reducer 6 [BROADCAST_EDGE] vectorized, llap
>               BROADCAST [RS_58]
>                 Select Operator [SEL_57] (rows=1 width=4)
>                   Output:["_col0"]
>                   Group By Operator [GBY_56] (rows=1 width=89)
>                     Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0
>                   <-Map 5 [SIMPLE_EDGE] vectorized, llap
>                     SHUFFLE [RS_55]
>                       PartitionCols:_col0
>                       Group By Operator [GBY_54] (rows=86 width=89)
>                         Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1'
>                         Select Operator [SEL_53] (rows=1212121 width=109)
>                           Output:["_col1"]
>                           Filter Operator [FIL_52] (rows=1212121 width=109)
>                             predicate:(p_type = '1')
>                             TableScan [TS_17] (rows=2 width=109)
>                               tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"]
>             <-Map Join Operator [MAPJOIN_60] (rows=2 width=621)
>                 Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>               <-Reducer 4 [BROADCAST_EDGE] vectorized, llap
>                 BROADCAST [RS_51]
>                   Select Operator [SEL_50] (rows=1 width=8)
>                     Filter Operator [FIL_49] (rows=1 width=8)
>                       predicate:(sq_count_check(_col0) <= 1)
>                       Group By Operator [GBY_48] (rows=1 width=8)
>                         Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                       <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap
>                         PARTITION_ONLY_SHUFFLE [RS_47]
>                           Group By Operator [GBY_46] (rows=1 width=8)
>                             Output:["_col0"],aggregations:["count()"]
>                             Select Operator [SEL_45] (rows=1 width=85)
>                               Group By Operator [GBY_44] (rows=1 width=85)
>                                 Output:["_col0"],keys:KEY._col0
>                               <-Map 2 [SIMPLE_EDGE] vectorized, llap
>                                 SHUFFLE [RS_43]
>                                   PartitionCols:_col0
>                                   Group By Operator [GBY_42] (rows=83 width=85)
>                                     Output:["_col0"],keys:'1'
>                                     Select Operator [SEL_41] (rows=1212121 width=105)
>                                       Filter Operator [FIL_40] (rows=1212121 width=105)
>                                         predicate:(p_type = '1')
>                                         TableScan [TS_2] (rows=2 width=105)
>                                           tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"]
> {code}
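The optimization argument in the plan above can be sketched in a few lines of Java. This is a toy model, not Hive code: sqCountCheck and maxBySizeConstantKey are hypothetical stand-ins for the sq_count_check UDF and the rewritten subquery. The point it illustrates: grouping rows by a single constant literal can produce at most one group, so the scalar subquery can never return more than one row and the runtime cardinality check is always satisfied.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy illustration (NOT Hive code) of why sq_count_check is redundant when the
// group-by key of a scalar subquery is a constant, as in:
//   select max(p_size) from part where p_type = '1' group by p_type
class ConstantKeyDemo {
    // Stand-in for the sq_count_check UDF: fail if the subquery
    // produced more than one row, otherwise pass the count through.
    static long sqCountCheck(long rowCount) {
        if (rowCount > 1) {
            throw new IllegalStateException("Scalar subquery returned more than one row");
        }
        return rowCount;
    }

    // Group p_size values by a constant key, as the subquery effectively does.
    // All rows land in the single group keyed by the literal '1'.
    static Map<String, Integer> maxBySizeConstantKey(List<Integer> sizes) {
        return sizes.stream().collect(
            Collectors.groupingBy(s -> "'1'",                 // constant group-by key
                Collectors.collectingAndThen(
                    Collectors.maxBy(Integer::compareTo), o -> o.get())));
    }

    public static void main(String[] args) {
        Map<String, Integer> groups = maxBySizeConstantKey(List.of(3, 7, 5));
        // One constant key => at most one group => the check can never fire.
        System.out.println(groups.size());                // 1
        System.out.println(sqCountCheck(groups.size()));  // 1
    }
}
```

Under this reading, the planner can prove the check statically true and drop the whole Reducer 3/Reducer 4 branch that exists only to compute it.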
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
     [ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-16793:
-------------------------------
    Status: Open  (was: Patch Available)

> Scalar sub-query: sq_count_check not required if gby keys are constant
> ----------------------------------------------------------------------
>
>                 Key: HIVE-16793
>                 URL: https://issues.apache.org/jira/browse/HIVE-16793
>             Project: Hive
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Vineet Garg
>         Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
     [ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-16793:
-------------------------------
    Attachment: HIVE-16793.6.patch

> Scalar sub-query: sq_count_check not required if gby keys are constant
> ----------------------------------------------------------------------
>
>                 Key: HIVE-16793
>                 URL: https://issues.apache.org/jira/browse/HIVE-16793
>             Project: Hive
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Vineet Garg
>         Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch
[jira] [Updated] (HIVE-17078) Add more logs to MapredLocalTask
     [ https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yibing Shi updated HIVE-17078:
------------------------------
    Attachment: HIVE-17078.3.patch

Added a bit more logging.

> Add more logs to MapredLocalTask
> --------------------------------
>
>                 Key: HIVE-17078
>                 URL: https://issues.apache.org/jira/browse/HIVE-17078
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Yibing Shi
>            Assignee: Yibing Shi
>            Priority: Minor
>         Attachments: HIVE-17078.1.patch, HIVE-17078.2.patch, HIVE-17078.3.patch
>
> By default, {{MapredLocalTask}} is executed in a child process of Hive, in case the local task uses too many resources, which could affect Hive. Currently, the stdout and stderr output of the child process is printed to Hive's stdout/stderr log, which has no timestamp information and is separate from the Hive service logs. This makes it hard to troubleshoot problems in MapredLocalTasks.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16541) PTF: Avoid shuffling constant keys for empty OVER()
     [ https://issues.apache.org/jira/browse/HIVE-16541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084941#comment-16084941 ]

Ashutosh Chauhan commented on HIVE-16541:
-----------------------------------------

In some of the golden files the result set has changed, which looks incorrect.

> PTF: Avoid shuffling constant keys for empty OVER()
> ---------------------------------------------------
>
>                 Key: HIVE-16541
>                 URL: https://issues.apache.org/jira/browse/HIVE-16541
>             Project: Hive
>          Issue Type: Bug
>          Components: PTF-Windowing
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>         Attachments: HIVE-16541.1.patch, HIVE-16541.2.patch
>
> Generating surrogate keys with
> {code}
> select row_number() over() as p_key, * from table;
> {code}
> uses a sorted edge with "0 ASC NULLS FIRST" as the sort order.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables
     [ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084914#comment-16084914 ]

Hive QA commented on HIVE-12631:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12876915/HIVE-12631.17.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 10871 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid_fast] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_reader] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] (batchId=56)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1] (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=239)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout (batchId=226)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5990/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5990/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5990/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12876915 - PreCommit-HIVE-Build

> LLAP: support ORC ACID tables
> -----------------------------
>
>                 Key: HIVE-12631
>                 URL: https://issues.apache.org/jira/browse/HIVE-12631
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Teddy Choi
>         Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch, HIVE-12631.17.patch, HIVE-12631.1.patch, HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch, HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch
>
> LLAP uses a completely separate read path in ORC to allow for caching and parallelization of reads and processing. This path does not support ACID. As far as I remember ACID logic is embedded inside ORC format; we need to refactor it to be on top of some interface, if practical; or just port it to LLAP read path.
> Another consideration is how the logic will work with cache. The cache is currently low-level (CB-level in ORC), so we could just use it to read bases and deltas (deltas should be cached with higher priority) and merge as usual. We could also cache merged representation in future.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
     [ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084918#comment-16084918 ]

Hive QA commented on HIVE-16793:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12876952/HIVE-16793.5.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5991/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5991/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5991/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-07-12 23:46:50.471
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-5991/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-07-12 23:46:50.474
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   353781c..6af30bf  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 353781c HIVE-17079: LLAP: Use FQDN by default for work submission (Prasanth Jayachandran reviewed by Gopal V)
+ git clean -f -d
Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderAdaptor.java
Removing ql/src/test/queries/clientpositive/llap_acid_fast.q
Removing ql/src/test/results/clientpositive/llap/llap_acid.q.out
Removing ql/src/test/results/clientpositive/llap/llap_acid_fast.q.out
Removing ql/src/test/results/clientpositive/llap_acid_fast.q.out
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 6af30bf HIVE-16832 duplicate ROW__ID possible in multi insert into transactional table (Eugene Koifman, reviewed by Gopal V)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-07-12 23:46:56.393
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java: No such file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSubQueryRemoveRule.java: No such file or directory
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java: No such file or directory
error: a/ql/src/test/queries/clientpositive/subquery_scalar.q: No such file or directory
error: a/ql/src/test/results/clientpositive/llap/subquery_scalar.q.out: No such file or directory
error: a/ql/src/test/results/clientpositive/perf/query14.q.out: No such file or directory
error: a/ql/src/test/results/clientpositive/perf/query23.q.out: No such file or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12876952 - PreCommit-HIVE-Build
[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
     [ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-16996:
-----------------------------------
    Attachment: HIVE-16966.04.patch

> Add HLL as an alternative to FM sketch to compute stats
> -------------------------------------------------------
>
>                 Key: HIVE-16996
>                 URL: https://issues.apache.org/jira/browse/HIVE-16996
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: Accuracy and performance comparison between HyperLogLog and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, HIVE-16966.03.patch, HIVE-16966.04.patch

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
     [ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-16996:
-----------------------------------
    Status: Open  (was: Patch Available)

> Add HLL as an alternative to FM sketch to compute stats
> -------------------------------------------------------
>
>                 Key: HIVE-16996
>                 URL: https://issues.apache.org/jira/browse/HIVE-16996
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: Accuracy and performance comparison between HyperLogLog and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, HIVE-16966.03.patch, HIVE-16966.04.patch

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
     [ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-16996:
-----------------------------------
    Attachment: (was: HIVE-16966.04.patch)

> Add HLL as an alternative to FM sketch to compute stats
> -------------------------------------------------------
>
>                 Key: HIVE-16996
>                 URL: https://issues.apache.org/jira/browse/HIVE-16996
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: Accuracy and performance comparison between HyperLogLog and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, HIVE-16966.03.patch

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17021) Support replication of concatenate operation.
     [ https://issues.apache.org/jira/browse/HIVE-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084950#comment-16084950 ]

Daniel Dai commented on HIVE-17021:
-----------------------------------

+1. Will commit shortly.

> Support replication of concatenate operation.
> ---------------------------------------------
>
>                 Key: HIVE-17021
>                 URL: https://issues.apache.org/jira/browse/HIVE-17021
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive, repl
>    Affects Versions: 2.1.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>             Labels: DR, replication
>            Fix For: 3.0.0
>         Attachments: HIVE-17021.01.patch
>
> We need to handle cases like ALTER TABLE ... CONCATENATE that also change the files on disk, and potentially treat them similarly to INSERT OVERWRITE, since the operation does something equivalent to a compaction.
> Note that a ConditionalTask might also be fired at the end of inserts in a Tez task (or other execution engine) if the appropriate HiveConf settings are set, to perform this operation automatically; these also need to be taken care of for replication.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16926) LlapTaskUmbilicalExternalClient should not start new umbilical server for every fragment request
     [ https://issues.apache.org/jira/browse/HIVE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Dere updated HIVE-16926:
------------------------------
    Attachment: HIVE-16926.5.patch

Patch v5, with changes per feedback.

> LlapTaskUmbilicalExternalClient should not start new umbilical server for every fragment request
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16926
>                 URL: https://issues.apache.org/jira/browse/HIVE-16926
>             Project: Hive
>          Issue Type: Sub-task
>          Components: llap
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-16926.1.patch, HIVE-16926.2.patch, HIVE-16926.3.patch, HIVE-16926.4.patch, HIVE-16926.5.patch
>
> Followup task from [~sseth] and [~sershe] after HIVE-16777.
> LlapTaskUmbilicalExternalClient currently creates a new umbilical server for every fragment request, but this is not necessary and the umbilical can be shared.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
     [ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-16996:
-----------------------------------
    Status: Patch Available  (was: Open)

> Add HLL as an alternative to FM sketch to compute stats
> -------------------------------------------------------
>
>                 Key: HIVE-16996
>                 URL: https://issues.apache.org/jira/browse/HIVE-16996
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: Accuracy and performance comparison between HyperLogLog and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, HIVE-16966.03.patch, HIVE-16966.04.patch

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16100) Dynamic Sorted Partition optimizer loses sibling operators
     [ https://issues.apache.org/jira/browse/HIVE-16100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16084986#comment-16084986 ]

Hive QA commented on HIVE-16100:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12876961/HIVE-16100.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10888 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_gby_empty] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_move_tasks_share_dependencies] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reducesink_dedup] (batchId=22)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_lineage2] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_windowing_2] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=232)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5992/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5992/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5992/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12876961 - PreCommit-HIVE-Build

> Dynamic Sorted Partition optimizer loses sibling operators
> ----------------------------------------------------------
>
>                 Key: HIVE-16100
>                 URL: https://issues.apache.org/jira/browse/HIVE-16100
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 1.2.1, 2.1.1, 2.2.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>         Attachments: HIVE-16100.1.patch, HIVE-16100.2.patch, HIVE-16100.2.patch, HIVE-16100.3.patch, HIVE-16100.4.patch, HIVE-16100.5.patch
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java#L173
> {code}
> // unlink connection between FS and its parent
> fsParent = fsOp.getParentOperators().get(0);
> fsParent.getChildOperators().clear();
> {code}
> The optimizer discards any cases where the fsParent has another SEL child

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
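The bug quoted in the issue description above can be reproduced on a toy operator model. This is a minimal Java sketch, not Hive's Operator class; the Op type and method names are illustrative. It shows why clearing the parent's entire child list is lossy when the FileSink's parent has a sibling child, and the narrower unlink that keeps siblings intact.

```java
import java.util.ArrayList;
import java.util.List;

// Toy operator-DAG node (NOT Hive's Operator class; names are illustrative).
class SiblingLossDemo {
    static class Op {
        final String name;
        final List<Op> children = new ArrayList<>();
        Op(String name) { this.name = name; }
    }

    // Mimics the quoted snippet: unlink FS from its parent by clearing
    // the parent's whole child list. Any sibling of FS is dropped too.
    static void unlinkByClear(Op fsParent) {
        fsParent.children.clear();
    }

    // Narrower variant: remove only the FS child; siblings survive.
    static void unlinkOnlyFs(Op fsParent, Op fs) {
        fsParent.children.remove(fs);
    }

    public static void main(String[] args) {
        Op parent = new Op("fsParent");
        Op fs = new Op("FS");
        Op siblingSel = new Op("SEL_sibling");
        parent.children.add(fs);
        parent.children.add(siblingSel);

        unlinkByClear(parent);
        System.out.println("after clear(): " + parent.children.size());      // 0 -- sibling lost

        parent.children.add(fs);
        parent.children.add(siblingSel);
        unlinkOnlyFs(parent, fs);
        System.out.println("after remove(fs): " + parent.children.size());  // 1 -- sibling kept
    }
}
```

This matches the one-line diagnosis in the description: clear() severs every edge out of fsParent, so a sibling SEL branch silently disappears from the plan.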
[jira] [Commented] (HIVE-16541) PTF: Avoid shuffling constant keys for empty OVER()
     [ https://issues.apache.org/jira/browse/HIVE-16541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085004#comment-16085004 ]

Gopal V commented on HIVE-16541:
--------------------------------

Thanks [~ashutoshc]. The window non-streaming codepath seems to be the problem; I'll debug a bit more.

> PTF: Avoid shuffling constant keys for empty OVER()
> ---------------------------------------------------
>
>                 Key: HIVE-16541
>                 URL: https://issues.apache.org/jira/browse/HIVE-16541
>             Project: Hive
>          Issue Type: Bug
>          Components: PTF-Windowing
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>         Attachments: HIVE-16541.1.patch, HIVE-16541.2.patch
>
> Generating surrogate keys with
> {code}
> select row_number() over() as p_key, * from table;
> {code}
> uses a sorted edge with "0 ASC NULLS FIRST" as the sort order.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables
     [ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Teddy Choi updated HIVE-12631:
------------------------------
    Attachment: HIVE-12631.18.patch

> LLAP: support ORC ACID tables
> -----------------------------
>
>                 Key: HIVE-12631
>                 URL: https://issues.apache.org/jira/browse/HIVE-12631
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Teddy Choi
>         Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch, HIVE-12631.17.patch, HIVE-12631.18.patch, HIVE-12631.1.patch, HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch, HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085036#comment-16085036 ] Hive QA commented on HIVE-16793: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12876977/HIVE-16793.6.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10876 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=232) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc (batchId=238) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5993/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5993/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5993/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12876977 - PreCommit-HIVE-Build > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select 
Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60]
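The redundancy HIVE-16793 describes follows from a simple invariant: grouping by a constant key produces at most one group, so the scalar subquery can never return more than one row and the sq_count_check guard adds nothing. A minimal Python sketch of that invariant (not Hive optimizer code):

```python
# Why a constant GROUP BY key makes sq_count_check redundant (illustrative
# sketch only): grouping any input by a constant yields at most one group,
# so max() per group returns at most one scalar row.

def group_max(rows, key_fn, value_fn):
    """GROUP BY key_fn, computing max(value_fn) within each group."""
    groups = {}
    for row in rows:
        k, v = key_fn(row), value_fn(row)
        groups[k] = v if k not in groups else max(groups[k], v)
    return groups

parts = [{"p_type": "1", "p_size": 10}, {"p_type": "1", "p_size": 25}]

# GROUP BY '1' (a constant): every row falls into the same single group.
result = group_max(parts, key_fn=lambda r: "1", value_fn=lambda r: r["p_size"])
print(result)            # {'1': 25}
assert len(result) <= 1  # at most one row, so no runtime count check is needed
```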
[jira] [Commented] (HIVE-16975) Vectorization: Fully vectorize CAST date as TIMESTAMP so VectorUDFAdaptor is now used
[ https://issues.apache.org/jira/browse/HIVE-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085040#comment-16085040 ] Matt McCline commented on HIVE-16975: - I don't see test failures related to this change. > Vectorization: Fully vectorize CAST date as TIMESTAMP so VectorUDFAdaptor is > now used > - > > Key: HIVE-16975 > URL: https://issues.apache.org/jira/browse/HIVE-16975 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Teddy Choi >Priority: Critical > Attachments: HIVE-16975.1.patch > > > Fix VectorUDFAdaptor(CAST(d_date as TIMESTAMP)) to be native. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.
[ https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bing Li updated HIVE-4577: -- Attachment: HIVE-4577.7.patch Add the golden file for dfscmd.q > hive CLI can't handle hadoop dfs command with space and quotes. > > > Key: HIVE-4577 > URL: https://issues.apache.org/jira/browse/HIVE-4577 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 0.9.0, 0.10.0, 0.14.0, 0.13.1, 1.2.0, 1.1.0 >Reporter: Bing Li >Assignee: Bing Li > Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch, > HIVE-4577.3.patch.txt, HIVE-4577.4.patch, HIVE-4577.5.patch, > HIVE-4577.6.patch, HIVE-4577.7.patch > > > As designed, Hive supports hadoop dfs commands in the hive shell, e.g. > hive> dfs -mkdir /user/biadmin/mydir; > but it behaves differently from Hadoop when the path contains spaces or quotes > hive> dfs -mkdir "hello"; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:40 > /user/biadmin/"hello" > hive> dfs -mkdir 'world'; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:43 > /user/biadmin/'world' > hive> dfs -mkdir "bei jing"; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 > /user/biadmin/"bei > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 > /user/biadmin/jing" -- This message was sent by Atlassian JIRA (v6.4.14#64029)
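The directory listings in the HIVE-4577 report suggest the CLI tokenizes the dfs command on whitespace without honoring quotes, so `"bei jing"` becomes two arguments and two directories. A Python sketch contrasting the two behaviors (`shlex` stands in for a quote-aware splitter here; this is not the Hive CLI code):

```python
# Naive whitespace split breaks "bei jing" into two tokens (two mkdir targets),
# while a quote-aware split keeps it as a single argument. shlex is used only
# to demonstrate the desired behavior.
import shlex

cmd = 'dfs -mkdir "bei jing"'

naive = cmd.split()             # ['dfs', '-mkdir', '"bei', 'jing"']
quote_aware = shlex.split(cmd)  # ['dfs', '-mkdir', 'bei jing']

print(naive)
print(quote_aware)
```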
[jira] [Updated] (HIVE-17069) Refactor OrcRawRecrodMerger.ReaderPair
[ https://issues.apache.org/jira/browse/HIVE-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17069: -- Attachment: HIVE-17069.02.patch > Refactor OrcRawRecrodMerger.ReaderPair > -- > > Key: HIVE-17069 > URL: https://issues.apache.org/jira/browse/HIVE-17069 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17069.01.patch, HIVE-17069.02.patch > > > this should be done post HIVE-16177 so as not to obscure the functional > changes completely > Make ReaderPair an interface > ReaderPairImpl - will do what ReaderPair currently does, i.e. handle "normal" > code path > OriginalReaderPair - same as now but w/o incomprehensible override/variable > shadowing logic. > Perhaps split it into 2 - 1 for compaction 1 for "normal" read with common > base class. > Push discoverKeyBounds() into appropriate implementation -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17069) Refactor OrcRawRecrodMerger.ReaderPair
[ https://issues.apache.org/jira/browse/HIVE-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17069: -- Status: Patch Available (was: Open) > Refactor OrcRawRecrodMerger.ReaderPair > -- > > Key: HIVE-17069 > URL: https://issues.apache.org/jira/browse/HIVE-17069 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17069.01.patch, HIVE-17069.02.patch > > > this should be done post HIVE-16177 so as not to obscure the functional > changes completely > Make ReaderPair an interface > ReaderPairImpl - will do what ReaderPair currently does, i.e. handle "normal" > code path > OriginalReaderPair - same as now but w/o incomprehensible override/variable > shadowing logic. > Perhaps split it into 2 - 1 for compaction 1 for "normal" read with common > base class. > Push discoverKeyBounds() into appropriate implementation -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
[ https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085069#comment-16085069 ] Bing Li commented on HIVE-16922: [~lirui], I can't reproduce TestMiniLlapLocalCliDriver[vector_if_expr] in my env; I don't think it is caused by this patch. > Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim" > --- > > Key: HIVE-16922 > URL: https://issues.apache.org/jira/browse/HIVE-16922 > Project: Hive > Issue Type: Bug > Components: Thrift API >Reporter: Dudu Markovitz >Assignee: Bing Li > Attachments: HIVE-16922.1.patch, HIVE-16922.2.patch > > > https://github.com/apache/hive/blob/master/serde/if/serde.thrift > Typo in serde.thrift: > COLLECTION_DELIM = "colelction.delim" > (*colelction* instead of *collection*) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16960) Hive throws an ugly error exception when HDFS sticky bit is set
[ https://issues.apache.org/jira/browse/HIVE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Janaki Lahorani updated HIVE-16960: --- Attachment: HIVE16960.3.patch > Hive throws an ugly error exception when HDFS sticky bit is set > --- > > Key: HIVE-16960 > URL: https://issues.apache.org/jira/browse/HIVE-16960 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani >Priority: Critical > Labels: newbie > Fix For: 3.0.0 > > Attachments: HIVE16960.1.patch, HIVE16960.2.patch, HIVE16960.2.patch, > HIVE16960.3.patch > > > When calling LOAD DATA INPATH ... OVERWRITE INTO TABLE ... from a Hive user > other than the HDFS file owner, and the HDFS sticky bit is set, then Hive > will throw an error exception message that the file cannot be moved due to > permission issues. > Caused by: org.apache.hadoop.security.AccessControlException: Permission > denied by sticky bit setting: user=hive, > inode=sasdata-2016-04-20-17-13-43-630-e-1.dlv.bk > The permission denied is expected, but the error message does not make sense > to users + the stack trace displayed is huge. We should display a better > error message to users, and maybe provide with help information about how to > fix it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
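The friendlier message HIVE-16960 asks for could take roughly this shape. This is a hedged Python sketch in which the detection string and the suggested wording are hypothetical, not Hive's actual error-handling code:

```python
# Illustrative sketch of the requested improvement: detect the HDFS sticky-bit
# AccessControlException text and surface a short, actionable message instead
# of a raw stack trace. Message wording and helper name are hypothetical.

RAW_MESSAGE = ("Permission denied by sticky bit setting: user=hive, "
               "inode=sasdata-2016-04-20-17-13-43-630-e-1.dlv.bk")

def friendly_error(raw):
    """Map a sticky-bit permission error onto a short user-facing message."""
    if "sticky bit" in raw:
        return ("LOAD DATA ... OVERWRITE failed: the HDFS sticky bit on the "
                "target directory prevents the querying user from moving a "
                "file owned by another user. Run the load as the file owner, "
                "or have an administrator clear the sticky bit on the "
                "directory. Underlying error: " + raw)
    return raw  # unrelated errors pass through unchanged

print(friendly_error(RAW_MESSAGE))
```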
[jira] [Commented] (HIVE-17078) Add more logs to MapredLocalTask
[ https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085078#comment-16085078 ] Hive QA commented on HIVE-17078: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12876980/HIVE-17078.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10888 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=237) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=69) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_without_localtask] (batchId=1) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_8] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_convert_join] (batchId=51) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=232) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5994/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5994/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5994/ Messages: {noformat} Executing 
org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12876980 - PreCommit-HIVE-Build > Add more logs to MapredLocalTask > > > Key: HIVE-17078 > URL: https://issues.apache.org/jira/browse/HIVE-17078 > Project: Hive > Issue Type: Improvement >Reporter: Yibing Shi >Assignee: Yibing Shi >Priority: Minor > Attachments: HIVE-17078.1.patch, HIVE-17078.2.patch, > HIVE-17078.3.patch > > > By default, {{MapredLocalTask}} is executed in a child process of Hive, so > that a local task that uses too many resources does not affect Hive itself. Currently, > the stdout and stderr output of the child process is printed in Hive's > stdout/stderr log, which doesn't carry timestamp information and is > separated from the Hive service logs. This makes it hard to troubleshoot problems > in MapredLocalTasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
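The improvement HIVE-17078 motivates — getting a child process's output into a timestamped log — can be sketched as follows. This is an illustrative Python sketch, not the MapredLocalTask patch itself:

```python
# Capture a child process's stdout/stderr and re-emit each line with a
# timestamp and stream label, so it can land in a service log instead of
# bare, unstamped stdout. Names here are illustrative only.
import subprocess
from datetime import datetime

def run_logged(argv):
    """Run argv, returning its output lines prefixed with timestamps."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    logged = []
    for stream, text in (("stdout", proc.stdout), ("stderr", proc.stderr)):
        for line in text.splitlines():
            stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            logged.append(f"{stamp} [child {stream}] {line}")
    return logged

for entry in run_logged(["echo", "hello from child"]):
    print(entry)
```

In the real task the lines would go through the service logger rather than a returned list, but the timestamping and stream labeling are the point.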
[jira] [Commented] (HIVE-16922) Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim"
[ https://issues.apache.org/jira/browse/HIVE-16922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085090#comment-16085090 ] Rui Li commented on HIVE-16922: --- Hi [~libing], I guess we also need metastore upgrade scripts to update the values that are already stored in the DB, so that users can continue using the data when they upgrade to the new Hive version. > Typo in serde.thrift: COLLECTION_DELIM = "colelction.delim" > --- > > Key: HIVE-16922 > URL: https://issues.apache.org/jira/browse/HIVE-16922 > Project: Hive > Issue Type: Bug > Components: Thrift API >Reporter: Dudu Markovitz >Assignee: Bing Li > Attachments: HIVE-16922.1.patch, HIVE-16922.2.patch > > > https://github.com/apache/hive/blob/master/serde/if/serde.thrift > Typo in serde.thrift: > COLLECTION_DELIM = "colelction.delim" > (*colelction* instead of *collection*) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16926) LlapTaskUmbilicalExternalClient should not start new umbilical server for every fragment request
[ https://issues.apache.org/jira/browse/HIVE-16926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085144#comment-16085144 ] Hive QA commented on HIVE-16926: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12876981/HIVE-16926.5.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10889 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=237) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5995/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5995/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5995/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12876981 - PreCommit-HIVE-Build > LlapTaskUmbilicalExternalClient should not start new umbilical server for > every fragment request > > > Key: HIVE-16926 > URL: https://issues.apache.org/jira/browse/HIVE-16926 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-16926.1.patch, HIVE-16926.2.patch, > HIVE-16926.3.patch, HIVE-16926.4.patch, HIVE-16926.5.patch > > > Followup task from [~sseth] and [~sershe] after HIVE-16777. > LlapTaskUmbilicalExternalClient currently creates a new umbilical server for > every fragment request, but this is not necessary and the umbilical can be > shared. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
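The HIVE-16926 fix amounts to sharing one umbilical server across fragment requests instead of starting one per request. A minimal Python sketch of that sharing pattern (all names hypothetical; this is not the actual LlapTaskUmbilicalExternalClient code):

```python
# Lazily start one shared server and register each fragment with it, rather
# than constructing a new server per fragment request.
import itertools

class UmbilicalServer:
    _ids = itertools.count()

    def __init__(self):
        self.server_id = next(UmbilicalServer._ids)
        self.fragments = []

class Client:
    _shared = None  # one server shared by every fragment request

    @classmethod
    def shared_server(cls):
        if cls._shared is None:          # start the server once, on demand
            cls._shared = UmbilicalServer()
        return cls._shared

    def submit_fragment(self, fragment):
        server = self.shared_server()    # reuse; never start a new server
        server.fragments.append(fragment)
        return server

c = Client()
s1 = c.submit_fragment("frag-1")
s2 = c.submit_fragment("frag-2")
print(s1 is s2, s1.fragments)  # True ['frag-1', 'frag-2']
```

The real client would also need to unregister fragments and shut the shared server down when the last one completes; that lifecycle is omitted here.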
[jira] [Commented] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
[ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085146#comment-16085146 ] Hive QA commented on HIVE-16996: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12876982/HIVE-16966.04.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5996/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5996/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5996/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-07-13 03:48:51.176 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-5996/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-07-13 03:48:51.178 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 31a7987 HIVE-16975: Vectorization: Fully vectorize CAST date as TIMESTAMP so VectorUDFAdaptor is now used (Teddy Choi, reviewed by Matt McCline) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 31a7987 HIVE-16975: Vectorization: Fully vectorize CAST date as TIMESTAMP so VectorUDFAdaptor is now used (Teddy Choi, reviewed by Matt McCline) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-07-13 03:48:57.351 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch fatal: git apply: bad git-diff - inconsistent old filename on line 33 The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12876982 - PreCommit-HIVE-Build > Add HLL as an alternative to FM sketch to compute stats > --- > > Key: HIVE-16996 > URL: https://issues.apache.org/jira/browse/HIVE-16996 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: Accuracy and performance comparison between HyperLogLog > and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, > HIVE-16966.03.patch, HIVE-16966.04.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
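For context on the HLL alternative HIVE-16996 proposes, a minimal HyperLogLog-style estimator can be sketched in a few lines of Python. This is illustrative only — Hive's implementation differs, and the small-range (linear counting) correction is omitted for brevity:

```python
# HyperLogLog-flavored NDV estimation: hash each value, use the low bits to
# pick a register, and track the maximum leading-zero rank seen per register.
import hashlib

M = 256  # registers (2^8); standard error is roughly 1.04/sqrt(M), ~6.5%

def _hash64(value):
    """Deterministic 64-bit hash (sha1-based, for illustration)."""
    return int.from_bytes(hashlib.sha1(str(value).encode()).digest()[:8], "big")

def hll_estimate(values):
    registers = [0] * M
    for v in values:
        h = _hash64(v)
        idx = h & (M - 1)                  # low 8 bits select the register
        rest = h >> 8                      # remaining 56 bits
        rank = 56 - rest.bit_length() + 1  # position of first 1-bit from left
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / M)       # standard bias correction for m >= 128
    return alpha * M * M / sum(2.0 ** -r for r in registers)

est = hll_estimate(range(10000))
print(round(est))  # should land near 10000
```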
[jira] [Updated] (HIVE-15898) add Type2 SCD merge tests
[ https://issues.apache.org/jira/browse/HIVE-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-15898: -- Attachment: HIVE-15898.08.patch > add Type2 SCD merge tests > - > > Key: HIVE-15898 > URL: https://issues.apache.org/jira/browse/HIVE-15898 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-15898.01.patch, HIVE-15898.02.patch, > HIVE-15898.03.patch, HIVE-15898.04.patch, HIVE-15898.05.patch, > HIVE-15898.06.patch, HIVE-15898.07.patch, HIVE-15898.08.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-12631) LLAP: support ORC ACID tables
[ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085173#comment-16085173 ] Hive QA commented on HIVE-12631: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12876993/HIVE-12631.18.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10892 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid_fast] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] (batchId=10) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters1] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=151) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5997/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5997/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5997/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing 
org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12876993 - PreCommit-HIVE-Build > LLAP: support ORC ACID tables > - > > Key: HIVE-12631 > URL: https://issues.apache.org/jira/browse/HIVE-12631 > Project: Hive > Issue Type: Bug > Components: llap, Transactions >Reporter: Sergey Shelukhin >Assignee: Teddy Choi > Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, > HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, > HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch, > HIVE-12631.17.patch, HIVE-12631.18.patch, HIVE-12631.1.patch, > HIVE-12631.2.patch, HIVE-12631.3.patch, HIVE-12631.4.patch, > HIVE-12631.5.patch, HIVE-12631.6.patch, HIVE-12631.7.patch, > HIVE-12631.8.patch, HIVE-12631.8.patch, HIVE-12631.9.patch > > > LLAP uses a completely separate read path in ORC to allow for caching and > parallelization of reads and processing. This path does not support ACID. As > far as I remember ACID logic is embedded inside ORC format; we need to > refactor it to be on top of some interface, if practical; or just port it to > LLAP read path. > Another consideration is how the logic will work with cache. The cache is > currently low-level (CB-level in ORC), so we could just use it to read bases > and deltas (deltas should be cached with higher priority) and merge as usual. > We could also cache merged representation in future. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.
[ https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085176#comment-16085176 ] Hive QA commented on HIVE-4577: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12877002/HIVE-4577.7.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5998/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5998/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5998/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-07-13 04:50:25.104 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-5998/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-07-13 04:50:25.107 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 31a7987 HIVE-16975: Vectorization: Fully vectorize CAST date as TIMESTAMP so VectorUDFAdaptor is now used (Teddy Choi, reviewed by Matt McCline) + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java.orig Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderAdaptor.java Removing ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java.orig Removing ql/src/test/queries/clientpositive/llap_acid_fast.q Removing ql/src/test/results/clientpositive/llap/llap_acid.q.out Removing ql/src/test/results/clientpositive/llap/llap_acid_fast.q.out Removing ql/src/test/results/clientpositive/llap_acid_fast.q.out + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 31a7987 HIVE-16975: Vectorization: Fully vectorize CAST date as TIMESTAMP so VectorUDFAdaptor is now used (Teddy Choi, reviewed by Matt McCline) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-07-13 04:50:30.983 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: a/ql/src/java/org/apache/hadoop/hive/ql/processors/DfsProcessor.java: No such file or directory error: a/ql/src/test/results/clientpositive/perf/query14.q.out: No such file or directory The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12877002 - PreCommit-HIVE-Build > hive CLI can't handle hadoop dfs command with space and quotes. > > > Key: HIVE-4577 > URL: https://issues.apache.org/jira/browse/HIVE-4577 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 0.9.0, 0.10.0, 0.14.0, 0.13.1, 1.2.0, 1.1.0 >Reporter: Bing Li >Assignee: Bing Li > Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch, > HIVE-4577.3.patch.txt, HIVE-4577.4.patch, HIVE-4577.5.patch, > HIVE-4577.6.patch, HIVE-4577.7.patch > > > As designed, Hive supports hadoop dfs commands in the hive shell, e.g. > hive> dfs -mkdir /user/biadmin/mydir; > but it behaves differently from Hadoop when the path contains spaces or quotes > hive> dfs -mkdir "hello"; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:40 > /user/biadmin/"hello" > hive> dfs -mkdir 'world'; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:43 > /user/biadmin/'world' > hive> dfs -mkdir "bei jing"; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 > /user/biadmin/"bei > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 > /user/biadmin/jing
[jira] [Commented] (HIVE-17069) Refactor OrcRawRecrodMerger.ReaderPair
[ https://issues.apache.org/jira/browse/HIVE-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085178#comment-16085178 ] Hive QA commented on HIVE-17069: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12877003/HIVE-17069.02.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5999/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5999/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5999/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-07-13 04:50:59.904 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-5999/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-07-13 04:50:59.906 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 31a7987 HIVE-16975: Vectorization: Fully vectorize CAST date as TIMESTAMP so VectorUDFAdaptor is now used (Teddy Choi, reviewed by Matt McCline) + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 31a7987 HIVE-16975: Vectorization: Fully vectorize CAST date as TIMESTAMP so VectorUDFAdaptor is now used (Teddy Choi, reviewed by Matt McCline) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-07-13 04:51:00.740 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java:290 error: ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java: patch does not apply error: patch failed: ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java:35 error: ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java: patch does not apply The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12877003 - PreCommit-HIVE-Build > Refactor OrcRawRecrodMerger.ReaderPair > -- > > Key: HIVE-17069 > URL: https://issues.apache.org/jira/browse/HIVE-17069 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17069.01.patch, HIVE-17069.02.patch > > > This should be done after HIVE-16177 so as not to completely obscure the functional > changes > Make ReaderPair an interface > ReaderPairImpl - will do what ReaderPair currently does, i.e. handle the "normal" > code path > OriginalReaderPair - same as now but without the incomprehensible override/variable > shadowing logic. > Perhaps split it into two: one for compaction and one for "normal" reads, with a common > base class. > Push discoverKeyBounds() into the appropriate implementation -- This message was sent by Atlassian JIRA (v6.4.14#64029)
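The proposed split can be outlined roughly as follows. This is a hypothetical Python sketch of the interface shape described in the ticket (class and method names follow the proposal, not Hive's actual Java code):

```python
class ReaderPair:
    """Hypothetical shape of the proposed interface (names follow the
    proposal above, not Hive's actual Java code)."""
    def discover_key_bounds(self):
        raise NotImplementedError

class ReaderPairImpl(ReaderPair):
    """Would cover the 'normal' code path, as ReaderPair does today."""
    def discover_key_bounds(self):
        return ("min_key", "max_key")  # placeholder bounds

class OriginalReaderPair(ReaderPair):
    """Would cover 'original' files, without the current override and
    variable-shadowing logic."""
    def discover_key_bounds(self):
        return (None, None)  # original files carry no stored key bounds

bounds = [p.discover_key_bounds()
          for p in (ReaderPairImpl(), OriginalReaderPair())]
```

Pushing discoverKeyBounds() into each implementation, as the ticket suggests, is what removes the need for the shadowed fields.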
[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
[ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16996: --- Attachment: HIVE-16966.05.patch rebase to master > Add HLL as an alternative to FM sketch to compute stats > --- > > Key: HIVE-16996 > URL: https://issues.apache.org/jira/browse/HIVE-16996 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: Accuracy and performance comparison between HyperLogLog > and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, > HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch > >
[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
[ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16996: --- Status: Patch Available (was: Open) > Add HLL as an alternative to FM sketch to compute stats > --- > > Key: HIVE-16996 > URL: https://issues.apache.org/jira/browse/HIVE-16996 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: Accuracy and performance comparison between HyperLogLog > and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, > HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch > >
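For context on why HLL is attractive here: a HyperLogLog sketch estimates the number of distinct values with a small, fixed-size array of registers rather than tracking per-value hash information. The toy Python sketch below illustrates the idea only; it is a minimal textbook version, not Hive's implementation.

```python
import hashlib

def hll_estimate(values, p=12):
    """Toy HyperLogLog: estimate the distinct count of `values` using
    2**p one-byte registers instead of materializing the values."""
    m = 1 << p
    registers = [0] * m
    for v in values:
        # 64-bit hash; low p bits pick a register, the rest feed the rank
        h = int.from_bytes(hashlib.md5(str(v).encode()).digest()[:8], 'big')
        idx = h & (m - 1)
        w = h >> p
        rank = (64 - p) - w.bit_length() + 1  # position of leftmost 1-bit
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)  # bias correction for m >= 128
    return int(alpha * m * m / sum(2.0 ** -r for r in registers))

est = hll_estimate(range(100000))
# est lands within a few percent of the true count of 100000
```

With p=12 the sketch uses 4096 registers and has a relative standard error of roughly 1.04/sqrt(4096), about 1.6%, which is the accuracy/space trade-off the attached comparison document measures against the FM sketch.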
[jira] [Commented] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.
[ https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085203#comment-16085203 ] Vaibhav Gumashta commented on HIVE-4577: [~libing] Looks like patch v7 didn't apply on master > hive CLI can't handle hadoop dfs command with space and quotes. > > > Key: HIVE-4577 > URL: https://issues.apache.org/jira/browse/HIVE-4577 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 0.9.0, 0.10.0, 0.14.0, 0.13.1, 1.2.0, 1.1.0 >Reporter: Bing Li >Assignee: Bing Li > Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch, > HIVE-4577.3.patch.txt, HIVE-4577.4.patch, HIVE-4577.5.patch, > HIVE-4577.6.patch, HIVE-4577.7.patch > > > As designed, Hive supports hadoop dfs commands in the hive shell, e.g. > hive> dfs -mkdir /user/biadmin/mydir; > but it behaves differently from the hadoop shell if the path contains spaces or quotes: > hive> dfs -mkdir "hello"; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:40 > /user/biadmin/"hello" > hive> dfs -mkdir 'world'; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:43 > /user/biadmin/'world' > hive> dfs -mkdir "bei jing"; > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 > /user/biadmin/"bei > drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 > /user/biadmin/jing"
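The root cause described in the ticket is quote handling during tokenization: Hive passes the quote characters through to HDFS instead of interpreting them the way a shell would. A minimal Python illustration of the difference, using `shlex` as a stand-in for shell-style tokenization (this is not Hive's parsing code):

```python
import shlex

# Quote-blind splitting, roughly what the buggy path does: the quote
# characters survive and a quoted path with a space breaks in two.
naive = '-mkdir "bei jing"'.split()

# Shell-style tokenization strips the quotes and keeps "bei jing" whole.
fixed = shlex.split('-mkdir "bei jing"')

print(naive)  # ['-mkdir', '"bei', 'jing"']
print(fixed)  # ['-mkdir', 'bei jing']
```

The second form matches what `hadoop fs -mkdir "bei jing"` does from a real shell: one directory named `bei jing`, with no literal quote characters in the path.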
[jira] [Updated] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
[ https://issues.apache.org/jira/browse/HIVE-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16996: --- Status: Open (was: Patch Available) > Add HLL as an alternative to FM sketch to compute stats > --- > > Key: HIVE-16996 > URL: https://issues.apache.org/jira/browse/HIVE-16996 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: Accuracy and performance comparison between HyperLogLog > and FM Sketch.docx, HIVE-16966.01.patch, HIVE-16966.02.patch, > HIVE-16966.03.patch, HIVE-16966.04.patch, HIVE-16966.05.patch > >
[jira] [Commented] (HIVE-16960) Hive throws an ugly error exception when HDFS sticky bit is set
[ https://issues.apache.org/jira/browse/HIVE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085218#comment-16085218 ] Hive QA commented on HIVE-16960: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12877008/HIVE16960.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10890 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6000/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6000/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6000/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12877008 - PreCommit-HIVE-Build > Hive throws an ugly error exception when HDFS sticky bit is set > --- > > Key: HIVE-16960 > URL: https://issues.apache.org/jira/browse/HIVE-16960 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Janaki Lahorani >Assignee: Janaki Lahorani >Priority: Critical > Labels: newbie > Fix For: 3.0.0 > > Attachments: HIVE16960.1.patch, HIVE16960.2.patch, HIVE16960.2.patch, > HIVE16960.3.patch > > > When calling LOAD DATA INPATH ... OVERWRITE INTO TABLE ... from a Hive user > other than the HDFS file owner, and the HDFS sticky bit is set, Hive > throws an error message saying the file cannot be moved due to > permission issues. > Caused by: org.apache.hadoop.security.AccessControlException: Permission > denied by sticky bit setting: user=hive, > inode=sasdata-2016-04-20-17-13-43-630-e-1.dlv.bk > The permission denial is expected, but the error message does not make sense > to users, and the stack trace displayed is huge. We should display a better > error message to users, and maybe provide help information about how to > fix it.
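The shape of the fix is to catch the raw AccessControlException and surface a short, actionable message instead of the full stack trace. A hypothetical Python sketch of that translation step (the helper name and wording are illustrative, not Hive's actual code):

```python
def friendly_sticky_bit_message(raw):
    """Map a raw HDFS AccessControlException message to a concise,
    actionable hint. Hypothetical helper, not Hive's actual code."""
    if "sticky bit" in raw:
        return ("Load failed: the HDFS sticky bit on the source directory "
                "prevents the 'hive' user from moving a file it does not own. "
                "Move the file as its owner, or clear the sticky bit on the "
                "source directory.")
    return raw

raw = ("org.apache.hadoop.security.AccessControlException: Permission denied "
       "by sticky bit setting: user=hive, inode=sasdata-...")
print(friendly_sticky_bit_message(raw))
```

Messages that do not match the sticky-bit pattern pass through unchanged, so other permission failures keep their original diagnostics.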
[jira] [Commented] (HIVE-17078) Add more logs to MapredLocalTask
[ https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085252#comment-16085252 ] Yibing Shi commented on HIVE-17078: --- Checked the failed tests. # org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] fails for reasons unrelated to this patch # org.apache.hive.hcatalog.api.TestHCatClient. The failure also has nothing to do with our patch. # org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14/23] fails because the output ordering has changed. Nothing serious. We need to somehow update the .out files, but maybe in a separate JIRA # The other tests fail because we now have more logs in local tasks. Will update the .out files. > Add more logs to MapredLocalTask > > > Key: HIVE-17078 > URL: https://issues.apache.org/jira/browse/HIVE-17078 > Project: Hive > Issue Type: Improvement >Reporter: Yibing Shi >Assignee: Yibing Shi >Priority: Minor > Attachments: HIVE-17078.1.patch, HIVE-17078.2.patch, > HIVE-17078.3.patch > > > By default, {{MapredLocalTask}} is executed in a child process of Hive, in > case the local task uses too many resources, which could affect Hive. Currently, > the stdout and stderr output of the child process is printed in Hive's > stdout/stderr log, which doesn't have timestamp information and is > separated from the Hive service logs. This makes it hard to troubleshoot problems > in MapredLocalTasks.
[jira] [Updated] (HIVE-17078) Add more logs to MapredLocalTask
[ https://issues.apache.org/jira/browse/HIVE-17078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yibing Shi updated HIVE-17078: -- Attachment: HIVE-17078.4.PATCH > Add more logs to MapredLocalTask > > > Key: HIVE-17078 > URL: https://issues.apache.org/jira/browse/HIVE-17078 > Project: Hive > Issue Type: Improvement >Reporter: Yibing Shi >Assignee: Yibing Shi >Priority: Minor > Attachments: HIVE-17078.1.patch, HIVE-17078.2.patch, > HIVE-17078.3.patch, HIVE-17078.4.PATCH > > > By default, {{MapredLocalTask}} is executed in a child process of Hive, in > case the local task uses too many resources, which could affect Hive. Currently, > the stdout and stderr output of the child process is printed in Hive's > stdout/stderr log, which doesn't have timestamp information and is > separated from the Hive service logs. This makes it hard to troubleshoot problems > in MapredLocalTasks.
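The general technique behind the improvement is to read the child process's output line by line and prefix each line with a timestamp before forwarding it, so child output can be correlated with the service log. A minimal Python sketch of the idea (not Hive's actual Java implementation):

```python
import subprocess
import sys
from datetime import datetime

def run_with_timestamps(cmd):
    """Run a child process, prefixing every line of its combined
    stdout/stderr with a timestamp before forwarding it."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    stamped = []
    for line in proc.stdout:
        entry = f"{datetime.now():%Y-%m-%d %H:%M:%S,%f} [child] {line.rstrip()}"
        stamped.append(entry)
        print(entry)  # in Hive this would go to the service log instead
    proc.wait()
    return stamped

out = run_with_timestamps([sys.executable, "-c", "print('local task started')"])
```

Routing the stamped lines through the parent's logger, rather than bare stdout, is what puts them alongside the rest of the service log for troubleshooting.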
[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.4.patch Update patch v4: 1. Moved the registrator code to a resource file. Hopefully the patch is more readable. 2. To be safe, we still have to store the hash code. But that's still better than the generic serializer. > Hive on Spark generate more shuffle data than hive on mr > > > Key: HIVE-15104 > URL: https://issues.apache.org/jira/browse/HIVE-15104 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.2.1 >Reporter: wangwenli >Assignee: Rui Li > Attachments: HIVE-15104.1.patch, HIVE-15104.2.patch, > HIVE-15104.3.patch, HIVE-15104.4.patch, TPC-H 100G.xlsx > > > The same SQL, run on the Spark and MR engines, generates different sizes > of shuffle data. > I think this is because Hive on MR serializes only part of the HiveKey, while > Hive on Spark, which uses Kryo, serializes the full HiveKey object. > What is your opinion?
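The size difference being discussed comes from a generic serializer writing the whole object graph (class metadata and every field) versus a custom serializer writing only the key bytes plus the cached hash code. A toy Python illustration, with `pickle` standing in for Kryo's generic path (the `HiveKey` class here is a simplified stand-in, not Hive's actual class):

```python
import pickle
import struct

class HiveKey:
    """Simplified stand-in for Hive's shuffle key: raw key bytes plus a
    cached hash code and a couple of bookkeeping fields."""
    def __init__(self, data):
        self.data = data
        self.hash_code = hash(data) & 0x7FFFFFFF  # cached hash, kept on purpose
        self.dist_key_length = len(data)
        self.valid = True

key = HiveKey(b"customer_0042")

# Generic path (pickle standing in for Kryo's field-by-field serializer):
# writes class metadata and every field.
generic = pickle.dumps(key)

# Custom path: only the cached hash code (4 bytes) plus the key bytes.
compact = struct.pack(">i", key.hash_code) + key.data

print(len(generic), len(compact))  # generic is several times larger
```

Keeping the 4-byte hash code in the compact form matches the "to be safe, we still have to store the hash code" trade-off above: a small fixed overhead instead of recomputing the hash on the reduce side.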
[jira] [Commented] (HIVE-15898) add Type2 SCD merge tests
[ https://issues.apache.org/jira/browse/HIVE-15898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085275#comment-16085275 ] Hive QA commented on HIVE-15898: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12877014/HIVE-15898.08.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10890 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_type2_scd] (batchId=144) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout (batchId=226) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6001/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6001/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6001/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12877014 - PreCommit-HIVE-Build > add Type2 SCD merge tests > - > > Key: HIVE-15898 > URL: https://issues.apache.org/jira/browse/HIVE-15898 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-15898.01.patch, HIVE-15898.02.patch, > HIVE-15898.03.patch, HIVE-15898.04.patch, HIVE-15898.05.patch, > HIVE-15898.06.patch, HIVE-15898.07.patch, HIVE-15898.08.patch > >
[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085276#comment-16085276 ] Rui Li commented on HIVE-15104: --- I also ran another round of TPC-DS. The overall shuffle data is reduced by 12%. The query time improvement is however negligible - about 1.5%. [~xuefuz] do you think it's worth the effort? > Hive on Spark generate more shuffle data than hive on mr > > > Key: HIVE-15104 > URL: https://issues.apache.org/jira/browse/HIVE-15104 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.2.1 >Reporter: wangwenli >Assignee: Rui Li > Attachments: HIVE-15104.1.patch, HIVE-15104.2.patch, > HIVE-15104.3.patch, HIVE-15104.4.patch, TPC-H 100G.xlsx > > > The same SQL, run on the Spark and MR engines, generates different sizes > of shuffle data. > I think this is because Hive on MR serializes only part of the HiveKey, while > Hive on Spark, which uses Kryo, serializes the full HiveKey object. > What is your opinion?