[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359812#comment-16359812 ]

Rui Li commented on HIVE-18350:
---

Thanks [~djaiswal], it works now.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch,
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch,
> HIVE-18350.14.patch, HIVE-18350.15.patch, HIVE-18350.16.patch,
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch,
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch,
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc.
> However, the load data uses the input file name. That results in inconsistent
> naming convention which makes SMB joins difficult in some scenarios and may
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty
> significant due to which it is further divided into two subtasks for smoother
> merge.
> For existing tables in customer database, it is recommended to reload
> bucketed tables otherwise if customer tries to run SMB join and there is a
> bucket for which there is no split, then there is a possibility of getting
> incorrect results. However, this is not a regression as it would happen even
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in
> behavior and reloading data is not needed.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
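
For readers following the thread, the naming rules in the quoted issue description can be sketched roughly as below. This is only an illustration, not code from any HIVE-18350 patch: the helper names, the six-digit padding, and the rule used to read a bucket number out of a file name are all assumptions made for the example.

```python
import re
from typing import Dict, List

# Assumption: insert-style names are a zero-padded index followed by "_0",
# for example "000001_0". The pad width here is illustrative only.
PAD_WIDTH = 6


def insert_style_name(index: int, copy: int = 0) -> str:
    """Return an insert-style name such as '000001_0' for a file index."""
    return f"{index:0{PAD_WIDTH}d}_{copy}"


def plan_non_bucketed_renames(input_names: List[str]) -> Dict[str, str]:
    """Non-bucketed table: every file is renamed, whatever the user called it."""
    ordered = sorted(input_names)
    return {name: insert_style_name(i) for i, name in enumerate(ordered)}


def bucket_id_from_name(name: str) -> int:
    """Bucketed table: the user's file name must already identify the bucket.

    Hypothetical rule for this sketch: the name is a run of digits,
    optionally followed by '_<copy>', e.g. '000002_0' or '2'.
    """
    match = re.fullmatch(r"(\d+)(?:_\d+)?", name)
    if match is None:
        raise ValueError(f"cannot infer bucket from file name: {name!r}")
    return int(match.group(1))
```

With these hypothetical helpers, a non-bucketed load of files named by the user would get sequential insert-style names, while a bucketed load in non-strict mode would be rejected in the sketch unless each file name maps to a bucket number.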
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358001#comment-16358001 ]

Deepak Jaiswal commented on HIVE-18350:
---

Hi, I reverted my patch. It should not happen anymore.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356776#comment-16356776 ]

Rui Li commented on HIVE-18350:
---

Hi [~djaiswal], with this change, I hit [this error|https://issues.apache.org/jira/browse/HIVE-18647?focusedCommentId=16356761&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16356761] when creating table. Could you please take a look? Thanks.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356626#comment-16356626 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909645/HIVE-18350.16.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 12970 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=250)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9087/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9087/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9087/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909645 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356610#comment-16356610 ]

Hive QA commented on HIVE-18350:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s{color} | {color:red} standalone-metastore: The patch generated 2 new + 391 unchanged - 1 fixed = 393 total (was 392) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 43s{color} | {color:red} ql: The patch generated 9 new + 204 unchanged - 4 fixed = 213 total (was 208) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 55s{color} | {color:red} root: The patch generated 11 new + 604 unchanged - 5 fixed = 615 total (was 609) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s{color} | {color:red} The patch has 8 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 5s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 16b8575 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus/diff-checkstyle-standalone-metastore.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus/whitespace-eol.txt |
| modules | C: standalone-metastore ql . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356186#comment-16356186 ]

Sergey Shelukhin commented on HIVE-18350:
-

+1 from my side, pending others' feedback, and tests
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355198#comment-16355198 ]

Deepak Jaiswal commented on HIVE-18350:
---

Yet another attempt to rebase the patch.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355173#comment-16355173 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909489/HIVE-18350.15.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9065/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9065/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9065/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-02-07 08:58:57.108
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9065/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-02-07 08:58:57.111
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   2422e18..acc62e3 master -> origin/master
+ git reset --hard HEAD
HEAD is now at 2422e18 HIVE-18467: support whole warehouse dump / load + create/drop database events (Anishek Agarwal, reviewed by Sankar Hariappan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at acc62e3 HIVE-18628: Make tez dag status check interval configurable (Prasanth Jayachandran reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-02-07 08:59:01.197
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9065/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java: does not exist in index
error: a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java: does not exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q: does not exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q: does not exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q: does not exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q: does not exist in index
error: a/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out: does not exist in index
error: a/ql/src/test/results/clien
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354063#comment-16354063 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909384/HIVE-18350.14.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9050/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9050/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9050/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-02-06 15:58:23.562
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9050/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-02-06 15:58:23.565
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 443b10b HIVE-18612: Build subprocesses under Yetus in Ptest use 1.7 jre instead of 1.8 (Adam Szita via Peter Vary)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 443b10b HIVE-18612: Build subprocesses under Yetus in Ptest use 1.7 jre instead of 1.8 (Adam Szita via Peter Vary)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-02-06 15:58:26.425
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9050/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java: does not exist in index
error: a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java: does not exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q: does not exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q: does not exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q: does not exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q: does not exist in index
error: a/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/llap/auto_sortmerge_join_2.q.out: does not exist in index
error: a/ql/src/test/r
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353199#comment-16353199 ]

Sergey Shelukhin commented on HIVE-18350:
-

Left some comments on RB
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351521#comment-16351521 ] Hive QA commented on HIVE-18350: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12909107/HIVE-18350.13.patch {color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 12970 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122) 
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221) org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280) org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280) org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282) org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256) org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188) org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234) org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234) org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9002/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9002/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9002/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 24 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12909107 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351508#comment-16351508 ] Hive QA commented on HIVE-18350: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 47s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 28s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 0s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 37s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 38s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 4s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle 
{color} | {color:red} 0m 40s{color} | {color:red} ql: The patch generated 6 new + 204 unchanged - 2 fixed = 210 total (was 206) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 42s{color} | {color:red} root: The patch generated 6 new + 605 unchanged - 2 fixed = 611 total (was 607) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 34s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 45m 57s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh | | git revision | master / 4a33ec8 | | Default Java | 1.8.0_111 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9002/yetus/diff-checkstyle-ql.txt | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9002/yetus/diff-checkstyle-root.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-9002/yetus/whitespace-eol.txt | | modules | C: standalone-metastore ql . U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9002/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351233#comment-16351233 ] Hive QA commented on HIVE-18350: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12909069/HIVE-18350.12.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 40 failed/errored test(s), 12966 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36) org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.org.apache.hadoop.hive.cli.TestContribNegativeCliDriver (batchId=245) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221) org.apache.hadoop.hive.metastore.TestReplChangeManager.testRecycleNonPartTable (batchId=223) org.apache.hadoop.hive.metastore.cache.TestCachedStore.testTableOps (batchId=213) org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280) org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280) org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282) org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256) org.apache.hive.beeline.TestBeeLineWithArgs.testEscapeCRLFOffInDSVOutput (batchId=231) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=231) org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testMulti (batchId=202) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchAbort (batchId=202) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchCommitPartitioned (batchId=202) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchCommitUnpartitioned (batchId=202) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchEmptyAbortPartitioned (batchId=202) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchEmptyAbortUnartitioned (batchId=202) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchEmptyCommitPartitioned (batchId=202) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchEmptyCommitUnpartitioned (batchId=202) org.apache.hive.hcatalog.streaming.mutate.TestMutations.testUpdatesAndDeletes (batchId=202) org.apache.hive.jdbc.TestSSL.testConnectionMismatch 
(batchId=234) org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234) org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234) org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill (batchId=235) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8994/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8994/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8994/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 40 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12909069 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351228#comment-16351228 ] Hive QA commented on HIVE-18350: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 34s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 47s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 5s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 39s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 47s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 3s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle 
{color} | {color:red} 0m 39s{color} | {color:red} ql: The patch generated 6 new + 204 unchanged - 2 fixed = 210 total (was 206) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 58s{color} | {color:red} root: The patch generated 6 new + 597 unchanged - 2 fixed = 603 total (was 599) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 47m 29s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh | | git revision | master / 4a33ec8 | | Default Java | 1.8.0_111 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8994/yetus/diff-checkstyle-ql.txt | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8994/yetus/diff-checkstyle-root.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-8994/yetus/whitespace-eol.txt | | modules | C: standalone-metastore ql . U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8994/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351165#comment-16351165 ] Hive QA commented on HIVE-18350: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12909051/HIVE-18350.11.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 28 failed/errored test(s), 12967 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=180) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221) 
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testTableOps (batchId=213) org.apache.hadoop.hive.metastore.client.TestAddPartitions.testAddPartitionsNullColTypeInSd[Embedded] (batchId=206) org.apache.hadoop.hive.metastore.client.TestAppendPartitions.testAppendPartWrongColumnInPartName[Embedded] (batchId=206) org.apache.hadoop.hive.metastore.client.TestGetTableMeta.testGetTableMetaCaseSensitive[Embedded] (batchId=206) org.apache.hadoop.hive.metastore.client.TestGetTableMeta.testGetTableMetaNullOrEmptyDb[Embedded] (batchId=206) org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280) org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280) org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282) org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256) org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188) org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234) org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234) org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8992/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8992/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8992/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 28 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12909051 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351146#comment-16351146 ] Hive QA commented on HIVE-18350: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 18s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 33s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 45s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 25s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 13s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle 
{color} | {color:red} 0m 37s{color} | {color:red} ql: The patch generated 6 new + 204 unchanged - 2 fixed = 210 total (was 206) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 56s{color} | {color:red} root: The patch generated 6 new + 597 unchanged - 2 fixed = 603 total (was 599) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 1s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 1s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh | | git revision | master / f9efd84 | | Default Java | 1.8.0_111 | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8992/yetus/diff-checkstyle-ql.txt | | checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8992/yetus/diff-checkstyle-root.txt | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-8992/yetus/whitespace-eol.txt | | modules | C: standalone-metastore ql . U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8992/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. 
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350539#comment-16350539 ] Hive QA commented on HIVE-18350: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12908892/HIVE-18350.10.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 32 failed/errored test(s), 12966 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] (batchId=248) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7] (batchId=59) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163) 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_7] (batchId=133) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221) org.apache.hadoop.hive.metastore.TestReplChangeManager.testRecycleNonPartTable (batchId=223) org.apache.hadoop.hive.metastore.TestReplChangeManager.testRecyclePartTable (batchId=223) org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206) org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded] (batchId=206) org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282) org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256) org.apache.hadoop.hive.ql.metadata.TestHive.testTable (batchId=278) org.apache.hadoop.hive.ql.metadata.TestHive.testThriftTable (batchId=278) org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testTable (batchId=279) org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testThriftTable (batchId=279) org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188) org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234) org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234) org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8989/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8989/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8989/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: 
TestsFailedException: 32 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908892 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350534#comment-16350534 ]

Hive QA commented on HIVE-18350:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 49s{color} | {color:red} ql: The patch generated 6 new + 186 unchanged - 2 fixed = 192 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 2m 13s{color} | {color:red} root: The patch generated 6 new + 195 unchanged - 2 fixed = 201 total (was 197) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 10m 48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 50s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 39f1e82 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8989/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8989/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-8989/yetus/whitespace-eol.txt |
| modules | C: standalone-metastore ql . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8989/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350292#comment-16350292 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908892/HIVE-18350.10.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 12966 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] (batchId=248)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status_disable_bitvector] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7] (batchId=59)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=180)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin6] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_7] (batchId=133)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hadoop.hive.ql.metadata.TestHive.testTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHive.testThriftTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testTable (batchId=279)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testThriftTable (batchId=279)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[2] (batchId=193)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[3] (batchId=193)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8987/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8987/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8987/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with:
TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908892 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350254#comment-16350254 ]

Hive QA commented on HIVE-18350:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 6 new + 186 unchanged - 2 fixed = 192 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 45s{color} | {color:red} root: The patch generated 6 new + 195 unchanged - 2 fixed = 201 total (was 197) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 8m 0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 35s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 39f1e82 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8987/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8987/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-8987/yetus/whitespace-eol.txt |
| modules | C: standalone-metastore ql . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8987/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350031#comment-16350031 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908892/HIVE-18350.10.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 12965 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] (batchId=248)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7] (batchId=59)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_7] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=250)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hadoop.hive.ql.metadata.TestHive.testTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHive.testThriftTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testTable (batchId=279)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testThriftTable (batchId=279)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8984/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8984/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8984/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with:
TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12908892 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350008#comment-16350008 ]

Hive QA commented on HIVE-18350:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 49s{color} | {color:red} ql: The patch generated 6 new + 186 unchanged - 2 fixed = 192 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 52s{color} | {color:red} root: The patch generated 6 new + 195 unchanged - 2 fixed = 201 total (was 197) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 46s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 39f1e82 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8984/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8984/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-8984/yetus/whitespace-eol.txt |
| modules | C: standalone-metastore ql . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8984/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349540#comment-16349540 ] Deepak Jaiswal commented on HIVE-18350: --- Updated the patch based on Jason's comments. Adding [~thejas] to review. > load data should rename files consistent with insert statements > --- > > Key: HIVE-18350 > URL: https://issues.apache.org/jira/browse/HIVE-18350 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, > HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, > HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, > HIVE-18350.8.patch, HIVE-18350.9.patch > > > Insert statements create files of format ending with _0, 0001_0 etc. > However, the load data uses the input file name. That results in inconsistent > naming convention which makes SMB joins difficult in some scenarios and may > cause trouble for other types of queries in future. > We need consistent naming convention. > For non-bucketed table, hive renames all the files regardless of how they > were named by the user. > For bucketed table, hive relies on user to name the files matching the > bucket in non-strict mode. Hive assumes that the data belongs to same bucket > in a file. In strict mode, loading bucketed table is disabled. > This will likely affect most of the tests which load data which is pretty > significant due to which it is further divided into two subtasks for smoother > merge. > For existing tables in customer database, it is recommended to reload > bucketed tables otherwise if customer tries to run SMB join and there is a > bucket for which there is no split, then there is a possibility of getting > incorrect results. However, this is not a regression as it would happen even > without the patch. > With this patch however, and reloading data, the results should be correct. 
> For non-bucketed tables and external tables, there is no difference in > behavior and reloading data is not needed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
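As a rough illustration of the convention described in the issue (for a non-bucketed table, loaded files are renamed regardless of what the user called them), the sketch below maps user-supplied names to sequential names ending in `_0`. It is not Hive's code: the class `LoadRenameSketch`, the method `planRenames`, and the six-digit zero padding are assumptions made only for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and padding width are assumptions, not Hive's implementation.
class LoadRenameSketch {

    // Plans a rename for each input file: the i-th file (in the given order)
    // gets a zero-padded sequence number followed by the "_0" suffix.
    static Map<String, String> planRenames(List<String> inputNames) {
        Map<String, String> plan = new LinkedHashMap<>();
        int next = 0;
        for (String name : inputNames) {
            plan.put(name, String.format("%06d_0", next));
            next++;
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, String> plan = planRenames(Arrays.asList("part-a.txt", "sales.csv"));
        // Print the planned renames; no files are touched in this sketch.
        for (Map.Entry<String, String> e : plan.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The bucketed case is left out on purpose: per the description, Hive relies on the user to name files so that they match the bucket, so a sequential counter like this one would not be appropriate there.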
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349250#comment-16349250 ] Deepak Jaiswal commented on HIVE-18350: --- [~jdere] [~gopalv] [~ekoifman] can you please review?
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349133#comment-16349133 ] Deepak Jaiswal commented on HIVE-18350: --- Please ignore the failure of test smb_mapjoin_7. Its fix is coming in HIVE-18516.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348862#comment-16348862 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908713/HIVE-18350.8.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 12965 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] (batchId=248)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7] (batchId=59)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[bucket_mapjoin_mismatch1] (batchId=95)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_7] (batchId=133)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesGetExists.testGetAllTablesCaseInsensitive[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8969/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8969/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8969/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with:
TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12908713 - PreCommit-HIVE-Build > load data should rename files consistent with insert statements > --- > > Key: HIVE-18350 > URL: https://issues.apache.org/jira/browse/HIVE-18350 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal >Priority: Major > Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch, > HIVE-18350.3.patch, HIVE-18350.4.patch, HIVE-18350.5.patch, > HIVE-18350.6.patch, HIVE-18350.7.patch, HIVE-18350.8.patch > > > Insert statements create files of format ending with _0, 0001_0 etc. > However, the load data uses the input file name. That results in inconsistent > naming convention which makes SMB joins difficult in some scenarios and may > cause trouble for other types of queries in future. > We need consistent naming convention. > For non-bucketed table, hive renames all the files regardless of how they > were named by the user. > For bucketed table, hive relies on user to name the files matching the > bucket in non-strict mode. Hive assumes that the data belongs to same bucket > in a file. In strict mode, loading bucketed table is disabled. > This will likely affect most of the tests which load data which is pretty > significant due to which it is further div
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348834#comment-16348834 ]

Hive QA commented on HIVE-18350:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 30s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s{color} | {color:red} standalone-metastore: The patch generated 1 new + 429 unchanged - 1 fixed = 430 total (was 430) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 6 new + 186 unchanged - 2 fixed = 192 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 52s{color} | {color:red} root: The patch generated 9 new + 665 unchanged - 3 fixed = 674 total (was 668) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 10s{color} | {color:red} itests/hcatalog-unit: The patch generated 2 new + 20 unchanged - 0 fixed = 22 total (was 20) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 27s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 419593e |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/diff-checkstyle-standalone-metastore.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/diff-checkstyle-root.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/diff-checkstyle-itests_hcatalog-unit.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/whitespace-eol.txt |
| modules | C: standalone-metastore ql hcatalog/core . itests/hcatalog-unit itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
>
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330251#comment-16330251 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12906517/HIVE-18350.6.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8672/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8672/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8672/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-18 08:45:26.307
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8672/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-18 08:45:26.310
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 80e6f7b HIVE-18386 : Create dummy materialized views registry and make it configurable (Jesus Camacho Rodriguez via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 80e6f7b HIVE-18386 : Create dummy materialized views registry and make it configurable (Jesus Camacho Rodriguez via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-18 08:45:28.765
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-8672/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractBucketJoinProc.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java: does not exist in index
error: a/ql/src/test/queries/clientpositive/smb_mapjoin_7.q: does not exist in index
error: a/ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/smb_mapjoin_7.q.out: does not exist in index
error: a/ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out: does not exist in index
Going to apply patch with: git apply -p1
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: protoc version: 250, detected platform: linux/amd64
protoc-jar: executing: [/tmp/protoc4865680965774138742.exe, -I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore, --java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources, /data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator Version 3.5.2
Output file /data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources/org/apache/hadoop/hive/metastore/parser/FilterParser.java does not exist: must build /data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hi
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328307#comment-16328307 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12906327/HIVE-18350.5.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 75 failed/errored test(s), 11565 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_text] (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_data_rename] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_fs] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_fs_overwrite] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_orc] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_orc_part] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_loaddata] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[offset_limit_global_optimizer] (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[load_fs2] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap3] (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap_auto] (batchId=177)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[load_fs2] (batchId=179)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[index_bitmap3] (batchId=92)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[index_bitmap_auto] (batchId=91)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=94)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[bucket_mapjoin_mismatch1] (batchId=94)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_into_acid] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_orc_negative2] (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_orc_negative_part] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=121)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=219)
org.apache.hadoop.hive.ql.TestAcidOnTez.testInsertWithRemoveUnion (batchId=222)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataNonAcid2AcidConversion (batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataNonAcid2AcidConversionVectorized (batchId=257)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testToAcidConversionMultiBucket (batchId=278)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testToAcidConversionMultiBucket (batchId=278)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=254)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testCopyExistingFilesOnDifferentFileSystem[0] (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testCopyExistingFilesOnDifferentFileSystem[15] (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testCopyNewFilesOnDifferentFileSystem[0] (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testCopyNewFilesOnDifferentFileSystem[15] (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testRenameExistingFilesOnSameFileSystem[0] (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testRenameExistingFilesOnSameFileSystem[15] (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testRenameNewFilesOnSameFileSystem[0] (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testRenameNewFilesOnSameFileSystem[15] (batchId=278)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConcatenatePartitionedTable (batchId=226)
org.apache.hadoop.hive.ql.parse.TestRepli
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328271#comment-16328271 ]

Hive QA commented on HIVE-18350:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 35s{color} | {color:red} ql: The patch generated 10 new + 540 unchanged - 5 fixed = 550 total (was 545) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 0s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 798a17c |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8649/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8649/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch,
> HIVE-18350.3.patch, HIVE-18350.4.patch, HIVE-18350.5.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc.
> However, the load data uses the input file name. That results in inconsistent
> naming convention which makes SMB joins difficult in some scenarios and may
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty
> significant due to which it is further divided into two subtasks for smoother
> merge.
> For existing tables in customer database, it is recommended to reload
> bucketed tables otherwise if customer tries to run SMB join and there is a
> bucket for which there is no split, then there is a possibility of getting
> incorrect results. However, this is not a regression as it would happen even
> without the patch.
> With this patch however, and reloading data, the results should be
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325090#comment-16325090 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12905986/HIVE-18350.3.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 80 failed/errored test(s), 11561 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[combine1] (batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_map] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_map_multi_single_reducer] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_map_skew] (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_noskew] (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_noskew_multi_single_reducer] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap_compression] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compression] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_list_bucket] (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input44] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lb_fs_stats] (batchId=89)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_11] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_12] (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_13] (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_14] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_1] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_2] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_3] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_4] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_5] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_6] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_7] (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_8] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_9] (batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_multiskew_1] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_multiskew_2] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_multiskew_3] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_data_rename] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part11] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite_4] (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_loaddata] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[offset_limit_global_optimizer] (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat3] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_list_bucket] (batchId=68)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[list_bucket_dml_10] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=160)
org.apache.hadoop.hive.cli.
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325073#comment-16325073 ]

Hive QA commented on HIVE-18350:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 34s{color} | {color:red} ql: The patch generated 9 new + 354 unchanged - 6 fixed = 363 total (was 360) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 53s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / b1cdbc6 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8611/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8611/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch,
> HIVE-18350.3.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc.
> However, the load data uses the input file name. That results in inconsistent
> naming convention which makes SMB joins difficult in some scenarios and may
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the bucket
> in non-strict mode. Hive assumes that the data belongs to same bucket in a
> file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty
> significant due to which it is further divided into two subtasks for smoother
> merge.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
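For readers following the issue description quoted above, here is a minimal sketch of the renaming rules it describes. It is not taken from any HIVE-18350 patch. The function names, the six-digit zero padding, and the requirement that bucketed input files already look like `000001_0` are all assumptions made for illustration only.

```python
import re
from typing import Dict, List

# Assumption: insert-style file names are a zero-padded index plus an
# attempt suffix, e.g. "000000_0". The description only says the names
# end with "_0".
INSERT_STYLE = re.compile(r"^(\d+)_(\d+)$")


def insert_style_name(index: int, attempt: int = 0, width: int = 6) -> str:
    """Build a hypothetical insert-style file name such as '000001_0'."""
    return f"{index:0{width}d}_{attempt}"


def plan_non_bucketed_load(input_names: List[str]) -> Dict[str, str]:
    """Non-bucketed table: every file is renamed, whatever the user called it."""
    return {name: insert_style_name(i)
            for i, name in enumerate(sorted(input_names))}


def plan_bucketed_load(input_names: List[str], num_buckets: int,
                       strict: bool) -> Dict[str, str]:
    """Bucketed table: trust the bucket encoded in the user's file name."""
    if strict:
        # Per the description, loading a bucketed table is disabled in strict mode.
        raise ValueError("load into bucketed table is disabled in strict mode")
    plan = {}
    for name in input_names:
        match = INSERT_STYLE.match(name)
        if match is None:
            raise ValueError(f"file name does not identify a bucket: {name}")
        bucket = int(match.group(1))
        if bucket >= num_buckets:
            raise ValueError(f"bucket {bucket} out of range for {name}")
        plan[name] = insert_style_name(bucket)
    return plan
```

For example, `plan_non_bucketed_load(["b.txt", "a.txt"])` maps `a.txt` to `000000_0` and `b.txt` to `000001_0`, while `plan_bucketed_load` keeps a file named `000001_0` in bucket 1 and rejects names it cannot map to a bucket.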
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324948#comment-16324948 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12905920/HIVE-18350.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8605/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8605/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8605/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-13 04:01:39.732
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8605/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-13 04:01:39.735
+ cd apache-github-source-source
+ git fetch origin
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
+ git reset --hard HEAD
HEAD is now at b1cdbc6 HIVE-18416: Initial support for TABLE function (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan) (addendum)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at b1cdbc6 HIVE-18416: Initial support for TABLE function (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan) (addendum)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-13 04:01:44.211
+ rm -rf ../yetus
+ mkdir ../yetus
+ cp -R . ../yetus
cp: cannot stat ‘./.git/gc.pid’: No such file or directory
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12905920 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch
>
> Insert statements create files of format ending with _0, 0001_0 etc.
> However, the load data uses the input file name. That results in inconsistent
> naming convention which makes SMB joins difficult in some scenarios and may
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the bucket
> in non-strict mode. Hive assumes that the data belongs to same bucket in a
> file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty
> significant due to which it is further divided into two subtasks for smoother
> merge.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324501#comment-16324501 ]

Deepak Jaiswal commented on HIVE-18350:
---

[~jdere] [~ekoifman] can you please review?

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch
>
> Insert statements create files of format ending with _0, 0001_0 etc.
> However, the load data uses the input file name. That results in inconsistent
> naming convention which makes SMB joins difficult in some scenarios and may
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the bucket
> in non-strict mode. Hive assumes that the data belongs to same bucket in a
> file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty
> significant due to which it is further divided into two subtasks for smoother
> merge.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements
[ https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313805#comment-16313805 ]

Hive QA commented on HIVE-18350:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904723/HIVE-18350.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8465/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8465/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8465/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-05 20:12:59.353
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8465/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-05 20:12:59.357
+ cd apache-github-source-source
+ git fetch origin
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
+ git reset --hard HEAD
HEAD is now at b0e653a HIVE-18354: Fix test TestAcidOnTez (Zoltan Haindrich, reviewed by Eugene Koifman)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at b0e653a HIVE-18354: Fix test TestAcidOnTez (Zoltan Haindrich, reviewed by Eugene Koifman)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-05 20:13:03.882
+ rm -rf ../yetus
+ mkdir ../yetus
+ cp -R . ../yetus
cp: cannot stat ‘./.git/objects/78/c315fa77c09546ca46eebd7721092b0a7868d2’: No such file or directory
cp: cannot stat ‘./.git/gc.pid’: No such file or directory
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904723 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-18350.1.patch
>
> Insert statements create files of format ending with _0, 0001_0 etc.
> However, the load data uses the input file name. That results in inconsistent
> naming convention which makes SMB joins difficult in some scenarios and may
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the bucket
> in non-strict mode. Hive assumes that the data belongs to same bucket in a
> file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty
> significant.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)