[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-10 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359812#comment-16359812
 ] 

Rui Li commented on HIVE-18350:
---

Thanks [~djaiswal], it works now.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.14.patch, HIVE-18350.15.patch, HIVE-18350.16.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-08 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358001#comment-16358001
 ] 

Deepak Jaiswal commented on HIVE-18350:
---

Hi,

I reverted my patch. It should not happen anymore.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.14.patch, HIVE-18350.15.patch, HIVE-18350.16.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-08 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356776#comment-16356776
 ] 

Rui Li commented on HIVE-18350:
---

Hi [~djaiswal], with this change, I hit [this 
error|https://issues.apache.org/jira/browse/HIVE-18647?focusedCommentId=16356761&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16356761]
 when creating table. Could you please take a look? Thanks.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.14.patch, HIVE-18350.15.patch, HIVE-18350.16.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356626#comment-16356626
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909645/HIVE-18350.16.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 12970 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] 
(batchId=250)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9087/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9087/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9087/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909645 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.14.patch, HIVE-18350.15.patch, HIVE-18350.16.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happ

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356610#comment-16356610
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m  
6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
19s{color} | {color:red} standalone-metastore: The patch generated 2 new + 391 
unchanged - 1 fixed = 393 total (was 392) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 9 new + 204 unchanged - 4 
fixed = 213 total (was 208) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
55s{color} | {color:red} root: The patch generated 11 new + 604 unchanged - 5 
fixed = 615 total (was 609) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
1s{color} | {color:red} The patch has 8 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 16b8575 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus/diff-checkstyle-standalone-metastore.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus/whitespace-eol.txt 
|
| modules | C: standalone-metastore ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9087/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.14.patch, HIVE-18350.15.patch, HIVE-18350.16.patch, 
> HIVE-18350.2.patch, HIVE-18350

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356186#comment-16356186
 ] 

Sergey Shelukhin commented on HIVE-18350:
-

+1 from my side, pending others' feedback, and tests

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.14.patch, HIVE-18350.15.patch, HIVE-18350.16.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-07 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355198#comment-16355198
 ] 

Deepak Jaiswal commented on HIVE-18350:
---

Yet another attempt to rebase the patch.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.14.patch, HIVE-18350.15.patch, HIVE-18350.16.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355173#comment-16355173
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909489/HIVE-18350.15.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9065/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9065/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9065/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-02-07 08:58:57.108
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9065/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-02-07 08:58:57.111
+ cd apache-github-source-source
+ git fetch origin
>From https://github.com/apache/hive
   2422e18..acc62e3  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 2422e18 HIVE-18467: support whole warehouse dump / load + 
create/drop database events (Anishek Agarwal, reviewed by Sankar Hariappan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at acc62e3 HIVE-18628: Make tez dag status check interval 
configurable (Prasanth Jayachandran reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-02-07 08:59:01.197
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9065/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java: 
does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java: does not 
exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java: does 
not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java:
 does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java: 
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java: does not 
exist in index
error: a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java: does not 
exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q: does not 
exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q: does not 
exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q: does not 
exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q: does not 
exist in index
error: a/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out: does 
not exist in index
error: a/ql/src/test/results/clien

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354063#comment-16354063
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909384/HIVE-18350.14.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9050/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9050/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9050/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-02-06 15:58:23.562
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9050/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-02-06 15:58:23.565
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 443b10b HIVE-18612: Build subprocesses under Yetus in Ptest use 
1.7 jre instead of 1.8 (Adam Szita via Peter Vary)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 443b10b HIVE-18612: Build subprocesses under Yetus in Ptest use 
1.7 jre instead of 1.8 (Adam Szita via Peter Vary)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-02-06 15:58:26.425
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9050/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java: 
does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomVertexConfiguration.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java: does not 
exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java: does 
not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java:
 does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java: 
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/plan/OpTraits.java: does not 
exist in index
error: a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java: does not 
exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_2.q: does not 
exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_4.q: does not 
exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_5.q: does not 
exist in index
error: a/ql/src/test/queries/clientpositive/auto_sortmerge_join_7.q: does not 
exist in index
error: a/ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out: 
does not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/llap/auto_sortmerge_join_2.q.out: 
does not exist in index
error: a/ql/src/test/r

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16353199#comment-16353199
 ] 

Sergey Shelukhin commented on HIVE-18350:
-

Left some comments on RB

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351521#comment-16351521
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909107/HIVE-18350.13.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 12970 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9002/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9002/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9002/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909107 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise i

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351508#comment-16351508
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
38s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
4s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 6 new + 204 unchanged - 2 
fixed = 210 total (was 206) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
42s{color} | {color:red} root: The patch generated 6 new + 605 unchanged - 2 
fixed = 611 total (was 607) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 45m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 4a33ec8 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9002/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9002/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9002/yetus/whitespace-eol.txt 
|
| modules | C: standalone-metastore ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9002/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.13.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for 

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351233#comment-16351233
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909069/HIVE-18350.12.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 40 failed/errored test(s), 12966 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.org.apache.hadoop.hive.cli.TestContribNegativeCliDriver
 (batchId=245)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.TestReplChangeManager.testRecycleNonPartTable 
(batchId=223)
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testTableOps 
(batchId=213)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.TestBeeLineWithArgs.testEscapeCRLFOffInDSVOutput 
(batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=231)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testMulti (batchId=202)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchAbort
 (batchId=202)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchCommitPartitioned
 (batchId=202)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchCommitUnpartitioned
 (batchId=202)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchEmptyAbortPartitioned
 (batchId=202)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchEmptyAbortUnartitioned
 (batchId=202)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchEmptyCommitPartitioned
 (batchId=202)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchEmptyCommitUnpartitioned
 (batchId=202)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testUpdatesAndDeletes 
(batchId=202)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8994/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8994/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8994/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 40 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909069 - PreCommit-HIVE-Build

> load data should rename files con

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351228#comment-16351228
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
47s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
39s{color} | {color:red} ql: The patch generated 6 new + 204 unchanged - 2 
fixed = 210 total (was 206) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
58s{color} | {color:red} root: The patch generated 6 new + 597 unchanged - 2 
fixed = 603 total (was 599) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 4a33ec8 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8994/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8994/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8994/yetus/whitespace-eol.txt 
|
| modules | C: standalone-metastore ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8994/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.2.patch, 
> HIVE-18350.3.patch, HIVE-18350.4.patch, HIVE-18350.5.patch, 
> HIVE-18350.6.patch, HIVE-18350.7.patch, HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries i

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351165#comment-16351165
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909051/HIVE-18350.11.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 28 failed/errored test(s), 12967 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.cache.TestCachedStore.testTableOps 
(batchId=213)
org.apache.hadoop.hive.metastore.client.TestAddPartitions.testAddPartitionsNullColTypeInSd[Embedded]
 (batchId=206)
org.apache.hadoop.hive.metastore.client.TestAppendPartitions.testAppendPartWrongColumnInPartName[Embedded]
 (batchId=206)
org.apache.hadoop.hive.metastore.client.TestGetTableMeta.testGetTableMetaCaseSensitive[Embedded]
 (batchId=206)
org.apache.hadoop.hive.metastore.client.TestGetTableMeta.testGetTableMetaNullOrEmptyDb[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8992/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8992/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8992/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 28 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909051 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.12.patch, HIVE-18350.2.patch, 
> HIVE-18350.3.patch, HIVE-18350.4.patch, HIVE-18350.5.patch, 
> HIVE-18350.6.patch, HIVE-18350.7.patch, HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table,

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351146#comment-16351146
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
25s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
37s{color} | {color:red} ql: The patch generated 6 new + 204 unchanged - 2 
fixed = 210 total (was 206) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
56s{color} | {color:red} root: The patch generated 6 new + 597 unchanged - 2 
fixed = 603 total (was 599) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / f9efd84 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8992/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8992/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8992/yetus/whitespace-eol.txt 
|
| modules | C: standalone-metastore ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8992/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.11.patch, HIVE-18350.2.patch, HIVE-18350.3.patch, 
> HIVE-18350.4.patch, HIVE-18350.5.patch, HIVE-18350.6.patch, 
> HIVE-18350.7.patch, HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need c

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350539#comment-16350539
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908892/HIVE-18350.10.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 32 failed/errored test(s), 12966 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] 
(batchId=248)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7] 
(batchId=59)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_7] 
(batchId=133)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.TestReplChangeManager.testRecycleNonPartTable 
(batchId=223)
org.apache.hadoop.hive.metastore.TestReplChangeManager.testRecyclePartTable 
(batchId=223)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded]
 (batchId=206)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hadoop.hive.ql.metadata.TestHive.testTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHive.testThriftTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testTable (batchId=279)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testThriftTable (batchId=279)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8989/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8989/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8989/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 32 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908892 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult i

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350534#comment-16350534
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
49s{color} | {color:red} ql: The patch generated 6 new + 186 unchanged - 2 
fixed = 192 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 
13s{color} | {color:red} root: The patch generated 6 new + 195 unchanged - 2 
fixed = 201 total (was 197) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
1s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 10m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 39f1e82 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8989/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8989/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8989/yetus/whitespace-eol.txt 
|
| modules | C: standalone-metastore ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8989/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming conv

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350292#comment-16350292
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908892/HIVE-18350.10.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 12966 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] 
(batchId=248)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_update_status_disable_bitvector]
 (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7] 
(batchId=59)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=180)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin6]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_7] 
(batchId=133)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded]
 (batchId=206)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hadoop.hive.ql.metadata.TestHive.testTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHive.testThriftTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testTable (batchId=279)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testThriftTable (batchId=279)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testSyntheticComplexSchema[2]
 (batchId=193)
org.apache.hive.hcatalog.pig.TestHCatLoaderComplexSchema.testTupleInBagInTupleInBag[3]
 (batchId=193)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8987/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8987/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8987/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908892 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files 

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350254#comment-16350254
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 6 new + 186 unchanged - 2 
fixed = 192 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
45s{color} | {color:red} root: The patch generated 6 new + 195 unchanged - 2 
fixed = 201 total (was 197) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 39f1e82 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8987/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8987/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8987/yetus/whitespace-eol.txt 
|
| modules | C: standalone-metastore ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8987/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming conv

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350031#comment-16350031
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908892/HIVE-18350.10.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 12965 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] 
(batchId=248)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7] 
(batchId=59)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_7] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] 
(batchId=250)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hadoop.hive.ql.metadata.TestHive.testTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHive.testThriftTable (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testTable (batchId=279)
org.apache.hadoop.hive.ql.metadata.TestHiveRemote.testThriftTable (batchId=279)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8984/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8984/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8984/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908892 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of th

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350008#comment-16350008
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
49s{color} | {color:red} ql: The patch generated 6 new + 186 unchanged - 2 
fixed = 192 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
52s{color} | {color:red} root: The patch generated 6 new + 195 unchanged - 2 
fixed = 201 total (was 197) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 39f1e82 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8984/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8984/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8984/yetus/whitespace-eol.txt 
|
| modules | C: standalone-metastore ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8984/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming conv

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-01 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349540#comment-16349540
 ] 

Deepak Jaiswal commented on HIVE-18350:
---

Updated the patch based on Jason's comments.

Adding [~thejas] to review.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.10.patch, 
> HIVE-18350.2.patch, HIVE-18350.3.patch, HIVE-18350.4.patch, 
> HIVE-18350.5.patch, HIVE-18350.6.patch, HIVE-18350.7.patch, 
> HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-01 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349250#comment-16349250
 ] 

Deepak Jaiswal commented on HIVE-18350:
---

[~jdere] [~gopalv] [~ekoifman] can you please review?

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch, 
> HIVE-18350.3.patch, HIVE-18350.4.patch, HIVE-18350.5.patch, 
> HIVE-18350.6.patch, HIVE-18350.7.patch, HIVE-18350.8.patch, HIVE-18350.9.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-01 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349133#comment-16349133
 ] 

Deepak Jaiswal commented on HIVE-18350:
---

Please ignore the failure of test smb_mapjoin_7. its fix is coming in 
HIVE-18516.

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch, 
> HIVE-18350.3.patch, HIVE-18350.4.patch, HIVE-18350.5.patch, 
> HIVE-18350.6.patch, HIVE-18350.7.patch, HIVE-18350.8.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be correct.
> For non-bucketed tables and external tables, there is no difference in 
> behavior and reloading data is not needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348862#comment-16348862
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908713/HIVE-18350.8.patch

{color:green}SUCCESS:{color} +1 due to 11 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 12965 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] 
(batchId=248)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_7] 
(batchId=59)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[bucket_mapjoin_mismatch1]
 (batchId=95)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_7] 
(batchId=133)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesGetExists.testGetAllTablesCaseInsensitive[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8969/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8969/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8969/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908713 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch, 
> HIVE-18350.3.patch, HIVE-18350.4.patch, HIVE-18350.5.patch, 
> HIVE-18350.6.patch, HIVE-18350.7.patch, HIVE-18350.8.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further div

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-02-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16348834#comment-16348834
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
30s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
19s{color} | {color:red} standalone-metastore: The patch generated 1 new + 429 
unchanged - 1 fixed = 430 total (was 430) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 6 new + 186 unchanged - 2 
fixed = 192 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
52s{color} | {color:red} root: The patch generated 9 new + 665 unchanged - 3 
fixed = 674 total (was 668) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
10s{color} | {color:red} itests/hcatalog-unit: The patch generated 2 new + 20 
unchanged - 0 fixed = 22 total (was 20) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 419593e |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/diff-checkstyle-standalone-metastore.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/diff-checkstyle-root.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/diff-checkstyle-itests_hcatalog-unit.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus/whitespace-eol.txt 
|
| modules | C: standalone-metastore ql hcatalog/core . itests/hcatalog-unit 
itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8969/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
>   

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-01-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330251#comment-16330251
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12906517/HIVE-18350.6.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8672/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8672/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8672/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-18 08:45:26.307
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8672/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-18 08:45:26.310
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 80e6f7b HIVE-18386 : Create dummy materialized views registry 
and make it configurable (Jesus Camacho Rodriguez via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 80e6f7b HIVE-18386 : Create dummy materialized views registry 
and make it configurable (Jesus Camacho Rodriguez via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-18 08:45:28.765
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-8672/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: does not 
exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: does not 
exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractBucketJoinProc.java: 
does not exist in index
error: 
a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java:
 does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java: 
does not exist in index
error: a/ql/src/test/queries/clientpositive/smb_mapjoin_7.q: does not exist in 
index
error: a/ql/src/test/results/clientpositive/beeline/smb_mapjoin_7.q.out: does 
not exist in index
error: a/ql/src/test/results/clientpositive/smb_mapjoin_7.q.out: does not exist 
in index
error: a/ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out: does not 
exist in index
Going to apply patch with: git apply -p1
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: protoc version: 250, detected platform: linux/amd64
protoc-jar: executing: [/tmp/protoc4865680965774138742.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore,
 
--java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources,
 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
Output file 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources/org/apache/hadoop/hive/metastore/parser/FilterParser.java
 does not exist: must build 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hi

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-01-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328307#comment-16328307
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12906327/HIVE-18350.5.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 75 failed/errored test(s), 11565 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_text] (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_data_rename] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_fs] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_fs_overwrite] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_orc] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_orc_part] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_loaddata] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[offset_limit_global_optimizer]
 (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[load_fs2] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap3]
 (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[index_bitmap_auto]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[load_fs2] 
(batchId=179)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[index_bitmap3] 
(batchId=92)
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver[index_bitmap_auto] 
(batchId=91)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=94)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[bucket_mapjoin_mismatch1]
 (batchId=94)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_into_acid]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_orc_negative2]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_orc_negative_part]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=121)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=219)
org.apache.hadoop.hive.ql.TestAcidOnTez.testInsertWithRemoveUnion (batchId=222)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataNonAcid2AcidConversion 
(batchId=257)
org.apache.hadoop.hive.ql.TestTxnLoadData.loadDataNonAcid2AcidConversionVectorized
 (batchId=257)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testToAcidConversionMultiBucket 
(batchId=278)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testToAcidConversionMultiBucket
 (batchId=278)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=254)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testCopyExistingFilesOnDifferentFileSystem[0]
 (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testCopyExistingFilesOnDifferentFileSystem[15]
 (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testCopyNewFilesOnDifferentFileSystem[0]
 (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testCopyNewFilesOnDifferentFileSystem[15]
 (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testRenameExistingFilesOnSameFileSystem[0]
 (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testRenameExistingFilesOnSameFileSystem[15]
 (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testRenameNewFilesOnSameFileSystem[0]
 (batchId=278)
org.apache.hadoop.hive.ql.metadata.TestHiveCopyFiles.testRenameNewFilesOnSameFileSystem[15]
 (batchId=278)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConcatenatePartitionedTable
 (batchId=226)
org.apache.hadoop.hive.ql.parse.TestRepli

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-01-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328271#comment-16328271
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
35s{color} | {color:red} ql: The patch generated 10 new + 540 unchanged - 5 
fixed = 550 total (was 545) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 798a17c |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8649/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8649/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch, 
> HIVE-18350.3.patch, HIVE-18350.4.patch, HIVE-18350.5.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
>  For bucketed table, hive relies on user to name the files matching the 
> bucket in non-strict mode. Hive assumes that the data belongs to same bucket 
> in a file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.
> For existing tables in customer database, it is recommended to reload 
> bucketed tables otherwise if customer tries to run SMB join and there is a 
> bucket for which there is no split, then there is a possibility of getting 
> incorrect results. However, this is not a regression as it would happen even 
> without the patch.
> With this patch however, and reloading data, the results should be

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-01-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325090#comment-16325090
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12905986/HIVE-18350.3.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 80 failed/errored test(s), 11561 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[combine1] (batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_map] (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_map_multi_single_reducer]
 (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_map_skew] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_noskew] 
(batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_noskew_multi_single_reducer]
 (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] 
(batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_bitmap_compression]
 (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compression] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_list_bucket]
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input44] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lb_fs_stats] (batchId=89)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_11] 
(batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_12] 
(batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_13] 
(batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_14] 
(batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_1] 
(batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_2] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_3] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_4] 
(batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_5] 
(batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_6] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_7] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_8] 
(batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_dml_9] 
(batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_multiskew_1]
 (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_multiskew_2]
 (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[list_bucket_query_multiskew_3]
 (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_data_rename] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[load_dyn_part11] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[materialized_view_create_rewrite_4]
 (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_loaddata] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[offset_limit_global_optimizer]
 (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat3] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_list_bucket] 
(batchId=68)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=165)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[list_bucket_dml_10]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=160)
org.apache.hadoop.hive.cli.

[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-01-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325073#comment-16325073
 ] 

Hive QA commented on HIVE-18350:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
34s{color} | {color:red} ql: The patch generated 9 new + 354 unchanged - 6 
fixed = 363 total (was 360) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / b1cdbc6 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8611/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8611/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch, 
> HIVE-18350.3.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the bucket 
> in non-strict mode. Hive assumes that the data belongs to same bucket in a 
> file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-01-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324948#comment-16324948
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12905920/HIVE-18350.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8605/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8605/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8605/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-13 04:01:39.732
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8605/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-13 04:01:39.735
+ cd apache-github-source-source
+ git fetch origin
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
+ git reset --hard HEAD
HEAD is now at b1cdbc6 HIVE-18416: Initial support for TABLE function (Jesus 
Camacho Rodriguez, reviewed by Ashutosh Chauhan) (addendum)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at b1cdbc6 HIVE-18416: Initial support for TABLE function (Jesus 
Camacho Rodriguez, reviewed by Ashutosh Chauhan) (addendum)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-13 04:01:44.211
+ rm -rf ../yetus
+ mkdir ../yetus
+ cp -R . ../yetus
cp: cannot stat ?./.git/gc.pid?: No such file or directory
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12905920 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the bucket 
> in non-strict mode. Hive assumes that the data belongs to same bucket in a 
> file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-01-12 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324501#comment-16324501
 ] 

Deepak Jaiswal commented on HIVE-18350:
---

[~jdere] [~ekoifman] can you please review?

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-18350.1.patch, HIVE-18350.2.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the bucket 
> in non-strict mode. Hive assumes that the data belongs to same bucket in a 
> file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant due to which it is further divided into two subtasks for smoother 
> merge.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18350) load data should rename files consistent with insert statements

2018-01-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16313805#comment-16313805
 ] 

Hive QA commented on HIVE-18350:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904723/HIVE-18350.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8465/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8465/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8465/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-05 20:12:59.353
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8465/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-05 20:12:59.357
+ cd apache-github-source-source
+ git fetch origin
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.
+ git reset --hard HEAD
HEAD is now at b0e653a HIVE-18354: Fix test TestAcidOnTez  (Zoltan Haindrich, 
reviewed by Eugene Koifman)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at b0e653a HIVE-18354: Fix test TestAcidOnTez  (Zoltan Haindrich, 
reviewed by Eugene Koifman)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-05 20:13:03.882
+ rm -rf ../yetus
+ mkdir ../yetus
+ cp -R . ../yetus
cp: cannot stat ?./.git/objects/78/c315fa77c09546ca46eebd7721092b0a7868d2?: No 
such file or directory
cp: cannot stat ?./.git/gc.pid?: No such file or directory
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904723 - PreCommit-HIVE-Build

> load data should rename files consistent with insert statements
> ---
>
> Key: HIVE-18350
> URL: https://issues.apache.org/jira/browse/HIVE-18350
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-18350.1.patch
>
>
> Insert statements create files of format ending with _0, 0001_0 etc. 
> However, the load data uses the input file name. That results in inconsistent 
> naming convention which makes SMB joins difficult in some scenarios and may 
> cause trouble for other types of queries in future.
> We need consistent naming convention.
> For non-bucketed table, hive renames all the files regardless of how they 
> were named by the user.
> For bucketed table, hive relies on user to name the files matching the bucket 
> in non-strict mode. Hive assumes that the data belongs to same bucket in a 
> file. In strict mode, loading bucketed table is disabled.
> This will likely affect most of the tests which load data which is pretty 
> significant.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)