[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055160#comment-14055160 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

Thanks. Yea I forgot to comment, the failures do not look related, and dynpart_sort_optimization passed so it was a random issue.

Empty dir in external table causes issue (root_dir_external_table.q failure)
----------------------------------------------------------------------------

Key: HIVE-7220
URL: https://issues.apache.org/jira/browse/HIVE-7220
Project: Hive
Issue Type: Bug
Reporter: Szehon Ho
Assignee: Szehon Ho
Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, HIVE-7220.5.patch, HIVE-7220.5.patch, HIVE-7220.patch

While looking at root_dir_external_table.q failure, which is doing a query on an external table located at root ('/'), I noticed that latest Hadoop2 CombineFileInputFormat returns split representing empty directories (like '/Users'), which leads to failure in Hive's CombineFileRecordReader as it tries to open the directory for processing.

Tried with an external table in a normal HDFS directory, and it also returns the same error. Looks like a real bug.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054531#comment-14054531 ]

Navis commented on HIVE-7220:
-----------------------------

+1
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052481#comment-14052481 ]

Hive QA commented on HIVE-7220:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654060/HIVE-7220.5.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5691 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.cli.TestPermsGrp.testCustomPerms
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/681/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/681/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-681/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654060
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045609#comment-14045609 ]

Hive QA commented on HIVE-7220:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652750/HIVE-7220.5.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5670 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/613/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/613/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-613/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652750
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045614#comment-14045614 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

dynpart_sort_optimization seems to be failing consistently, I'll try to take a look.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044384#comment-14044384 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

Forgot to rebase. Thank you [~hagleitn] for that.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044524#comment-14044524 ]

Hive QA commented on HIVE-7220:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12652552/HIVE-7220.4.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5669 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/597/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/597/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-597/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12652552
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045200#comment-14045200 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

It's weird that dynpart_sort_optimization failed twice here, but I couldn't reproduce it running locally. I also looked through the test logs and didn't see any related stack traces.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045423#comment-14045423 ]

Navis commented on HIVE-7220:
-----------------------------

Looks good. One last thing: could isValidSplit() and getDirIndices() be merged into a single method? That could reduce the number of FS accesses/calls.
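The suggestion to merge the two checks can be illustrated with a small sketch. This is not the code from the patch: only the names isValidSplit() and getDirIndices() come from the thread, and their real signatures are not shown here. The class `SplitScanSketch`, the `ScanResult` type, and the `isDirectory` predicate (a stand-in for a file system lookup) are all hypothetical. The idea is that a single pass asks "is this a directory?" once per path and derives both answers from it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: one scan yields both the directory indices and validity. */
public class SplitScanSketch {

    /** Result of a single scan over the paths of one split. */
    public static final class ScanResult {
        public final List<Integer> dirIndices; // positions of paths that are directories
        public final boolean valid;            // true if at least one path is not a directory

        ScanResult(List<Integer> dirIndices, boolean valid) {
            this.dirIndices = dirIndices;
            this.valid = valid;
        }
    }

    /**
     * Scans the paths once. isDirectory stands in for a file system
     * lookup (assumption), so each path costs exactly one lookup.
     */
    public static ScanResult scan(List<String> paths, Predicate<String> isDirectory) {
        List<Integer> dirIndices = new ArrayList<>();
        for (int i = 0; i < paths.size(); i++) {
            if (isDirectory.test(paths.get(i))) {
                dirIndices.add(i);
            }
        }
        // A split made up only of directories (or with no paths) has nothing to read.
        boolean valid = dirIndices.size() < paths.size();
        return new ScanResult(dirIndices, valid);
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/t/part-0", "/t/Users", "/t/part-1");
        ScanResult r = scan(paths, p -> p.equals("/t/Users"));
        System.out.println("dirIndices=" + r.dirIndices + " valid=" + r.valid);
    }
}
```

With two separate methods, each would have to query the file system for every path; whether that cost matters in practice depends on how many paths a combined split carries.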
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045424#comment-14045424 ]

Gopal V commented on HIVE-7220:
-------------------------------

Is the de-dup of locations only to work around the FileInputSplit in-mem + disk location changes (i.e. 3 in-mem locations and 3 on-disk locations having duplication)?
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045440#comment-14045440 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

[~navis] OK, I'll look at that.

[~gopalv] I don't think there is any change to the dedup, which was called even before this patch. I just moved the method.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044123#comment-14044123 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

ping, do people still want to proceed with this, to fix the last test failure?
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044344#comment-14044344 ]

Navis commented on HIVE-7220:
-----------------------------

[~szehon] It seems this needs to be rebased on trunk. Could you do that once more?
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040070#comment-14040070 ]

Hive QA commented on HIVE-7220:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12651812/HIVE-7220.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5668 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/549/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/549/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-549/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12651812
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040260#comment-14040260 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

[~hagleitn], this last one seems to work. Should we move forward with it? The other failures are known failures.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039337#comment-14039337 ]

Gunther Hagleitner commented on HIVE-7220:
------------------------------------------

You're right, isValidSplit only drops directory-only splits. My bad.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037843#comment-14037843 ]

Gunther Hagleitner commented on HIVE-7220:
------------------------------------------

I think we should move forward with this; it will give us a working build while we work out MAPREDUCE-5756. We have HIVE-6401 open to handle the situation when we get a fix.

I've reviewed the patch, and it looks good except for the isValidSplit call. Why is that needed? You prune in the constructor, so presumably you never get splits containing folders. If this is just a sanity check, it should probably throw an assertion if there are still paths in there. If not, it seems incorrect to throw out splits that don't match (especially since you might throw out combined valid locations with it).
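For readers following the review, here is a minimal sketch of what "prune in the constructor" could look like. It is an illustration only, not the patch itself: `PrunedSplitSketch` is a hypothetical stand-in for a combined split that holds parallel arrays of paths, start offsets and lengths, and the `isDirectory` predicate replaces a real file system lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for a combined split; not Hive's actual class. */
public class PrunedSplitSketch {
    private final String[] paths;
    private final long[] starts;
    private final long[] lengths;

    /**
     * Copies only the entries whose path is not a directory, keeping the
     * three parallel arrays aligned. isDirectory stands in for a file
     * system lookup (assumption).
     */
    public PrunedSplitSketch(String[] paths, long[] starts, long[] lengths,
                             Predicate<String> isDirectory) {
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < paths.length; i++) {
            if (!isDirectory.test(paths[i])) {
                keep.add(i);
            }
        }
        this.paths = new String[keep.size()];
        this.starts = new long[keep.size()];
        this.lengths = new long[keep.size()];
        for (int j = 0; j < keep.size(); j++) {
            int i = keep.get(j);
            this.paths[j] = paths[i];
            this.starts[j] = starts[i];
            this.lengths[j] = lengths[i];
        }
    }

    public String[] getPaths() { return paths.clone(); }
    public long[] getStartOffsets() { return starts.clone(); }
    public long[] getLengths() { return lengths.clone(); }

    /** True if at least one file survived pruning. */
    public boolean hasFiles() { return paths.length > 0; }

    public static void main(String[] args) {
        PrunedSplitSketch split = new PrunedSplitSketch(
            new String[] {"/t/part-0", "/t/Users", "/t/part-1"},
            new long[] {0L, 0L, 0L},
            new long[] {128L, 0L, 64L},
            p -> p.equals("/t/Users"));
        System.out.println(Arrays.toString(split.getPaths()) + " hasFiles=" + split.hasFiles());
    }
}
```

In this sketch a split that contained only directories ends up with no paths at all, which is the one case where a later validity check would still have something to reject.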
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030328#comment-14030328 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

OK, never mind about this patch.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029962#comment-14029962 ]

Ashutosh Chauhan commented on HIVE-7220:
----------------------------------------

Duplicate of HIVE-6401. There is an underlying MR bug here.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030178#comment-14030178 ]

Navis commented on HIVE-7220:
-----------------------------

Can we just remove this test? Who makes an external table on the root directory?
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030248#comment-14030248 ]

Szehon Ho commented on HIVE-7220:
---------------------------------

Yea, but the test did catch a real issue (a folder in any external table directory causes an error). I guess it's best if Hadoop reverts to the old behavior (in 2.5?). Either way, this patch should fix it on the Hive side as another option.
[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)
[ https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030270#comment-14030270 ]

Hive QA commented on HIVE-7220:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12649988/HIVE-7220.patch

{color:red}ERROR:{color} -1 due to 82 failed/errored test(s), 5610 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby1_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby6_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_noskew_multi_single_reducer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_map
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_noskew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert2_overwrite_partitions
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_fs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_vectorization_ppd
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_not
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20