[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088990#comment-14088990 ]

Lefty Leverenz commented on HIVE-7231:
--

The wiki now documents *hive.exec.orc.default.block.size*, *hive.exec.orc.block.padding.tolerance*, and the changed default for *hive.exec.orc.default.stripe.size* in 0.14.0:

* [Configuration Properties -- hive.exec.orc.default.block.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.default.block.size]
* [Configuration Properties -- hive.exec.orc.block.padding.tolerance | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.block.padding.tolerance]
* [Configuration Properties -- hive.exec.orc.default.stripe.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.default.stripe.size]

Improve ORC padding
---

Key: HIVE-7231
URL: https://issues.apache.org/jira/browse/HIVE-7231
Project: Hive
Issue Type: Improvement
Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Fix For: 0.14.0
Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch

Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053398#comment-14053398 ]

Lefty Leverenz commented on HIVE-7231:
--

Facepalm! Now that the patch is committed I've finally noticed that hive.exec.orc.block.padding.tolerance is not a percentage but a decimal fraction. For example, with a 64 MB stripe size the default 0.05 gives 3.2 MB tolerance (0.05 * 64, not 0.05% of 64).

This is only a tech-writer's quibble which isn't likely to confuse anyone. I'll explain it in the wiki and put a request in HIVE-6586 to fix it with HIVE-6037.

Improve ORC padding
---

Key: HIVE-7231
URL: https://issues.apache.org/jira/browse/HIVE-7231
Project: Hive
Issue Type: Improvement
Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: TODOC14, orcfile
Fix For: 0.14.0
Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch

Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable.
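The arithmetic in Lefty's comment above can be checked with a small sketch. This is only an illustration of reading hive.exec.orc.block.padding.tolerance as a decimal fraction of the stripe size; the function name `padding_tolerance_bytes` and the `MB` constant are hypothetical and are not taken from the Hive source.

```python
MB = 1024 * 1024  # bytes per megabyte (assumed binary megabytes)


def padding_tolerance_bytes(stripe_size_bytes: int, tolerance: float = 0.05) -> int:
    """Return the padding tolerance in bytes.

    `tolerance` is read as a decimal fraction of the stripe size
    (0.05 means 5% of the stripe), not as a percentage value.
    Hypothetical helper for illustration only.
    """
    if not 0.0 <= tolerance <= 1.0:
        raise ValueError("tolerance must be a decimal fraction between 0 and 1")
    return int(stripe_size_bytes * tolerance)


# 64 MB stripe with the default 0.05 tolerance, as in the comment above.
tolerance_mb = padding_tolerance_bytes(64 * MB) / MB
print(f"{tolerance_mb:.1f} MB")
```

Reading 0.05 as "0.05%" would instead mean passing 0.0005, which gives a tolerance about a hundred times smaller.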
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053242#comment-14053242 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12654227/HIVE-7231.8.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5677 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/683/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/683/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-683/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12654227 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053253#comment-14053253 ] Gopal V commented on HIVE-7231: --- Test failures unrelated. Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14050998#comment-14050998 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653697/HIVE-7231.7.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5672 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/664/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/664/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-664/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653697 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048713#comment-14048713 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653314/HIVE-7231.6.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5671 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthBinary.org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthBinary {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/645/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/645/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-645/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653314 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048327#comment-14048327 ]

Gopal V commented on HIVE-7231:
---

With the rebase and the update to docs, it LGTM. +1

Improve ORC padding
---

Key: HIVE-7231
URL: https://issues.apache.org/jira/browse/HIVE-7231
Project: Hive
Issue Type: Improvement
Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch

Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable.
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048378#comment-14048378 ]

Lefty Leverenz commented on HIVE-7231:
--

Woops, very sorry -- forgot to publish my second review, which requested clarification in the description of hive.exec.orc.block.padding.tolerance in HiveConf.java:

{code}
+// Define the tolerance for block padding. The total padded length will
+// always be less than the specified percentage.
{code}

My comment:

bq. Should mention that it's a percentage of stripe size, because block padding sounds like percentage of block size. Could also explain that block padding prevents stripes from straddling blocks.

But this isn't a show stopper.

Improve ORC padding
---

Key: HIVE-7231
URL: https://issues.apache.org/jira/browse/HIVE-7231
Project: Hive
Issue Type: Improvement
Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch

Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable.
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048404#comment-14048404 ]

Gopal V commented on HIVE-7231:
---

[~leftylev]: HIVE-7231.5.patch has that clarified in hive-default.xml, along with a default case (64 MB stripe, 256 MB block = 3.2 MB padding tolerance per 256 MB block).

Improve ORC padding
---

Key: HIVE-7231
URL: https://issues.apache.org/jira/browse/HIVE-7231
Project: Hive
Issue Type: Improvement
Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch

Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable.
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048416#comment-14048416 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653268/HIVE-7231.5.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/635/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/635/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-635/ Messages: {noformat} This message was trimmed, see log for full details [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-shims-0.20 --- [INFO] Tests are skipped. [INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-shims-0.20 --- [INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/shims/0.20/target/hive-shims-0.20-0.14.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-shims-0.20 --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-shims-0.20 --- [INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/shims/0.20/target/hive-shims-0.20-0.14.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/shims/hive-shims-0.20/0.14.0-SNAPSHOT/hive-shims-0.20-0.14.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/shims/0.20/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/shims/hive-shims-0.20/0.14.0-SNAPSHOT/hive-shims-0.20-0.14.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Shims Secure Common 0.14.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-shims-common-secure --- [INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure (includes = 
[datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-shims-common-secure --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-shims-common-secure --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-shims-common-secure --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-shims-common-secure --- [INFO] Compiling 12 source files to /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/target/classes [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java: /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java uses or overrides a deprecated API. [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java: Recompile with -Xlint:deprecation for details. [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java: Some input files use unchecked or unsafe operations. [WARNING] /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java: Recompile with -Xlint:unchecked for details. [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hive-shims-common-secure --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/src/test/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-shims-common-secure --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/target/tmp/conf [copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/shims/common-secure/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] ---
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048417#comment-14048417 ]

Lefty Leverenz commented on HIVE-7231:
--

[~gopalv], I must have been looking at the wrong patch. Thanks.

Is the coded ampersand (&amp;) necessary in hive-default.xml.template? Perhaps a simple "and" would be clearer. But that's a mini-nit.

+1 for docs

Improve ORC padding
---

Key: HIVE-7231
URL: https://issues.apache.org/jira/browse/HIVE-7231
Project: Hive
Issue Type: Improvement
Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch

Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable.
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048431#comment-14048431 ] Gopal V commented on HIVE-7231: --- Looks like I jumped the gun and rebased my patch to also use tez-0.4.1, which is why tests failed. Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048531#comment-14048531 ]

Gopal V commented on HIVE-7231:
---

Tests on 1 TB show that this does cut down on padding, but it progressively writes smaller and smaller stripes within a block.

I saw 12 MB and 8 MB stripes being written before the 3.2 MB stripe size trigger sets in and triggers a pad event.

{code}
Resetting stripe size via (1.0 - 0.00) * (0.663954 * 66945840) = 8964
Resetting stripe size via (1.0 - 0.00) * (0.495154 * 8964) = 22009074
Resetting stripe size via (1.0 - 0.00) * (0.358696 * 22009074) = 7894571
Resetting stripe size via (1.0 - 0.00) * (0.263782 * 7894571) = 2082443
Resetting stripe size via (1.0 - 0.00) * (0.581675 * 2082443) = 1211304
Resetting stripe size via (1.0 - 0.00) * (0.814780 * 1211304) = 986946
Resetting stripe size via (1.0 - 0.00) * (0.772579 * 986946) = 762494
{code}

I think I might undo the "as a fraction of stripe size" bit and make sure that the padding amount is a fraction of the HDFS block size, to keep stripe sizes as consistent as possible.

Improve ORC padding
---

Key: HIVE-7231
URL: https://issues.apache.org/jira/browse/HIVE-7231
Project: Hive
Issue Type: Improvement
Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch

Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable.
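The shrinking behaviour in Gopal V's log above can be replayed with a short sketch. It only mirrors the formula printed in the log lines, `(1.0 - correction) * (ratio * stripe size)`; it is not the Hive writer code, and the names `reset_stripe_size`, `avail_ratio` and `correction` are hypothetical labels for the three logged numbers. Only log lines whose inputs are printed in full are replayed, and because the logged ratios are rounded to six decimals the results should land near the logged values rather than match them exactly.

```python
def reset_stripe_size(current_stripe_size: int, avail_ratio: float,
                      correction: float = 0.0) -> int:
    """Recompute the stripe size the way the log line prints it.

    Mirrors "(1.0 - correction) * (avail_ratio * current_stripe_size)".
    Hypothetical helper; the parameter names are assumptions, not Hive's.
    """
    return int((1.0 - correction) * (avail_ratio * current_stripe_size))


# (ratio, stripe size before reset, stripe size the log reports afterwards)
logged_steps = [
    (0.358696, 22009074, 7894571),
    (0.263782, 7894571, 2082443),
    (0.581675, 2082443, 1211304),
]

for ratio, before, logged_after in logged_steps:
    computed = reset_stripe_size(before, ratio)
    print(f"{before} -> {computed} (log reports {logged_after})")
```

Each reset multiplies the previous stripe size by a ratio below 1.0, which is why the stripes in the log keep getting smaller within a block.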
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047059#comment-14047059 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653039/HIVE-7231.4.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5670 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/625/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/625/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-625/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653039 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039773#comment-14039773 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12651598/HIVE-7231.3.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5668 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/539/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/539/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-539/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12651598 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034409#comment-14034409 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650660/HIVE-7231.2.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5536 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testSubmit org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/492/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/492/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-492/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12650660 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. 
Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034434#comment-14034434 ] Gopal V commented on HIVE-7231: --- The approach results in stray writes across the stripe boundaries. I think this approach needs to be revisited to disconnect the HDFS block size from the ORC stripe size. The stripe size needs to be a factor of the HDFS block size, but the fraction should not remain at 0.5x. Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14032109#comment-14032109 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12650431/HIVE-7231.1.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 5536 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/479/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/479/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-479/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12650431 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. 
Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)