[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088968#comment-14088968 ]

Lefty Leverenz commented on HIVE-6578:
--------------------------------------

This adds the configuration parameter *hive.stats.gather.num.threads*, which is documented in the wiki here:

* [Configuration Properties -- hive.stats.gather.num.threads | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.gather.num.threads]

Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
---
Key: HIVE-6578
URL: https://issues.apache.org/jira/browse/HIVE-6578
Project: Hive
Issue Type: New Feature
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: TODOC13, orcfile
Fix For: 0.13.0
Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt

ORC provides file-level statistics that can be used in the analyze partialscan and noscan cases to compute basic statistics such as number of rows, number of files, total file size, and raw data size. On the writer side, a new interface (StatsProvidingRecordWriter) was added earlier that exposes stats when writing a table. Similarly, a new interface, StatsProvidingRecordReader, can be added which, when implemented, should provide the stats gathered by the underlying file format.

-- This message was sent by Atlassian JIRA (v6.2#6252)
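The reader-side idea in the issue description can be illustrated with a small, self-contained Java sketch. This is not the code from the HIVE-6578 patches: `BasicStats`, `StatsProvidingReaderSketch`, `FooterBackedReader` and `FooterStatsSketch` are hypothetical names invented for illustration, and the "footer" is just two numbers passed to a constructor. It only shows the general shape, assuming a reader can hand back row count and raw data size without scanning any rows, so that an analyze-style caller can sum them across files.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified holder for stats a file format might expose. */
final class BasicStats {
    final long rowCount;
    final long rawDataSize;

    BasicStats(long rowCount, long rawDataSize) {
        this.rowCount = rowCount;
        this.rawDataSize = rawDataSize;
    }
}

/** Hypothetical reader-side counterpart to a stats-providing writer. */
interface StatsProvidingReaderSketch {
    /** Returns stats already known to the file format; no row scan implied. */
    BasicStats getStats();
}

/** Stand-in reader whose stats are assumed to come from a file footer. */
final class FooterBackedReader implements StatsProvidingReaderSketch {
    private final BasicStats footerStats;

    FooterBackedReader(long rowCount, long rawDataSize) {
        // A real reader would parse these from the file; here they are given.
        this.footerStats = new BasicStats(rowCount, rawDataSize);
    }

    @Override
    public BasicStats getStats() {
        return footerStats;
    }
}

final class FooterStatsSketch {
    /** Sums per-file stats, one reader per file. */
    static BasicStats aggregate(List<? extends StatsProvidingReaderSketch> readers) {
        long rows = 0L;
        long raw = 0L;
        for (StatsProvidingReaderSketch reader : readers) {
            BasicStats s = reader.getStats();
            rows += s.rowCount;
            raw += s.rawDataSize;
        }
        return new BasicStats(rows, raw);
    }

    public static void main(String[] args) {
        List<FooterBackedReader> readers = Arrays.asList(
                new FooterBackedReader(100L, 4096L),
                new FooterBackedReader(250L, 8192L));
        BasicStats total = aggregate(readers);
        // The number of files is simply the number of readers consulted.
        System.out.println("numFiles=" + readers.size()
                + " numRows=" + total.rowCount
                + " rawDataSize=" + total.rawDataSize);
    }
}
```

Keeping the interface to a single accessor means a file format without footer statistics simply does not implement it, and the caller can fall back to a full scan.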
[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938123#comment-13938123 ]

Prasanth J commented on HIVE-6578:
----------------------------------

Test failure is not related.

Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
---
Key: HIVE-6578
URL: https://issues.apache.org/jira/browse/HIVE-6578
Project: Hive
Issue Type: New Feature
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt
[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937304#comment-13937304 ]

Hive QA commented on HIVE-6578:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634958/HIVE-6578.4.patch.txt

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5407 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1852/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1852/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634958

Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
---
Key: HIVE-6578
URL: https://issues.apache.org/jira/browse/HIVE-6578
Project: Hive
Issue Type: New Feature
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt
[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936065#comment-13936065 ]

Hive QA commented on HIVE-6578:
-------------------------------

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634864/HIVE-6578.4.patch

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1798/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1798/console

Messages:
{noformat}
This message was trimmed, see log for full details
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/hwi/src/test/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hwi ---
[INFO] Executing tasks
main:
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/tmp
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/warehouse
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/tmp/conf
[copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/tmp/conf
[INFO] Executed tasks
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-hwi ---
[INFO] Compiling 2 source files to /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/test-classes
[INFO]
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-hwi ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-hwi ---
[INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/hive-hwi-0.14.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-hwi ---
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-hwi ---
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/hwi/target/hive-hwi-0.14.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-hwi/0.14.0-SNAPSHOT/hive-hwi-0.14.0-SNAPSHOT.jar
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/hwi/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-hwi/0.14.0-SNAPSHOT/hive-hwi-0.14.0-SNAPSHOT.pom
[INFO]
[INFO]
[INFO] Building Hive ODBC 0.14.0-SNAPSHOT
[INFO]
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-odbc ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/odbc (includes = [datanucleus.log, derby.log], excludes = [])
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-odbc ---
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-odbc ---
[INFO] Executing tasks
main:
[INFO] Executed tasks
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-odbc ---
[INFO] Executing tasks
main:
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/odbc/target/tmp
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/odbc/target/warehouse
[mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/odbc/target/tmp/conf
[copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/odbc/target/tmp/conf
[INFO] Executed tasks
[INFO]
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-odbc ---
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-odbc ---
[INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/odbc/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-odbc/0.14.0-SNAPSHOT/hive-odbc-0.14.0-SNAPSHOT.pom
[INFO]
[INFO]
[INFO] Building Hive Shims Aggregator 0.14.0-SNAPSHOT
[INFO]
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-shims-aggregator ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/shims (includes = [datanucleus.log, derby.log], excludes = [])
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-shims-aggregator ---
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-shims-aggregator ---
[INFO] Executing tasks
main:
[INFO] Executed tasks
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-shims-aggregator ---
[INFO] Executing tasks
main:
[mkdir] Created dir:
[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934807#comment-13934807 ]

Lefty Leverenz commented on HIVE-6578:
--------------------------------------

Editorial nit: "... for file formats that implements ..." should be "implement" in the comment (HiveConf.java) and the description (hive-default.xml.template).

Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
---
Key: HIVE-6578
URL: https://issues.apache.org/jira/browse/HIVE-6578
Project: Hive
Issue Type: New Feature
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch
[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935790#comment-13935790 ]

Sergey Shelukhin commented on HIVE-6578:
----------------------------------------

Some minor comments on new changes, can be fixed on commit. +1 otherwise if tests pass.

Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
---
Key: HIVE-6578
URL: https://issues.apache.org/jira/browse/HIVE-6578
Project: Hive
Issue Type: New Feature
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
Labels: orcfile
Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch