[jira] [Updated] (HIVE-4632) Use hadoop counter as a stat publisher
[ https://issues.apache.org/jira/browse/HIVE-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4632: Status: Open (was: Patch Available) missed DummyStatsAggregator Use hadoop counter as a stat publisher -- Key: HIVE-4632 URL: https://issues.apache.org/jira/browse/HIVE-4632 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.12.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4632.4.patch.txt, HIVE-4632.5.patch.txt Currently stats are all long/aggregation type and can be safely acquired by hadoop counter without other db or hbase. -- This message was sent by Atlassian JIRA (v6.1#6144)
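As context for the patch being discussed: the idea is that every statistic Hive publishes is a long that gets summed across tasks, so a plain Hadoop counter can carry it instead of a JDBC or HBase store. Below is a minimal, hypothetical sketch of that approach; the class name, counter-group name, and method shapes are assumptions made for illustration (the pre-commit logs later in this thread suggest the real classes are CounterStatsPublisher and CounterStatsAggregator under ql/src/java/org/apache/hadoop/hive/ql/stats), not the actual HIVE-4632 code.
{noformat}
import java.util.Map;

import org.apache.hadoop.mapred.Reporter;

/**
 * Hypothetical sketch of a counter-based stats publisher (not the HIVE-4632 patch itself).
 * The idea from the JIRA: every stat Hive gathers is a long that is summed across tasks,
 * so a plain Hadoop counter can carry it instead of an external DB or HBase table.
 */
public class CounterStatsPublisherSketch {

  // Counter group under which per-partition stats are published; the name is made up here.
  private static final String GROUP = "HIVE_STATS";

  private Reporter reporter;

  /** Nothing to connect to: counters live in the MR framework itself. */
  public boolean connect(Reporter reporter) {
    this.reporter = reporter;
    return reporter != null;
  }

  /**
   * Publish stats for one output partition/file. A shape similar to Hive's StatsPublisher
   * interface (an identifier plus a map of stat name to value) is assumed here.
   */
  public boolean publishStat(String fileID, Map<String, String> stats) {
    if (reporter == null) {
      return false;
    }
    for (Map.Entry<String, String> e : stats.entrySet()) {
      // e.g. group=HIVE_STATS, counter="<fileID>/rowCount"; the framework sums the values.
      reporter.incrCounter(GROUP, fileID + "/" + e.getKey(), Long.parseLong(e.getValue()));
    }
    return true;
  }

  /** Counters need no teardown. */
  public boolean closeConnection() {
    return true;
  }
}
{noformat}
The aggregator side would then read the finished job's counter group and sum values by key prefix; no external connection is needed in either direction.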
[jira] [Updated] (HIVE-4632) Use hadoop counter as a stat publisher
[ https://issues.apache.org/jira/browse/HIVE-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4632: Status: Patch Available (was: Open) Use hadoop counter as a stat publisher -- Key: HIVE-4632 URL: https://issues.apache.org/jira/browse/HIVE-4632 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.12.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4632.4.patch.txt, HIVE-4632.5.patch.txt Currently stats are all long/aggregation type and can be safely acquired by hadoop counter without other db or hbase. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4632) Use hadoop counter as a stat publisher
[ https://issues.apache.org/jira/browse/HIVE-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4632: Attachment: HIVE-4632.5.patch.txt Use hadoop counter as a stat publisher -- Key: HIVE-4632 URL: https://issues.apache.org/jira/browse/HIVE-4632 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.12.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4632.4.patch.txt, HIVE-4632.5.patch.txt Currently stats are all long/aggregation type and can be safely acquired by hadoop counter without other db or hbase. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4632) Use hadoop counter as a stat publisher
[ https://issues.apache.org/jira/browse/HIVE-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825175#comment-13825175 ] Phabricator commented on HIVE-4632: --- ashutoshc has requested changes to the revision HIVE-4632 [jira] Use hadoop counter as a stat publisher. Thanks for making changes. Lets also have counter as default in HiveConf.java INLINE COMMENTS common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:606 As I said on jira, lets have counter as default. REVISION DETAIL https://reviews.facebook.net/D11001 BRANCH HIVE-4632 ARCANIST PROJECT hive To: JIRA, ashutoshc, navis Cc: ashutoshc Use hadoop counter as a stat publisher -- Key: HIVE-4632 URL: https://issues.apache.org/jira/browse/HIVE-4632 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.12.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4632.4.patch.txt, HIVE-4632.5.patch.txt Currently stats are all long/aggregation type and can be safely acquired by hadoop counter without other db or hbase. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Review Request 15525: HIVE-3107: Improve semantic analyzer to better handle column name references in group by/sort by clauses
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15525/#review29037 --- Ship it! Ship It! - Ashutosh Chauhan On Nov. 18, 2013, 4:05 a.m., Harish Butani wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15525/ --- (Updated Nov. 18, 2013, 4:05 a.m.) Review request for hive, Ashutosh Chauhan and Xuefu Zhang. Bugs: hive-3107 https://issues.apache.org/jira/browse/hive-3107 Repository: hive-git Description --- Following queries all fail with various SemanticExceptions: explain select t.c from t group by c; explain select t.c from t group by c sort by t.c; explain select t.c as c0 from t group by c0; explain select t.c from t group by t.c sort by t.c; Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java 7a7f3ef ql/src/java/org/apache/hadoop/hive/ql/parse/RowResolver.java 908546e ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d797407 ql/src/test/queries/clientnegative/notable_alias3.q 6cc3e87 ql/src/test/queries/clientpositive/groupby_resolution.q PRE-CREATION ql/src/test/queries/clientpositive/notable_alias3.q PRE-CREATION ql/src/test/results/clientnegative/notable_alias3.q.out cadca6e ql/src/test/results/clientpositive/groupby_resolution.q.out PRE-CREATION ql/src/test/results/clientpositive/notable_alias3.q.out PRE-CREATION ql/src/test/results/compiler/errors/nonkey_groupby.q.out a13d45d ql/src/test/results/compiler/plan/groupby1.q.xml 485c323 ql/src/test/results/compiler/plan/groupby5.q.xml abdbff0 Diff: https://reviews.apache.org/r/15525/diff/ Testing --- added groupby_resolution.q that tests the jira listed. Also added tests for having and windowing. Thanks, Harish Butani
[jira] [Updated] (HIVE-4632) Use hadoop counter as a stat publisher
[ https://issues.apache.org/jira/browse/HIVE-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4632: Attachment: HIVE-4632.6.patch.txt Changed default value. Ubuntu makes me dizzy after upgrading to 12.04. Use hadoop counter as a stat publisher -- Key: HIVE-4632 URL: https://issues.apache.org/jira/browse/HIVE-4632 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.12.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4632.4.patch.txt, HIVE-4632.5.patch.txt, HIVE-4632.6.patch.txt Currently stats are all long/aggregation type and can be safely acquired by hadoop counter without other db or hbase. -- This message was sent by Atlassian JIRA (v6.1#6144)
Hive 0.13 SNAPSHOT build fail
Hi, I want to integrate Hive with HBase 0.96 and Hadoop 2.2. I found that Hive 0.13 supports them, so I checked out the 0.13 snapshot just now, but I get the error below when building. All sub-components build successfully, but the build fails at the Hive task. Can someone help resolve it? Thanks. [mqingping@LDEV-D042 hive]$ mvn clean -e package assembly:assembly -DskipTests .. [INFO] org/apache/hadoop/hive/shims/Hadoop23Shims$2.class already added, skipping [INFO] org/apache/hadoop/hive/shims/Jetty23Shims$1.class already added, skipping [INFO] org/apache/hadoop/mapred/WebHCatJTShim23.class already added, skipping [INFO] META-INF/maven/org.apache.hive.shims/hive-shims-0.23/pom.xml already added, skipping [INFO] META-INF/maven/org.apache.hive.shims/hive-shims-0.23/pom.properties already added, skipping [INFO] [INFO] Reactor Summary: [INFO] [INFO] Hive .. FAILURE [1.170s] [INFO] Hive Ant Utilities SUCCESS [1.857s] [INFO] Hive Shims Common . SUCCESS [0.577s] [INFO] Hive Shims 0.20 ... SUCCESS [0.353s] [INFO] Hive Shims Secure Common .. SUCCESS [0.582s] [INFO] Hive Shims 0.20S .. SUCCESS [0.700s] [INFO] Hive Shims 0.23 ... SUCCESS [0.615s] [INFO] Hive Shims SUCCESS [1.709s] [INFO] Hive Common ... SUCCESS [3.335s] [INFO] Hive Serde SUCCESS [2.588s] [INFO] Hive Metastore SUCCESS [8.542s] [INFO] Hive Query Language ... SUCCESS [17.326s] [INFO] Hive Service .. SUCCESS [1.511s] [INFO] Hive JDBC . SUCCESS [0.306s] [INFO] Hive Beeline .. SUCCESS [0.283s] [INFO] Hive CLI .. SUCCESS [0.304s] [INFO] Hive Contrib .. SUCCESS [0.303s] [INFO] Hive HBase Handler SUCCESS [0.387s] [INFO] Hive HCatalog . SUCCESS [0.067s] [INFO] Hive HCatalog Core SUCCESS [3.311s] [INFO] Hive HCatalog Pig Adapter . SUCCESS [0.343s] [INFO] Hive HCatalog Server Extensions ... SUCCESS [0.272s] [INFO] Hive HCatalog Webhcat Java Client . SUCCESS [0.265s] [INFO] Hive HCatalog Webhcat . SUCCESS [3.906s] [INFO] Hive HCatalog HBase Storage Handler ... SUCCESS [1.215s] [INFO] Hive HWI .. SUCCESS [0.182s] [INFO] Hive ODBC . SUCCESS [0.043s] [INFO] Hive Shims Aggregator . SUCCESS [0.035s] [INFO] Hive TestUtils SUCCESS [0.067s] [INFO] Hive Packaging SUCCESS [0.080s] [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 53.706s [INFO] Finished at: Mon Nov 18 18:54:13 CST 2013 [INFO] Final Memory: 90M/1423M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single (uberjar) on project hive-shims: Execution uberjar of goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single failed. NullPointerException - [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single (uberjar) on project hive-shims: Execution uberjar of goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single failed. 
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:224) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) at org.apache.maven.lifecycle.internal.MojoExecutor.executeForkedExecutions(MojoExecutor.java:364) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:198) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84) at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59) at
How do you run single query test(s) after mavenization?
I'm trying to run as per the updated Contributing guide (https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute): mvn test -Dtest=TestCliDriver -Dqfile=vectorized_mapjoin.q (The guide actually recommends -Dcase=TestCliDriver, but using -Dcase executes all tests. In fact, -Dtest=... is recommended just a few lines above; I guess -Dcase=... is a typo.) But the run succeeds without actually executing any query test (I tried removing -Dqfile= and it does not make any difference). I attached the output of the mvn test -Dtest=TestCliDriver run, if it sheds any light. Thanks, ~Remus
[jira] [Updated] (HIVE-5771) Constant propagation optimizer for Hive
[ https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Xu updated HIVE-5771: - Attachment: HIVE-5771.2.patch Constant propagation optimizer for Hive --- Key: HIVE-5771 URL: https://issues.apache.org/jira/browse/HIVE-5771 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Ted Xu Assignee: Ted Xu Attachments: HIVE-5771.1.patch, HIVE-5771.2.patch, HIVE-5771.patch Currently there is no constant folding/propagation optimizer; all expressions are evaluated at runtime. HIVE-2470 did a great job of evaluating constants during the UDF initialization phase; however, it is still a runtime evaluation, and it doesn't propagate constants from a subquery to the outside. It may reduce I/O and accelerate processing if we introduce such an optimizer. -- This message was sent by Atlassian JIRA (v6.1#6144)
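For readers new to the proposal, the sketch below illustrates what compile-time constant folding means on a toy expression tree. The Expr classes are invented for this example and are not Hive's ExprNodeDesc API; the point is only that x + (2 + 3) can be rewritten to x + 5 once at plan time instead of being evaluated per row.
{noformat}
/**
 * Minimal illustration of compile-time constant folding, the kind of rewrite
 * HIVE-5771 proposes for the expression/operator tree. The Expr hierarchy
 * below is invented for the example and is not Hive's ExprNodeDesc.
 */
public final class ConstantFoldingSketch {

  interface Expr {}

  static final class Const implements Expr {
    final long value;
    Const(long value) { this.value = value; }
  }

  static final class Column implements Expr {
    final String name;
    Column(String name) { this.name = name; }
  }

  static final class Add implements Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
  }

  /** Recursively fold: if both children of an Add are constants, replace it with their sum. */
  static Expr fold(Expr e) {
    if (e instanceof Add) {
      Expr l = fold(((Add) e).left);
      Expr r = fold(((Add) e).right);
      if (l instanceof Const && r instanceof Const) {
        return new Const(((Const) l).value + ((Const) r).value);
      }
      return new Add(l, r);
    }
    return e; // constants and columns are already folded
  }

  public static void main(String[] args) {
    // (col + (2 + 3)) folds to (col + 5); the column part stays, so this is folding
    // plus the hook where propagating a constant produced by a subquery would plug in.
    Expr folded = fold(new Add(new Column("key"), new Add(new Const(2), new Const(3))));
    System.out.println(folded instanceof Add); // true: partially folded
  }
}
{noformat}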
[jira] [Commented] (HIVE-5369) Annotate hive operator tree with statistics from metastore
[ https://issues.apache.org/jira/browse/HIVE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825299#comment-13825299 ] Hive QA commented on HIVE-5369: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614341/HIVE-5369.9.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4617 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/338/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/338/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12614341 Annotate hive operator tree with statistics from metastore -- Key: HIVE-5369 URL: https://issues.apache.org/jira/browse/HIVE-5369 Project: Hive Issue Type: New Feature Components: Query Processor, Statistics Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: statistics Fix For: 0.13.0 Attachments: HIVE-5369.1.txt, HIVE-5369.2.WIP.txt, HIVE-5369.2.patch.txt, HIVE-5369.3.patch.txt, HIVE-5369.4.patch.txt, HIVE-5369.5.patch.txt, HIVE-5369.6.patch.txt, HIVE-5369.7.patch.txt, HIVE-5369.8.patch.txt, HIVE-5369.9.patch, HIVE-5369.9.patch.txt, HIVE-5369.WIP.txt, HIVE-5369.refactor.WIP.txt Currently the statistics gathered at table/partition level and column level are not used during query planning stage. Statistics at table/partition and column level can be used for optimizing the query plans. Basic statistics like uncompressed data size can be used for better reducer estimation. Other statistics like number of rows, distinct values of columns, average length of columns etc. can be used by Cost Based Optimizer (CBO) for making better query plan selection. As a first step in improving query planning the statistics that are available in the metastore should be attached to hive operator tree. The operator tree should be walked and annotated with statistics information. The attached statistics will vary for each operator depending on the operation it performs. For example, select operator will change the average row size but doesn't affect the number of rows. Similarly filter operator will change the number of rows but doesn't change the average row size. Similar rules can be applied for other operators as well. Rules for different operators are added as comments in the code. For more detailed information, the reference book that I am using is Database Systems: The Complete Book by Garcia-Molina et.al. -- This message was sent by Atlassian JIRA (v6.1#6144)
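The description above states the per-operator rules in prose; the hypothetical sketch below (class and method names invented, not the HIVE-5369 code) shows their shape: a select changes only the average row size, while a filter scales only the row count.
{noformat}
/**
 * Illustrative-only sketch of the per-operator statistics rules described in HIVE-5369.
 * The Stats class and the rule methods are invented for this example; the real patch
 * annotates Hive's operator tree through its own walker and annotation classes.
 */
public final class StatsAnnotationSketch {

  static final class Stats {
    final long numRows;
    final double avgRowSize; // bytes
    Stats(long numRows, double avgRowSize) {
      this.numRows = numRows;
      this.avgRowSize = avgRowSize;
    }
    long dataSize() { return (long) (numRows * avgRowSize); }
  }

  /** Select: projecting fewer/narrower columns changes row width, never row count. */
  static Stats applySelect(Stats in, double projectedWidthFraction) {
    return new Stats(in.numRows, in.avgRowSize * projectedWidthFraction);
  }

  /** Filter: a predicate keeps some fraction of rows (selectivity); row width is unchanged. */
  static Stats applyFilter(Stats in, double selectivity) {
    return new Stats((long) (in.numRows * selectivity), in.avgRowSize);
  }

  public static void main(String[] args) {
    Stats tableScan = new Stats(1_000_000L, 200.0);   // basic stats from the metastore
    Stats afterFilter = applyFilter(tableScan, 0.1);  // e.g. an equality predicate
    Stats afterSelect = applySelect(afterFilter, 0.5); // project half the row width
    System.out.println(afterSelect.numRows + " rows, ~" + afterSelect.dataSize() + " bytes");
  }
}
{noformat}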
Re: Hive 0.13 SNAPSHOT build fail
try mvn install -DskipTests first Best Regards Jin Jie Sent from my mobile device. On Nov 18, 2013 8:08 PM, Meng QingPing mqingp...@gmail.com wrote: Hi, I want to inetegrate Hive with HBase 0.96 and Hadoop 2.2. I found Hive 0.13 support them. So I checkout the 0.13 snapshot just now, but get bellow error when build. All sub components built success, but fail at Hive task. Can someone help resolve it? Thanks. [mqingping@LDEV-D042 hive]$ mvn clean -e package assembly:assembly -DskipTests .. [INFO] org/apache/hadoop/hive/shims/Hadoop23Shims$2.class already added, skipping [INFO] org/apache/hadoop/hive/shims/Jetty23Shims$1.class already added, skipping [INFO] org/apache/hadoop/mapred/WebHCatJTShim23.class already added, skipping [INFO] META-INF/maven/org.apache.hive.shims/hive-shims-0.23/pom.xml already added, skipping [INFO] META-INF/maven/org.apache.hive.shims/hive-shims-0.23/pom.properties already added, skipping [INFO] [INFO] Reactor Summary: [INFO] [INFO] Hive .. FAILURE [1.170s] [INFO] Hive Ant Utilities SUCCESS [1.857s] [INFO] Hive Shims Common . SUCCESS [0.577s] [INFO] Hive Shims 0.20 ... SUCCESS [0.353s] [INFO] Hive Shims Secure Common .. SUCCESS [0.582s] [INFO] Hive Shims 0.20S .. SUCCESS [0.700s] [INFO] Hive Shims 0.23 ... SUCCESS [0.615s] [INFO] Hive Shims SUCCESS [1.709s] [INFO] Hive Common ... SUCCESS [3.335s] [INFO] Hive Serde SUCCESS [2.588s] [INFO] Hive Metastore SUCCESS [8.542s] [INFO] Hive Query Language ... SUCCESS [17.326s] [INFO] Hive Service .. SUCCESS [1.511s] [INFO] Hive JDBC . SUCCESS [0.306s] [INFO] Hive Beeline .. SUCCESS [0.283s] [INFO] Hive CLI .. SUCCESS [0.304s] [INFO] Hive Contrib .. SUCCESS [0.303s] [INFO] Hive HBase Handler SUCCESS [0.387s] [INFO] Hive HCatalog . SUCCESS [0.067s] [INFO] Hive HCatalog Core SUCCESS [3.311s] [INFO] Hive HCatalog Pig Adapter . SUCCESS [0.343s] [INFO] Hive HCatalog Server Extensions ... SUCCESS [0.272s] [INFO] Hive HCatalog Webhcat Java Client . SUCCESS [0.265s] [INFO] Hive HCatalog Webhcat . SUCCESS [3.906s] [INFO] Hive HCatalog HBase Storage Handler ... SUCCESS [1.215s] [INFO] Hive HWI .. SUCCESS [0.182s] [INFO] Hive ODBC . SUCCESS [0.043s] [INFO] Hive Shims Aggregator . SUCCESS [0.035s] [INFO] Hive TestUtils SUCCESS [0.067s] [INFO] Hive Packaging SUCCESS [0.080s] [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 53.706s [INFO] Finished at: Mon Nov 18 18:54:13 CST 2013 [INFO] Final Memory: 90M/1423M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single (uberjar) on project hive-shims: Execution uberjar of goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single failed. NullPointerException - [Help 1] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single (uberjar) on project hive-shims: Execution uberjar of goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single failed. at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:224) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) at org.apache.maven.lifecycle.internal.MojoExecutor.executeForkedExecutions(MojoExecutor.java:364) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:198) at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) at
RE: How do you run single query test(s) after mavenization?
Never mind, I discovered https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-HowdoIruntheclientpositive%2Fclientnegativeunittests%3F cd itests/qtest mvn test -Dtest=TestCliDriver I still get failures, but at least now I can investigate: Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 31.9 sec FAILURE! - in org.apache.hadoop.hive.cli.TestCliDriver initializationError(org.apache.hadoop.hive.cli.TestCliDriver) Time elapsed: 0.005 sec FAILURE! java.lang.AssertionError: null at org.apache.hadoop.hive.ql.QTestUtil.getHdfsUriString(QTestUtil.java:288) at org.apache.hadoop.hive.ql.QTestUtil.convertPathsFromWindowsToHdfs(QTestUtil.java:276) at org.apache.hadoop.hive.ql.QTestUtil.initConf(QTestUtil.java:233) at org.apache.hadoop.hive.ql.QTestUtil.init(QTestUtil.java:317) at org.apache.hadoop.hive.cli.TestCliDriver.<clinit>(TestCliDriver.java:39) From: Remus Rusanu [mailto:rem...@microsoft.com] Sent: Monday, November 18, 2013 2:30 PM To: dev@hive.apache.org Cc: Ashutosh Chauhan; Tony Murphy (HDINSIGHT); Eric Hanson (SQL SERVER) Subject: How do you run single query test(s) after mavenization? I'm trying to run as per the updated Contributing guide (https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute): mvn test -Dtest=TestCliDriver -Dqfile=vectorized_mapjoin.q (The guide actually recommends -Dcase=TestCliDriver, but using -Dcase executes all tests. In fact, -Dtest=... is recommended just a few lines above; I guess -Dcase=... is a typo.) But the run succeeds without actually executing any query test (I tried removing -Dqfile= and it does not make any difference). I attached the output of the mvn test -Dtest=TestCliDriver run, if it sheds any light. Thanks, ~Remus
[jira] [Commented] (HIVE-4632) Use hadoop counter as a stat publisher
[ https://issues.apache.org/jira/browse/HIVE-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825416#comment-13825416 ] Hive QA commented on HIVE-4632: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614357/HIVE-4632.6.patch.txt {color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 4609 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_aggregator_error_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_publisher_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_publisher_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_publisher_error_2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_case_sensitivity org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input7 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input9 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testsequencefile org.apache.hadoop.hive.ql.parse.TestParse.testParse_join1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample7 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/339/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/339/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 27 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12614357 Use hadoop counter as a stat publisher -- Key: HIVE-4632 URL: https://issues.apache.org/jira/browse/HIVE-4632 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.12.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4632.4.patch.txt, HIVE-4632.5.patch.txt, HIVE-4632.6.patch.txt Currently stats are all long/aggregation type and can be safely acquired by hadoop counter without other db or hbase. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5565) Limit Hive decimal type maximum precision and scale to 38
[ https://issues.apache.org/jira/browse/HIVE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825423#comment-13825423 ] Hive QA commented on HIVE-5565: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614198/HIVE-5565.2.patch Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/341/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/341/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-341/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'conf/hive-default.xml.template' Reverted 'hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsAggregator.java' Reverted 'data/conf/hive-site.xml' Reverted 'itests/qtest/pom.xml' Reverted 'itests/util/src/main/java/org/apache/hadoop/hive/ql/stats/DummyStatsAggregator.java' Reverted 'itests/util/src/main/java/org/apache/hadoop/hive/ql/stats/KeyVerifyingStatsAggregator.java' Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' Reverted 'common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestStatsPublisherEnhanced.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/stats/StatsAggregator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/stats/StatsFactory.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/assembly/target shims/0.20S/target shims/0.23/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target hcatalog/server-extensions/target hcatalog/core/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/test/results/clientpositive/stats_counter.q.out 
ql/src/test/queries/clientpositive/stats_counter.q ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsAggregator.java ql/src/java/org/apache/hadoop/hive/ql/stats/CounterStatsPublisher.java + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1543062. At revision 1543062. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch Going to apply patch with: patch -p0 patching file common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java patching file common/src/test/org/apache/hadoop/hive/common/type/TestHiveDecimal.java patching file jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcStruct.java patching file ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java patching file ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java patching file ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBridge.java patching file ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java patching file
[jira] [Commented] (HIVE-5565) Limit Hive decimal type maximum precision and scale to 38
[ https://issues.apache.org/jira/browse/HIVE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825431#comment-13825431 ] Brock Noland commented on HIVE-5565: Looks like this patch needs some rebasing, as DEFAULT_PRECISION is still used here: https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils.java#L423 Limit Hive decimal type maximum precision and scale to 38 - Key: HIVE-5565 URL: https://issues.apache.org/jira/browse/HIVE-5565 Project: Hive Issue Type: Task Components: Types Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5565.1.patch, HIVE-5565.2.patch, HIVE-5565.patch With HIVE-3976, the maximum precision is set to 65, and the maximum scale to 30. After discussing with several folks in the community, it's determined that 38 as a maximum for both precision and scale is probably sufficient, in addition to the potential performance boost that might become possible for some implementations. This task is to make such a change. The change is expected to be trivial, but it may impact many test cases. The reason for a separate JIRA is that the patch in HIVE-3976 is already in good shape. Rather than destabilizing a bigger patch, a dedicated patch will facilitate both reviews. The wiki document will be updated shortly. -- This message was sent by Atlassian JIRA (v6.1#6144)
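As a rough illustration of what capping precision and scale at 38 means for a runtime value, here is a small sketch using java.math.BigDecimal. The constants and the enforcement helper are assumptions for this example, not Hive's HiveDecimal implementation that the patch modifies.
{noformat}
import java.math.BigDecimal;
import java.math.RoundingMode;

/**
 * Sketch of enforcing a 38/38 maximum for decimal precision and scale, the limit
 * HIVE-5565 proposes. This is not Hive's HiveDecimal code; it only shows the rule:
 * trim excess scale by rounding, and reject values whose integer part alone
 * already needs more than 38 digits.
 */
public final class DecimalLimitSketch {

  static final int MAX_PRECISION = 38;
  static final int MAX_SCALE = 38;

  /** Returns a value that fits (precision <= 38, scale <= 38), or null if it cannot fit. */
  static BigDecimal enforce(BigDecimal d) {
    if (d.scale() > MAX_SCALE) {
      d = d.setScale(MAX_SCALE, RoundingMode.HALF_UP);
    }
    int integerDigits = d.precision() - d.scale();
    if (integerDigits > MAX_PRECISION) {
      return null; // too many digits before the decimal point; the value cannot be represented
    }
    if (d.precision() > MAX_PRECISION) {
      // shave fractional digits until the total precision fits
      d = d.setScale(MAX_PRECISION - integerDigits, RoundingMode.HALF_UP);
    }
    return d;
  }

  public static void main(String[] args) {
    System.out.println(enforce(new BigDecimal("1.23456789012345678901234567890123456789012")));
    System.out.println(enforce(new BigDecimal("1E+60"))); // null: does not fit in 38 digits
  }
}
{noformat}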
[jira] [Updated] (HIVE-5741) Fix binary packaging build eg include hcatalog, resolve pom issues
[ https://issues.apache.org/jira/browse/HIVE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5741: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Thank you for the review! I have committed this to trunk. Fix binary packaging build eg include hcatalog, resolve pom issues -- Key: HIVE-5741 URL: https://issues.apache.org/jira/browse/HIVE-5741 Project: Hive Issue Type: Sub-task Affects Versions: 0.13.0 Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-5741.patch, HIVE-5741.patch There are a couple issues with our current binary tarball: * HCatalog is not included * We include many jars which we don't need and which break things -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5842) Fix issues with new paths to jar in hcatalog
Brock Noland created HIVE-5842: -- Summary: Fix issues with new paths to jar in hcatalog Key: HIVE-5842 URL: https://issues.apache.org/jira/browse/HIVE-5842 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland HIVE-5741 included hcatalog in the binary tarball, but some of the paths to jars are slightly different, requiring the scripts to be updated. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5809) incorrect stats in some cases with hive.stats.autogather=true
[ https://issues.apache.org/jira/browse/HIVE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-5809: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Committed to trunk. incorrect stats in some cases with hive.stats.autogather=true Key: HIVE-5809 URL: https://issues.apache.org/jira/browse/HIVE-5809 Project: Hive Issue Type: Bug Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.13.0 Attachments: HIVE-5809.patch, HIVE-5809.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5842) Fix issues with new paths to jar in hcatalog
[ https://issues.apache.org/jira/browse/HIVE-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5842: --- Attachment: HIVE-5842.patch Fix issues with new paths to jar in hcatalog Key: HIVE-5842 URL: https://issues.apache.org/jira/browse/HIVE-5842 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5842.patch HIVE-5741 included hcatalog in the binary tarball, but some of the paths to jars are slightly different, requiring the scripts to be updated. -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 15649: HIVE-5842 - Fix issues with new paths to jar in hcatalog
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15649/ --- Review request for hive. Bugs: HIVE-5842 https://issues.apache.org/jira/browse/HIVE-5842 Repository: hive-git Description --- Fixes path issues with hcatalog in maven tarball post mavenization. Also removes a comical amount of trailing whitespace in hcat scripts. Diffs - hcatalog/bin/hcat b4d4226 hcatalog/bin/hcat.py 53fc387 hcatalog/bin/hcat_server.py 51a11e6 hcatalog/bin/hcat_server.sh bf3c3f1 hcatalog/bin/hcatcfg.py 47a56d8 hcatalog/webhcat/svr/src/main/bin/webhcat_config.sh 6b0b578 hcatalog/webhcat/svr/src/main/bin/webhcat_server.sh 600c16d Diff: https://reviews.apache.org/r/15649/diff/ Testing --- Tested hcat scripts manually Thanks, Brock Noland
[jira] [Updated] (HIVE-5842) Fix issues with new paths to jar in hcatalog
[ https://issues.apache.org/jira/browse/HIVE-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-5842: --- Status: Patch Available (was: Open) Fix issues with new paths to jar in hcatalog Key: HIVE-5842 URL: https://issues.apache.org/jira/browse/HIVE-5842 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5842.patch HIVE-5741 included hcatalog in the binary tarball, but some of the paths to jars are slightly different, requiring the scripts to be updated. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5739) Cleanup transitive dependencies
[ https://issues.apache.org/jira/browse/HIVE-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland resolved HIVE-5739. Resolution: Duplicate Cleanup transitive dependencies --- Key: HIVE-5739 URL: https://issues.apache.org/jira/browse/HIVE-5739 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Priority: Critical As you can see below we have many duplicate depends from various dependencies. We need to put the correct exclusions in place. {noformat} activation-1.1.jar ant-1.6.5.jar ant-1.9.1.jar ant-launcher-1.9.1.jar antlr-2.7.7.jar antlr-runtime-3.4.jar aopalliance-1.0.jar asm-3.1.jar asm-commons-3.1.jar asm-tree-3.1.jar avro-1.5.3.jar avro-1.7.1.jar avro-ipc-1.5.3.jar avro-ipc-1.7.1.jar avro-mapred-1.7.1.jar bonecp-0.7.1.RELEASE.jar commons-beanutils-1.7.0.jar commons-beanutils-core-1.8.0.jar commons-cli-1.2.jar commons-codec-1.3.jar commons-codec-1.4.jar commons-collections-3.1.jar commons-collections-3.2.1.jar commons-compress-1.4.1.jar commons-configuration-1.6.jar commons-daemon-1.0.13.jar commons-digester-1.8.jar commons-el-1.0.jar commons-exec-1.1.jar commons-httpclient-3.0.1.jar commons-httpclient-3.1.jar commons-io-2.1.jar commons-io-2.4.jar commons-lang-2.4.jar commons-lang-2.5.jar commons-logging-1.0.4.jar commons-math-2.1.jar commons-net-1.4.1.jar commons-net-2.0.jar commons-net-3.1.jar core-3.1.1.jar datanucleus-api-jdo-3.2.1.jar datanucleus-core-3.2.2.jar datanucleus-rdbms-3.2.1.jar derby-10.4.2.0.jar ftplet-api-1.0.0.jar ftpserver-core-1.0.0.jar ftpserver-deprecated-1.0.0-M2.jar geronimo-annotation_1.0_spec-1.1.1.jar geronimo-jaspic_1.0_spec-1.0.jar geronimo-jta_1.1_spec-1.1.1.jar gmbal-api-only-3.0.0-b023.jar grizzly-framework-2.1.1.jar grizzly-framework-2.1.1-tests.jar grizzly-http-2.1.1.jar grizzly-http-server-2.1.1.jar grizzly-http-servlet-2.1.1.jar grizzly-rcm-2.1.1.jar groovy-all-2.1.6.jar guava-11.0.2.jar guava-r08.jar guice-3.0.jar guice-servlet-3.0.jar hamcrest-core-1.1.jar hbase-0.94.6.1.jar hbase-0.94.6.1-tests.jar high-scale-lib-1.1.1.jar hive-ant-0.13.0-SNAPSHOT.jar hive-cli-0.13.0-SNAPSHOT.jar hive-common-0.13.0-SNAPSHOT.jar hive-exec-0.13.0-SNAPSHOT.jar hive-hbase-handler-0.13.0-SNAPSHOT.jar hive-hcatalog-core-0.13.0-SNAPSHOT.jar hive-metastore-0.13.0-SNAPSHOT.jar hive-serde-0.13.0-SNAPSHOT.jar hive-service-0.13.0-SNAPSHOT.jar hive-shims-0.13.0-SNAPSHOT-uberjar.jar hive-shims-0.20-0.13.0-SNAPSHOT.jar hive-shims-0.20S-0.13.0-SNAPSHOT.jar hive-shims-0.23-0.13.0-SNAPSHOT.jar hive-shims-common-0.13.0-SNAPSHOT.jar hive-shims-common-secure-0.13.0-SNAPSHOT.jar hsqldb-1.8.0.10.jar httpclient-4.1.3.jar httpcore-4.1.3.jar jackson-core-asl-1.7.1.jar jackson-core-asl-1.8.8.jar jackson-core-asl-1.9.2.jar jackson-jaxrs-1.7.1.jar jackson-jaxrs-1.8.8.jar jackson-jaxrs-1.9.2.jar jackson-mapper-asl-1.8.8.jar jackson-mapper-asl-1.9.2.jar jackson-xc-1.7.1.jar jackson-xc-1.8.8.jar jackson-xc-1.9.2.jar jamon-runtime-2.3.1.jar jasper-compiler-5.5.12.jar jasper-compiler-5.5.23.jar jasper-runtime-5.5.12.jar jasper-runtime-5.5.23.jar JavaEWAH-0.3.2.jar javax.inject-1.jar javax.servlet-3.0.jar javolution-5.5.1.jar jaxb-api-2.1.jar jaxb-api-2.2.2.jar jaxb-impl-2.2.3-1.jar jdk.tools-1.6.jar jdo-api-3.0.1.jar jersey-client-1.8.jar jersey-core-1.14.jar jersey-core-1.8.jar jersey-grizzly2-1.8.jar jersey-guice-1.8.jar jersey-json-1.14.jar jersey-json-1.8.jar jersey-server-1.14.jar jersey-server-1.8.jar jersey-servlet-1.14.jar jersey-test-framework-core-1.8.jar jersey-test-framework-grizzly2-1.8.jar jets3t-0.6.1.jar jets3t-0.7.1.jar jettison-1.1.jar 
jetty-6.1.14.jar jetty-6.1.26.jar jetty-all-server-7.6.0.v20120127.jar jetty-util-6.1.14.jar jetty-util-6.1.26.jar jline-0.9.94.jar jms-1.1.jar jmxri-1.2.1.jar jmxtools-1.2.1.jar jruby-complete-1.6.5.jar jsch-0.1.42.jar json-20090211.jar jsp-2.1-6.1.14.jar jsp-api-2.1-6.1.14.jar jsp-api-2.1.jar jsr305-1.3.9.jar jta-1.1.jar jul-to-slf4j-1.6.1.jar junit-3.8.1.jar junit-4.10.jar junit-4.5.jar junit-4.8.1.jar kfs-0.3.jar kryo-2.22.jar libfb303-0.9.0.jar libthrift-0.9.0.jar log4j-1.2.15.jar log4j-1.2.16.jar log4j-1.2.17.jar mail-1.4.1.jar management-api-3.0.0-b012.jar metrics-core-2.1.2.jar mina-core-2.0.0-M5.jar netty-3.2.2.Final.jar netty-3.4.0.Final.jar netty-3.5.11.Final.jar oro-2.0.8.jar paranamer-2.2.jar paranamer-2.3.jar paranamer-ant-2.2.jar paranamer-generator-2.2.jar pig-0.10.1.jar protobuf-java-2.4.0a.jar protobuf-java-2.5.0.jar qdox-1.10.1.jar servlet-api-2.5-20081211.jar servlet-api-2.5-6.1.14.jar servlet-api-2.5.jar slf4j-api-1.6.1.jar slf4j-log4j12-1.6.1.jar snappy-0.2.jar snappy-java-1.0.3.2.jar snappy-java-1.0.4.1.jar ST4-4.0.4.jar stax-api-1.0.1.jar stax-api-1.0-2.jar
[jira] [Commented] (HIVE-5739) Cleanup transitive dependencies
[ https://issues.apache.org/jira/browse/HIVE-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825473#comment-13825473 ] Brock Noland commented on HIVE-5739: HIVE-5741 already cleaned this up. Cleanup transitive dependencies --- Key: HIVE-5739 URL: https://issues.apache.org/jira/browse/HIVE-5739 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Priority: Critical As you can see below we have many duplicate depends from various dependencies. We need to put the correct exclusions in place. {noformat} activation-1.1.jar ant-1.6.5.jar ant-1.9.1.jar ant-launcher-1.9.1.jar antlr-2.7.7.jar antlr-runtime-3.4.jar aopalliance-1.0.jar asm-3.1.jar asm-commons-3.1.jar asm-tree-3.1.jar avro-1.5.3.jar avro-1.7.1.jar avro-ipc-1.5.3.jar avro-ipc-1.7.1.jar avro-mapred-1.7.1.jar bonecp-0.7.1.RELEASE.jar commons-beanutils-1.7.0.jar commons-beanutils-core-1.8.0.jar commons-cli-1.2.jar commons-codec-1.3.jar commons-codec-1.4.jar commons-collections-3.1.jar commons-collections-3.2.1.jar commons-compress-1.4.1.jar commons-configuration-1.6.jar commons-daemon-1.0.13.jar commons-digester-1.8.jar commons-el-1.0.jar commons-exec-1.1.jar commons-httpclient-3.0.1.jar commons-httpclient-3.1.jar commons-io-2.1.jar commons-io-2.4.jar commons-lang-2.4.jar commons-lang-2.5.jar commons-logging-1.0.4.jar commons-math-2.1.jar commons-net-1.4.1.jar commons-net-2.0.jar commons-net-3.1.jar core-3.1.1.jar datanucleus-api-jdo-3.2.1.jar datanucleus-core-3.2.2.jar datanucleus-rdbms-3.2.1.jar derby-10.4.2.0.jar ftplet-api-1.0.0.jar ftpserver-core-1.0.0.jar ftpserver-deprecated-1.0.0-M2.jar geronimo-annotation_1.0_spec-1.1.1.jar geronimo-jaspic_1.0_spec-1.0.jar geronimo-jta_1.1_spec-1.1.1.jar gmbal-api-only-3.0.0-b023.jar grizzly-framework-2.1.1.jar grizzly-framework-2.1.1-tests.jar grizzly-http-2.1.1.jar grizzly-http-server-2.1.1.jar grizzly-http-servlet-2.1.1.jar grizzly-rcm-2.1.1.jar groovy-all-2.1.6.jar guava-11.0.2.jar guava-r08.jar guice-3.0.jar guice-servlet-3.0.jar hamcrest-core-1.1.jar hbase-0.94.6.1.jar hbase-0.94.6.1-tests.jar high-scale-lib-1.1.1.jar hive-ant-0.13.0-SNAPSHOT.jar hive-cli-0.13.0-SNAPSHOT.jar hive-common-0.13.0-SNAPSHOT.jar hive-exec-0.13.0-SNAPSHOT.jar hive-hbase-handler-0.13.0-SNAPSHOT.jar hive-hcatalog-core-0.13.0-SNAPSHOT.jar hive-metastore-0.13.0-SNAPSHOT.jar hive-serde-0.13.0-SNAPSHOT.jar hive-service-0.13.0-SNAPSHOT.jar hive-shims-0.13.0-SNAPSHOT-uberjar.jar hive-shims-0.20-0.13.0-SNAPSHOT.jar hive-shims-0.20S-0.13.0-SNAPSHOT.jar hive-shims-0.23-0.13.0-SNAPSHOT.jar hive-shims-common-0.13.0-SNAPSHOT.jar hive-shims-common-secure-0.13.0-SNAPSHOT.jar hsqldb-1.8.0.10.jar httpclient-4.1.3.jar httpcore-4.1.3.jar jackson-core-asl-1.7.1.jar jackson-core-asl-1.8.8.jar jackson-core-asl-1.9.2.jar jackson-jaxrs-1.7.1.jar jackson-jaxrs-1.8.8.jar jackson-jaxrs-1.9.2.jar jackson-mapper-asl-1.8.8.jar jackson-mapper-asl-1.9.2.jar jackson-xc-1.7.1.jar jackson-xc-1.8.8.jar jackson-xc-1.9.2.jar jamon-runtime-2.3.1.jar jasper-compiler-5.5.12.jar jasper-compiler-5.5.23.jar jasper-runtime-5.5.12.jar jasper-runtime-5.5.23.jar JavaEWAH-0.3.2.jar javax.inject-1.jar javax.servlet-3.0.jar javolution-5.5.1.jar jaxb-api-2.1.jar jaxb-api-2.2.2.jar jaxb-impl-2.2.3-1.jar jdk.tools-1.6.jar jdo-api-3.0.1.jar jersey-client-1.8.jar jersey-core-1.14.jar jersey-core-1.8.jar jersey-grizzly2-1.8.jar jersey-guice-1.8.jar jersey-json-1.14.jar jersey-json-1.8.jar jersey-server-1.14.jar jersey-server-1.8.jar jersey-servlet-1.14.jar jersey-test-framework-core-1.8.jar 
jersey-test-framework-grizzly2-1.8.jar jets3t-0.6.1.jar jets3t-0.7.1.jar jettison-1.1.jar jetty-6.1.14.jar jetty-6.1.26.jar jetty-all-server-7.6.0.v20120127.jar jetty-util-6.1.14.jar jetty-util-6.1.26.jar jline-0.9.94.jar jms-1.1.jar jmxri-1.2.1.jar jmxtools-1.2.1.jar jruby-complete-1.6.5.jar jsch-0.1.42.jar json-20090211.jar jsp-2.1-6.1.14.jar jsp-api-2.1-6.1.14.jar jsp-api-2.1.jar jsr305-1.3.9.jar jta-1.1.jar jul-to-slf4j-1.6.1.jar junit-3.8.1.jar junit-4.10.jar junit-4.5.jar junit-4.8.1.jar kfs-0.3.jar kryo-2.22.jar libfb303-0.9.0.jar libthrift-0.9.0.jar log4j-1.2.15.jar log4j-1.2.16.jar log4j-1.2.17.jar mail-1.4.1.jar management-api-3.0.0-b012.jar metrics-core-2.1.2.jar mina-core-2.0.0-M5.jar netty-3.2.2.Final.jar netty-3.4.0.Final.jar netty-3.5.11.Final.jar oro-2.0.8.jar paranamer-2.2.jar paranamer-2.3.jar paranamer-ant-2.2.jar paranamer-generator-2.2.jar pig-0.10.1.jar protobuf-java-2.4.0a.jar protobuf-java-2.5.0.jar qdox-1.10.1.jar servlet-api-2.5-20081211.jar servlet-api-2.5-6.1.14.jar servlet-api-2.5.jar slf4j-api-1.6.1.jar slf4j-log4j12-1.6.1.jar snappy-0.2.jar snappy-java-1.0.3.2.jar
[jira] [Commented] (HIVE-4632) Use hadoop counter as a stat publisher
[ https://issues.apache.org/jira/browse/HIVE-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825479#comment-13825479 ] Ashutosh Chauhan commented on HIVE-4632: +1 Use hadoop counter as a stat publisher -- Key: HIVE-4632 URL: https://issues.apache.org/jira/browse/HIVE-4632 Project: Hive Issue Type: Improvement Components: Statistics Affects Versions: 0.12.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4632.4.patch.txt, HIVE-4632.5.patch.txt, HIVE-4632.6.patch.txt Currently stats are all long/aggregation type and can be safely acquired by hadoop counter without other db or hbase. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5565) Limit Hive decimal type maximum precision and scale to 38
[ https://issues.apache.org/jira/browse/HIVE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5565: -- Attachment: HIVE-5565.3.patch Patch #3 rebased with the latest trunk. Limit Hive decimal type maximum precision and scale to 38 - Key: HIVE-5565 URL: https://issues.apache.org/jira/browse/HIVE-5565 Project: Hive Issue Type: Task Components: Types Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5565.1.patch, HIVE-5565.2.patch, HIVE-5565.3.patch, HIVE-5565.patch With HIVE-3976, the maximum precision is set to 65, and the maximum scale to 30. After discussing with several folks in the community, it's determined that 38 as a maximum for both precision and scale is probably sufficient, in addition to the potential performance boost that might become possible for some implementations. This task is to make such a change. The change is expected to be trivial, but it may impact many test cases. The reason for a separate JIRA is that the patch in HIVE-3976 is already in good shape. Rather than destabilizing a bigger patch, a dedicated patch will facilitate both reviews. The wiki document will be updated shortly. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5369) Annotate hive operator tree with statistics from metastore
[ https://issues.apache.org/jira/browse/HIVE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5369: - Attachment: HIVE-5369.10.patch Fixed the failing test which was recently added. Annotate hive operator tree with statistics from metastore -- Key: HIVE-5369 URL: https://issues.apache.org/jira/browse/HIVE-5369 Project: Hive Issue Type: New Feature Components: Query Processor, Statistics Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: statistics Fix For: 0.13.0 Attachments: HIVE-5369.1.txt, HIVE-5369.10.patch, HIVE-5369.2.WIP.txt, HIVE-5369.2.patch.txt, HIVE-5369.3.patch.txt, HIVE-5369.4.patch.txt, HIVE-5369.5.patch.txt, HIVE-5369.6.patch.txt, HIVE-5369.7.patch.txt, HIVE-5369.8.patch.txt, HIVE-5369.9.patch, HIVE-5369.9.patch.txt, HIVE-5369.WIP.txt, HIVE-5369.refactor.WIP.txt Currently the statistics gathered at table/partition level and column level are not used during query planning stage. Statistics at table/partition and column level can be used for optimizing the query plans. Basic statistics like uncompressed data size can be used for better reducer estimation. Other statistics like number of rows, distinct values of columns, average length of columns etc. can be used by Cost Based Optimizer (CBO) for making better query plan selection. As a first step in improving query planning the statistics that are available in the metastore should be attached to hive operator tree. The operator tree should be walked and annotated with statistics information. The attached statistics will vary for each operator depending on the operation it performs. For example, select operator will change the average row size but doesn't affect the number of rows. Similarly filter operator will change the number of rows but doesn't change the average row size. Similar rules can be applied for other operators as well. Rules for different operators are added as comments in the code. For more detailed information, the reference book that I am using is Database Systems: The Complete Book by Garcia-Molina et.al. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5356) Move arithmatic UDFs to generic UDF implementations
[ https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5356: -- Attachment: HIVE-5356.11.patch Patch #11 rebased with latest trunk. Move arithmatic UDFs to generic UDF implementations --- Key: HIVE-5356 URL: https://issues.apache.org/jira/browse/HIVE-5356 Project: Hive Issue Type: Task Components: UDF Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5356.1.patch, HIVE-5356.10.patch, HIVE-5356.11.patch, HIVE-5356.2.patch, HIVE-5356.3.patch, HIVE-5356.4.patch, HIVE-5356.5.patch, HIVE-5356.6.patch, HIVE-5356.7.patch, HIVE-5356.8.patch, HIVE-5356.9.patch Currently, all of the arithmetic operators, such as add/sub/mult/div, are implemented as old-style UDFs and java reflection is used to determine the return type TypeInfos/ObjectInspectors, based on the return type of the evaluate() method chosen for the expression. This works fine for types that don't have type params. Hive decimal type participates in these operations just like int or double. Different from double or int, however, decimal has precision and scale, which cannot be determined by just looking at the return type (decimal) of the UDF evaluate() method, even though the operands have certain precision/scale. With the default of decimal without precision/scale, then (10, 0) will be the type params. This is certainly not desirable. To solve this problem, all of the arithmetic operators would need to be implemented as GenericUDFs, which allow returning ObjectInspector during the initialize() method. The object inspectors returned can carry type params, from which the exact return type can be determined. It's worth mentioning that, for user UDF implemented in non-generic way, if the return type of the chosen evaluate() method is decimal, the return type actually has (10,0) as precision/scale, which might not be desirable. This needs to be documented. This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit the scope of review. The remaining ones will be covered under HIVE-5706. -- This message was sent by Atlassian JIRA (v6.1#6144)
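To make the precision/scale argument concrete, the sketch below shows the kind of result-type computation that becomes possible once an operator is a GenericUDF and can construct its return ObjectInspector in initialize(). The addition rule used (result scale = max(s1, s2); precision = max(p1 - s1, p2 - s2) + 1 + result scale, capped at 38) is a common SQL convention assumed here for illustration, not necessarily the exact rule adopted by this patch.
{noformat}
/**
 * Illustration only: computing the decimal result type of a + b from the operand types,
 * which an old-style UDF (return type read via reflection) cannot express but a
 * GenericUDF can, by returning a type-parameterized ObjectInspector from initialize().
 * The DecimalType holder and the addition rule are assumptions made for this sketch.
 */
public final class DecimalAddTypeSketch {

  static final int MAX_PRECISION = 38;

  static final class DecimalType {
    final int precision, scale;
    DecimalType(int precision, int scale) { this.precision = precision; this.scale = scale; }
    @Override public String toString() { return "decimal(" + precision + "," + scale + ")"; }
  }

  /** Result type of d1 + d2 under a common SQL-style rule, capped at precision 38. */
  static DecimalType plus(DecimalType a, DecimalType b) {
    int scale = Math.max(a.scale, b.scale);
    int intDigits = Math.max(a.precision - a.scale, b.precision - b.scale) + 1; // +1 for carry
    int precision = Math.min(MAX_PRECISION, intDigits + scale);
    return new DecimalType(precision, scale);
  }

  public static void main(String[] args) {
    // decimal(10,2) + decimal(5,4) -> decimal(13,4): 9 integer digits (8 + 1 carry) + 4 fraction digits
    System.out.println(plus(new DecimalType(10, 2), new DecimalType(5, 4)));
  }
}
{noformat}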
[jira] [Commented] (HIVE-5771) Constant propagation optimizer for Hive
[ https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825543#comment-13825543 ] Hive QA commented on HIVE-5771: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614382/HIVE-5771.2.patch {color:red}ERROR:{color} -1 due to 49 failed/errored test(s), 4613 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_predicate_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_allchildsarenull org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_create_table_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_19 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_between org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_find_in_set org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_reverse org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_16 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_not org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_short_regress org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_casts org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_math_funcs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_string_funcs org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_ppd_key_range org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_pushdown org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries 
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_ppd_key_ranges org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/342/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/342/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 49 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12614382 Constant propagation optimizer for Hive --- Key: HIVE-5771 URL: https://issues.apache.org/jira/browse/HIVE-5771 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Ted Xu Assignee: Ted Xu Attachments: HIVE-5771.1.patch, HIVE-5771.2.patch, HIVE-5771.patch Currently there is no constant folding/propagation
[jira] [Created] (HIVE-5843) Transaction manager for Hive
Alan Gates created HIVE-5843: Summary: Transaction manager for Hive Key: HIVE-5843 URL: https://issues.apache.org/jira/browse/HIVE-5843 Project: Hive Issue Type: Sub-task Reporter: Alan Gates Assignee: Alan Gates As part of the ACID work proposed in HIVE-5317 a transaction manager is required. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825611#comment-13825611 ] Edward Capriolo commented on HIVE-5317: --- I have two fundamental problems with this concept. {quote} The only requirement is that the file format must be able to support a rowid. With things like text and sequence file this can be done via a byte offset. {quote} This is a good reason not to do this. Things that only work for some formats create fragmentation. What about formats that do not have a row id? What if the user is already using the key for something else, like data? {quote} Once an hour a log of transactions is exported from a RDBS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. {quote} What this ticket describes seems like a bad use case for Hive. Why would the user not simply create a new table partitioned by hour? What is the need to transactionally update a table in place? It seems like the better solution would be for the user to log these updates themselves and then export the table with a tool like Sqoop periodically. I see this as a really complicated piece of work, for a narrow use case, and I have a very difficult time believing that adding transactions to Hive to support this is the right answer. Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: InsertUpdatesinHive.pdf Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the form of the queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (eg. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day a small set (up to 100k rows) of records need to be deleted for regulatory compliance. * Once an hour a log of transactions is exported from a RDBS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825619#comment-13825619 ] Edward Capriolo commented on HIVE-5317: --- By the way, I do work like this very often, and having tables that update periodically causes a lot of problems. The first is when you have to re-compute a result 4 days later. You do not want a fresh up-to-date table, you want the table as it existed 4 days ago. When you want to troubleshoot a result you do not want your intermediate tables trampled over. When you want to rebuild a month's worth of results you want to launch 31 jobs in parallel, not 31 jobs in series. In fact, in programming Hive I suggest ALWAYS partitioning these dimension tables by time and NOT doing what this ticket is describing, for the reasons above (and more). Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: InsertUpdatesinHive.pdf Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the forms of queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (e.g. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day a small set (up to 100k rows) of records needs to be deleted for regulatory compliance. * Once an hour a log of transactions is exported from an RDBMS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825647#comment-13825647 ] Roman Shaposhnik commented on HIVE-2055: Sorry for dropping by somewhat late but it looks like you've got a pretty reasonable solution with mapredcp. Hive should add HBase classpath dependencies when available --- Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.10.0 Reporter: sajith v Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch Created an external table in hive , which points to the HBase table. When tried to query a column using the column name in select clause got the following exception : ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5842) Fix issues with new paths to jar in hcatalog
[ https://issues.apache.org/jira/browse/HIVE-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825652#comment-13825652 ] Hive QA commented on HIVE-5842: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614419/HIVE-5842.patch {color:green}SUCCESS:{color} +1 4609 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/343/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/343/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12614419 Fix issues with new paths to jar in hcatalog Key: HIVE-5842 URL: https://issues.apache.org/jira/browse/HIVE-5842 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5842.patch HIVE-5741 included hcatalog in the binary tarball, but some of the paths to the jars are slightly different, requiring the scripts to be updated. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5369) Annotate hive operator tree with statistics from metastore
[ https://issues.apache.org/jira/browse/HIVE-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-5369: Resolution: Fixed Status: Resolved (was: Patch Available) Thanks, Prasanth. Nice work! Annotate hive operator tree with statistics from metastore -- Key: HIVE-5369 URL: https://issues.apache.org/jira/browse/HIVE-5369 Project: Hive Issue Type: New Feature Components: Query Processor, Statistics Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: statistics Fix For: 0.13.0 Attachments: HIVE-5369.1.txt, HIVE-5369.10.patch, HIVE-5369.2.WIP.txt, HIVE-5369.2.patch.txt, HIVE-5369.3.patch.txt, HIVE-5369.4.patch.txt, HIVE-5369.5.patch.txt, HIVE-5369.6.patch.txt, HIVE-5369.7.patch.txt, HIVE-5369.8.patch.txt, HIVE-5369.9.patch, HIVE-5369.9.patch.txt, HIVE-5369.WIP.txt, HIVE-5369.refactor.WIP.txt Currently the statistics gathered at the table/partition level and column level are not used during the query planning stage. Statistics at the table/partition and column level can be used for optimizing query plans. Basic statistics like uncompressed data size can be used for better reducer estimation. Other statistics like number of rows, distinct values of columns, average length of columns, etc. can be used by the Cost Based Optimizer (CBO) for making better query plan selection. As a first step in improving query planning, the statistics that are available in the metastore should be attached to the Hive operator tree. The operator tree should be walked and annotated with statistics information. The attached statistics will vary for each operator depending on the operation it performs. For example, a select operator will change the average row size but doesn't affect the number of rows. Similarly, a filter operator will change the number of rows but doesn't change the average row size. Similar rules can be applied for other operators as well. Rules for different operators are added as comments in the code. For more detailed information, the reference book that I am using is Database Systems: The Complete Book by Garcia-Molina et al. -- This message was sent by Atlassian JIRA (v6.1#6144)
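The per-operator rules described above are easy to make concrete: a filter scales the row count by an estimated selectivity but leaves the average row size alone, while a select keeps the row count and recomputes the average row size from the projected columns. The Java below is a schematic sketch of that annotation idea using made-up names (Stats, annotateFilter, annotateSelect); it is not the code added by HIVE-5369, just the shape of the rules.
{code}
// Schematic sketch of operator-level statistics annotation.
// The class and method names here are illustrative, not Hive's.
public final class StatsAnnotationSketch {

  static final class Stats {
    final long numRows;
    final double avgRowSize;   // bytes per row
    Stats(long numRows, double avgRowSize) {
      this.numRows = numRows;
      this.avgRowSize = avgRowSize;
    }
    long dataSize() {
      return (long) (numRows * avgRowSize);
    }
  }

  // Filter: scales the row count by an estimated selectivity,
  // leaves the average row size unchanged.
  static Stats annotateFilter(Stats in, double selectivity) {
    return new Stats((long) (in.numRows * selectivity), in.avgRowSize);
  }

  // Select (projection): keeps the row count, recomputes the average
  // row size from the width of the projected columns.
  static Stats annotateSelect(Stats in, double projectedRowSize) {
    return new Stats(in.numRows, projectedRowSize);
  }

  public static void main(String[] args) {
    Stats scan = new Stats(1_000_000L, 200.0);          // basic stats from the metastore
    Stats filtered = annotateFilter(scan, 0.1);         // WHERE keeps ~10% of the rows
    Stats projected = annotateSelect(filtered, 40.0);   // SELECT a few narrow columns
    System.out.println(filtered.numRows + " rows, " + projected.dataSize() + " bytes");
  }
}
{code}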
[jira] [Updated] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-2055: --- Status: Patch Available (was: Open) Hive should add HBase classpath dependencies when available --- Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.10.0 Reporter: sajith v Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch Created an external table in hive , which points to the HBase table. When tried to query a column using the column name in select clause got the following exception : ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-2055: --- Attachment: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch Here's an updated patch to the launch script based on the new hbase command. Please excuse my bash scripting; I'm not a native speaker. [~rvs] you're just in time ;) Hive should add HBase classpath dependencies when available --- Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.10.0 Reporter: sajith v Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch Created an external table in hive , which points to the HBase table. When tried to query a column using the column name in select clause got the following exception : ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-3107: Attachment: HIVE-3107.6.patch remove -ve test clustern.q; move query to gby_resolution.q as a +ve test case Improve semantic analyzer to better handle column name references in group by/sort by clauses - Key: HIVE-3107 URL: https://issues.apache.org/jira/browse/HIVE-3107 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0 Reporter: Richard Ding Assignee: Harish Butani Attachments: HIVE-3107.1.patch, HIVE-3107.2.patch, HIVE-3107.3.patch, HIVE-3107.4.patch, HIVE-3107.5.patch, HIVE-3107.6.patch This is related to HIVE-1922. Following queries all fail with various SemanticExceptions: {code} explain select t.c from t group by c; explain select t.c from t group by c sort by t.c; explain select t.c as c0 from t group by c0; explain select t.c from t group by t.c sort by t.c; {code} It is true that one could always find a version of any of above queries that works. But one has to try to find out and it doesn't work well with machine generated SQL queries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-3107: Status: Open (was: Patch Available) Improve semantic analyzer to better handle column name references in group by/sort by clauses - Key: HIVE-3107 URL: https://issues.apache.org/jira/browse/HIVE-3107 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.9.0 Reporter: Richard Ding Assignee: Harish Butani Attachments: HIVE-3107.1.patch, HIVE-3107.2.patch, HIVE-3107.3.patch, HIVE-3107.4.patch, HIVE-3107.5.patch, HIVE-3107.6.patch This is related to HIVE-1922. Following queries all fail with various SemanticExceptions: {code} explain select t.c from t group by c; explain select t.c from t group by c sort by t.c; explain select t.c as c0 from t group by c0; explain select t.c from t group by t.c sort by t.c; {code} It is true that one could always find a version of any of above queries that works. But one has to try to find out and it doesn't work well with machine generated SQL queries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-3107) Improve semantic analyzer to better handle column name references in group by/sort by clauses
[ https://issues.apache.org/jira/browse/HIVE-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-3107: Status: Patch Available (was: Open) Improve semantic analyzer to better handle column name references in group by/sort by clauses - Key: HIVE-3107 URL: https://issues.apache.org/jira/browse/HIVE-3107 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.9.0 Reporter: Richard Ding Assignee: Harish Butani Attachments: HIVE-3107.1.patch, HIVE-3107.2.patch, HIVE-3107.3.patch, HIVE-3107.4.patch, HIVE-3107.5.patch, HIVE-3107.6.patch This is related to HIVE-1922. Following queries all fail with various SemanticExceptions: {code} explain select t.c from t group by c; explain select t.c from t group by c sort by t.c; explain select t.c as c0 from t group by c0; explain select t.c from t group by t.c sort by t.c; {code} It is true that one could always find a version of any of above queries that works. But one has to try to find out and it doesn't work well with machine generated SQL queries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825713#comment-13825713 ] Eric Hanson commented on HIVE-5795: --- Can you put the patch on ReviewBoard to make it easier to review? Please post a link to the review here. Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5795.1.patch Hive should be able to skip header and footer lines when reading a data file for a table. In this way, users don't need to preprocess data generated by another application with a header or footer, and can use the file directly for table operations. To implement this, the idea is to add new table properties that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer would look like this: {code} Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties (skip.header.number=1, skip.footer.number=2); {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
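The header part of this is straightforward to picture: whatever hands records to the operators consumes and discards the first N lines before emitting anything; the footer part needs a small look-ahead buffer so the last N lines are never emitted. A minimal Java sketch of that buffering idea follows. It works on a plain BufferedReader with made-up names, not on Hive's actual RecordReader plumbing, and it ignores the multi-split complications a real implementation has to handle.
{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: skip N header lines up front and hold back N footer lines
// in a small look-ahead queue, so they are never emitted.
public final class HeaderFooterSkippingReader {

  public static void copySkipping(BufferedReader in, int headerCount,
                                  int footerCount, StringBuilder out) throws IOException {
    for (int i = 0; i < headerCount; i++) {
      if (in.readLine() == null) {
        return;                       // file shorter than the header
      }
    }
    Deque<String> lookAhead = new ArrayDeque<>();
    String line;
    while ((line = in.readLine()) != null) {
      lookAhead.addLast(line);
      if (lookAhead.size() > footerCount) {
        out.append(lookAhead.removeFirst()).append('\n');   // safe to emit
      }
    }
    // whatever is left in lookAhead is the footer; drop it
  }

  public static void main(String[] args) throws IOException {
    String data = "header\nrow1\nrow2\nfooter1\nfooter2\n";
    StringBuilder out = new StringBuilder();
    copySkipping(new BufferedReader(new StringReader(data)), 1, 2, out);
    System.out.print(out);            // prints row1 and row2 only
  }
}
{code}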
[jira] [Assigned] (HIVE-5839) BytesRefArrayWritable compareTo violates contract
[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-5839: - Assignee: Xuefu Zhang BytesRefArrayWritable compareTo violates contract - Key: HIVE-5839 URL: https://issues.apache.org/jira/browse/HIVE-5839 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Ian Robertson Assignee: Xuefu Zhang BytesRefArrayWritable's compareTo violates the compareTo contract from java.lang.Comparable. Specifically: * The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y. The compareTo implementation on BytesRefArrayWritable does a proper comparison of the sizes of the two instances. However, if the sizes are the same, it proceeds to check whether both arrays have the same contents. If not, it returns 1. This means that if x and y are two BytesRefArrayWritable instances with the same size but different contents, then x.compareTo(y) == 1 and y.compareTo(x) == 1. Additionally, the comparison of contents is order agnostic. This seems wrong, since the order of entries should matter. It is also very inefficient, running at O(n^2), where n is the number of entries. -- This message was sent by Atlassian JIRA (v6.1#6144)
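A compareTo that honors the Comparable contract needs to compare entries pairwise, in order, so that the sign flips when the arguments are swapped, and it runs in O(n) rather than O(n^2). The Java below is a self-contained sketch of such an order-aware comparison over two lists of byte arrays; it illustrates the contract and is not the actual BytesRefArrayWritable patch.
{code}
import java.util.Arrays;
import java.util.List;

public final class ByteArrayListComparator {

  // Lexicographic, order-aware comparison of two lists of byte arrays.
  // Guarantees sgn(compare(x, y)) == -sgn(compare(y, x)) and runs in O(n).
  public static int compare(List<byte[]> x, List<byte[]> y) {
    int sizeCmp = Integer.compare(x.size(), y.size());
    if (sizeCmp != 0) {
      return sizeCmp;                      // shorter list sorts first
    }
    for (int i = 0; i < x.size(); i++) {
      int entryCmp = compareBytes(x.get(i), y.get(i));
      if (entryCmp != 0) {
        return entryCmp;                   // first differing entry decides
      }
    }
    return 0;                              // same size, same entries, same order
  }

  // Unsigned lexicographic comparison of two byte arrays.
  private static int compareBytes(byte[] a, byte[] b) {
    int len = Math.min(a.length, b.length);
    for (int i = 0; i < len; i++) {
      int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
      if (cmp != 0) {
        return cmp;
      }
    }
    return Integer.compare(a.length, b.length);
  }

  public static void main(String[] args) {
    List<byte[]> x = Arrays.asList("a".getBytes(), "b".getBytes());
    List<byte[]> y = Arrays.asList("a".getBytes(), "c".getBytes());
    // Prints "-1 1": opposite signs, unlike the reported 1 and 1.
    System.out.println(compare(x, y) + " " + compare(y, x));
  }
}
{code}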
[jira] [Updated] (HIVE-5839) BytesRefArrayWritable compareTo violates contract
[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5839: -- Attachment: HIVE-5839.patch BytesRefArrayWritable compareTo violates contract - Key: HIVE-5839 URL: https://issues.apache.org/jira/browse/HIVE-5839 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Ian Robertson Assignee: Xuefu Zhang Attachments: HIVE-5839.patch BytesRefArrayWritable's compareTo violates the compareTo contract from java.lang.Comparable. Specifically: * The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y. The compareTo implementation on BytesRefArrayWritable does a proper comparison of the sizes of the two instances. However, if the sizes are the same, it proceeds to check whether both arrays have the same contents. If not, it returns 1. This means that if x and y are two BytesRefArrayWritable instances with the same size but different contents, then x.compareTo(y) == 1 and y.compareTo(x) == 1. Additionally, the comparison of contents is order agnostic. This seems wrong, since the order of entries should matter. It is also very inefficient, running at O(n^2), where n is the number of entries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5839) BytesRefArrayWritable compareTo violates contract
[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-5839: -- Affects Version/s: 0.12.0 Status: Patch Available (was: Open) The first patch adjusts the behavior, with new tests pending. However, without knowing the rationale behind the original implementation, I'd like to see what the change impacts. Let's see how the tests go. BytesRefArrayWritable compareTo violates contract - Key: HIVE-5839 URL: https://issues.apache.org/jira/browse/HIVE-5839 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.12.0, 0.11.0 Reporter: Ian Robertson Assignee: Xuefu Zhang Attachments: HIVE-5839.patch BytesRefArrayWritable's compareTo violates the compareTo contract from java.lang.Comparable. Specifically: * The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y. The compareTo implementation on BytesRefArrayWritable does a proper comparison of the sizes of the two instances. However, if the sizes are the same, it proceeds to check whether both arrays have the same contents. If not, it returns 1. This means that if x and y are two BytesRefArrayWritable instances with the same size but different contents, then x.compareTo(y) == 1 and y.compareTo(x) == 1. Additionally, the comparison of contents is order agnostic. This seems wrong, since the order of entries should matter. It is also very inefficient, running at O(n^2), where n is the number of entries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-5755: - Attachment: HIVE-5755.2.patch Hi [~brocknoland], I did some more tweaking around with the maven flags and also developing a plan for how the dependencies should look. For the most part, things look right. Given that we package all the shims and choose one depending on the hadoop version that is available on the classpath, the dependencies within shims and the dependencies on the shims in other modules look right. The qtest profiles also include the right jars. However, the issue seems to be with the transitive dependencies being pulled in from the hive-it-util. Once I changed the hadoop and hbase dependencies in the hive-it-util target to optional, we get the behavior we expect. The profile flags seem to be taking effect in the right way now. Not sure what exactly changed but I did clear my .m2 cache a few times. Attaching a patch for reference. Please take a look and let me know what you think. Thanks Vikram. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825734#comment-13825734 ] Ashutosh Chauhan commented on HIVE-2055: We don't want the hbase conf and jars to take precedence over the rest of the classpath. So, instead of prepending them (export HADOOP_CLASSPATH=${HBASE_CONF_DIR}:${HADOOP_CLASSPATH} and export HADOOP_CLASSPATH=${x}:${HADOOP_CLASSPATH}), append them (export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CONF_DIR} and export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${x}). Rest of the patch looks good. Hive should add HBase classpath dependencies when available --- Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.10.0 Reporter: sajith v Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch Created an external table in hive , which points to the HBase table. When tried to query a column using the column name in select clause got the following exception : ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000) -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: How do you run single query test(s) after mavenization?
Thanks for the typo alert, Remus. I've changed -Dcase=TestCliDriver to -Dtest=TestCliDriver. But HowToContribute (https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute) still has several instances of ant that should be changed to mvn -- some are simple replacements but others might need additional changes: - Check for new Checkstyle http://checkstyle.sourceforge.net/ violations by running ant checkstyle, ... [mvn checkstyle?] - Define methods within your class whose names begin with test, and call JUnit's many assert methods to verify conditions; these methods will be executed when you run ant test. [simple replacement] - (2 ants) We can run ant test -Dtestcase=TestAbc where TestAbc is the name of the new class. This will test only the new testcase, which will be faster than ant test which tests all testcases. [change ant to mvn twice; also change -Dtestcase to -Dtest?] - Folks should run ant clean package test before selecting *Submit Patch*. [mvn clean package?] The rest of the ant instances are okay because the MVN section afterwards gives the alternative, but should we keep ant or make the replacements? - 9. Now you can run the ant 'thriftif' target ... - 11. ant thriftif -Dthrift.home=... - 15. ant thriftif - 18. ant clean package - The maven equivalent of ant thriftif is: mvn clean install -Pthriftif -DskipTests -Dthrift.home=/usr/local -- Lefty On Mon, Nov 18, 2013 at 9:35 AM, Remus Rusanu rem...@microsoft.com wrote: Nevermind, discovered https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-HowdoIruntheclientpositive%2Fclientnegativeunittests%3F cd itests/qtest mvn test -Dtest=TestCliDriver I still get failures, but at least now I can investigate Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 31.9 sec FAILURE! - in org.apache.hadoop.hive.cli.TestCliDriver initializationError(org.apache.hadoop.hive.cli.TestCliDriver) Time elapsed: 0.005 sec FAILURE! java.lang.AssertionError: null at org.apache.hadoop.hive.ql.QTestUtil.getHdfsUriString(QTestUtil.java:288) at org.apache.hadoop.hive.ql.QTestUtil.convertPathsFromWindowsToHdfs(QTestUtil.java:276) at org.apache.hadoop.hive.ql.QTestUtil.initConf(QTestUtil.java:233) at org.apache.hadoop.hive.ql.QTestUtil.init(QTestUtil.java:317) at org.apache.hadoop.hive.cli.TestCliDriver.clinit(TestCliDriver.java:39) From: Remus Rusanu [mailto:rem...@microsoft.com] Sent: Monday, November 18, 2013 2:30 PM To: dev@hive.apache.org Cc: Ashutosh Chauhan; Tony Murphy (HDINSIGHT); Eric Hanson (SQL SERVER) Subject: How do you run single query test(s) after mavenization? I'm trying to run as per the updated Contributing guide https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute : mvn test -Dtest=TestCliDriver -Dqfile=vectorized_mapjoin.q (The guide actually recommends -Dcase=TestCliDriver but using -Dcase executes all tests. In fact -Dtest=... is recommended just a few lines above, so I guess -Dcase=... is a typo) But the run succeeds w/o actually executing any query test (I tried removing -Dqfile= and it does not make any difference). I attached the output of the mvn test -Dtest=TestCliDriver run, if it sheds any light. Thanks, ~Remus
[jira] [Updated] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-2055: --- Attachment: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch Updating patch according to Ashutosh's comments. Hive should add HBase classpath dependencies when available --- Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.10.0 Reporter: sajith v Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch Created an external table in hive , which points to the HBase table. When tried to query a column using the column name in select clause got the following exception : ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5565) Limit Hive decimal type maximum precision and scale to 38
[ https://issues.apache.org/jira/browse/HIVE-5565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825749#comment-13825749 ] Hive QA commented on HIVE-5565: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614425/HIVE-5565.3.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4617 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_when {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/344/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/344/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12614425 Limit Hive decimal type maximum precision and scale to 38 - Key: HIVE-5565 URL: https://issues.apache.org/jira/browse/HIVE-5565 Project: Hive Issue Type: Task Components: Types Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-5565.1.patch, HIVE-5565.2.patch, HIVE-5565.3.patch, HIVE-5565.patch With HIVE-3976, the maximum precision is set to 65, and the maximum scale to 30. After discussing with several folks in the community, it was determined that 38 as a maximum for both precision and scale is probably sufficient, in addition to the potential performance boost that might become possible for some implementations. This task is to make such a change. The change is expected to be trivial, but it may impact many test cases. The reason for a separate JIRA is that the patch in HIVE-3976 is already in good shape. Rather than destabilizing a bigger patch, a dedicated patch will facilitate both reviews. The wiki document will be updated shortly. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5755) Fix hadoop2 execution environment
[ https://issues.apache.org/jira/browse/HIVE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825752#comment-13825752 ] Brock Noland commented on HIVE-5755: Hi, Thanks again for looking at this! The patch seems to solve the issue for itest, but the main project is still including hadoop-core. For example, if I apply the patch and execute: {noformat} mvn dependency:tree -Phadoop-2 {noformat} I get: {noformat} ... [INFO] [INFO] Building Hive HCatalog Pig Adapter 0.13.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-dependency-plugin:2.1:tree (default-cli) @ hive-hcatalog-pig-adapter --- [INFO] org.apache.hive.hcatalog:hive-hcatalog-pig-adapter:jar:0.13.0-SNAPSHOT [INFO] +- org.apache.hive.hcatalog:hive-hcatalog-core:jar:0.13.0-SNAPSHOT:compile [INFO] | +- org.apache.hadoop:hadoop-core:jar:1.2.1:compile ... {noformat} Note that hadoop-core is still included. I think we need to remove the active-by-default business. Fix hadoop2 execution environment - Key: HIVE-5755 URL: https://issues.apache.org/jira/browse/HIVE-5755 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5755.1.patch, HIVE-5755.2.patch, HIVE-5755.try.patch It looks like the hadoop2 execution environment isn't exactly correct post mavenization. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825758#comment-13825758 ] Ashutosh Chauhan commented on HIVE-2055: +1 Hive should add HBase classpath dependencies when available --- Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.10.0 Reporter: sajith v Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch Created an external table in hive , which points to the HBase table. When tried to query a column using the column name in select clause got the following exception : ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-2055) Hive should add HBase classpath dependencies when available
[ https://issues.apache.org/jira/browse/HIVE-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-2055: --- Assignee: Nick Dimiduk Hive should add HBase classpath dependencies when available --- Key: HIVE-2055 URL: https://issues.apache.org/jira/browse/HIVE-2055 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.10.0 Reporter: sajith v Assignee: Nick Dimiduk Attachments: 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, 0001-HIVE-2055-include-hbase-dependencies-in-launch-scrip.patch, HIVE-2055.patch Created an external table in hive , which points to the HBase table. When tried to query a column using the column name in select clause got the following exception : ( java.lang.ClassNotFoundException: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat), errorCode:12, SQLState:42000) -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 15654: Rewrite Trim and Pad UDFs based on GenericUDF
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15654/ --- Review request for hive, Ashutosh Chauhan, Carl Steinbach, and Jitendra Pandey. Bugs: HIVE-5829 https://issues.apache.org/jira/browse/HIVE-5829 Repository: hive-git Description --- Rewrite the UDFS *pads and *trim using GenericUDF. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 5eb321c ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 7c1ab0d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLTrim.java dc00cf9 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLpad.java d1da19a ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRTrim.java 2bcc5fa ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRpad.java 9652ce2 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTrim.java 490886d ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBasePad.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseTrim.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLTrim.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLpad.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRTrim.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRpad.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrim.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizationContext.java 3f3e67f ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFLTrim.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFLpad.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFRTrim.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFRpad.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFTrim.java PRE-CREATION Diff: https://reviews.apache.org/r/15654/diff/ Testing --- Thanks, Mohammad Islam
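Judging by the new file names (GenericUDFBaseTrim and GenericUDFBasePad plus one thin subclass per function), the rewrite follows a template-method layout: a base class owns the shared argument and null handling, and each subclass supplies only the actual trimming or padding step. The plain-Java sketch below shows that shape for the trim family with made-up class names; the real classes extend Hive's GenericUDF and work through ObjectInspectors and DeferredObjects, which are omitted here.
{code}
// Template-method sketch: shared handling in a base class, one small
// subclass per function.  Illustrative names only, not the patch's classes.
abstract class BaseTrimSketch {
  String evaluate(String input) {
    if (input == null) {
      return null;                 // shared null handling lives in the base
    }
    return performOp(input);
  }
  protected abstract String performOp(String input);
}

class LTrimSketch extends BaseTrimSketch {
  @Override
  protected String performOp(String s) {
    int i = 0;
    while (i < s.length() && s.charAt(i) == ' ') {
      i++;
    }
    return s.substring(i);
  }
}

class RTrimSketch extends BaseTrimSketch {
  @Override
  protected String performOp(String s) {
    int i = s.length();
    while (i > 0 && s.charAt(i - 1) == ' ') {
      i--;
    }
    return s.substring(0, i);
  }
}

class TrimSketch extends BaseTrimSketch {
  @Override
  protected String performOp(String s) {
    return new RTrimSketch().performOp(new LTrimSketch().performOp(s));
  }
}

public final class TrimSketchDemo {
  public static void main(String[] args) {
    System.out.println("[" + new TrimSketch().evaluate("  hi  ") + "]");   // [hi]
    System.out.println("[" + new LTrimSketch().evaluate("  hi  ") + "]");  // [hi  ]
    System.out.println("[" + new RTrimSketch().evaluate("  hi  ") + "]");  // [  hi]
  }
}
{code}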
[jira] [Updated] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated HIVE-5829: Status: Patch Available (was: Open) Rewrite Trim and Pad UDFs based on GenericUDF - Key: HIVE-5829 URL: https://issues.apache.org/jira/browse/HIVE-5829 Project: Hive Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5829.1.patch This JIRA includes following UDFs: 1. trim() 2. ltrim() 3. rtrim() 4. lpad() 5. rpad() -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated HIVE-5829: Attachment: HIVE-5829.1.patch Also updated to RB: https://reviews.apache.org/r/15654/ Rewrite Trim and Pad UDFs based on GenericUDF - Key: HIVE-5829 URL: https://issues.apache.org/jira/browse/HIVE-5829 Project: Hive Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5829.1.patch This JIRA includes following UDFs: 1. trim() 2. ltrim() 3. rtrim() 4. lpad() 5. rpad() -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5356) Move arithmatic UDFs to generic UDF implementations
[ https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825781#comment-13825781 ] Hive QA commented on HIVE-5356: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614435/HIVE-5356.11.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4665 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_when {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/345/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/345/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12614435 Move arithmatic UDFs to generic UDF implementations --- Key: HIVE-5356 URL: https://issues.apache.org/jira/browse/HIVE-5356 Project: Hive Issue Type: Task Components: UDF Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5356.1.patch, HIVE-5356.10.patch, HIVE-5356.11.patch, HIVE-5356.2.patch, HIVE-5356.3.patch, HIVE-5356.4.patch, HIVE-5356.5.patch, HIVE-5356.6.patch, HIVE-5356.7.patch, HIVE-5356.8.patch, HIVE-5356.9.patch Currently, all of the arithmetic operators, such as add/sub/mult/div, are implemented as old-style UDFs and java reflection is used to determine the return type TypeInfos/ObjectInspectors, based on the return type of the evaluate() method chosen for the expression. This works fine for types that don't have type params. Hive decimal type participates in these operations just like int or double. Different from double or int, however, decimal has precision and scale, which cannot be determined by just looking at the return type (decimal) of the UDF evaluate() method, even though the operands have certain precision/scale. With the default of decimal without precision/scale, then (10, 0) will be the type params. This is certainly not desirable. To solve this problem, all of the arithmetic operators would need to be implemented as GenericUDFs, which allow returning ObjectInspector during the initialize() method. The object inspectors returned can carry type params, from which the exact return type can be determined. It's worth mentioning that, for user UDF implemented in non-generic way, if the return type of the chosen evaluate() method is decimal, the return type actually has (10,0) as precision/scale, which might not be desirable. This needs to be documented. This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit the scope of review. The remaining ones will be covered under HIVE-5706. -- This message was sent by Atlassian JIRA (v6.1#6144)
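The practical difference described above is that an old-style UDF can only say "this returns decimal", so Hive falls back to the default (10, 0), whereas a GenericUDF can compute the result's precision and scale from the operand types inside initialize(). A small sketch of that computation for addition follows. It uses the common SQL-style rule (result scale = max of the operand scales, integer digits = max of the operand integer digits + 1) and deliberately ignores capping at a maximum precision; the exact rules adopted by the patch may differ.
{code}
// Sketch of deriving a decimal result type from operand types, the way a
// GenericUDF can in initialize().  The addition rule shown is the common
// SQL-style one and is illustrative, not necessarily the patch's exact rule.
public final class DecimalAddTypeSketch {

  static final class DecimalType {
    final int precision;   // total digits
    final int scale;       // digits after the decimal point
    DecimalType(int precision, int scale) {
      this.precision = precision;
      this.scale = scale;
    }
    @Override
    public String toString() {
      return "decimal(" + precision + "," + scale + ")";
    }
  }

  static DecimalType addResultType(DecimalType a, DecimalType b) {
    int scale = Math.max(a.scale, b.scale);
    int intDigits = Math.max(a.precision - a.scale, b.precision - b.scale) + 1;
    return new DecimalType(intDigits + scale, scale);
  }

  public static void main(String[] args) {
    // decimal(10,2) + decimal(7,4)  ->  decimal(13,4)
    System.out.println(addResultType(new DecimalType(10, 2), new DecimalType(7, 4)));
  }
}
{code}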
[jira] [Commented] (HIVE-5839) BytesRefArrayWritable compareTo violates contract
[ https://issues.apache.org/jira/browse/HIVE-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825820#comment-13825820 ] Hive QA commented on HIVE-5839: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614469/HIVE-5839.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4617 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default org.apache.hadoop.hive.ql.io.TestRCFile.testWriteAndPartialRead {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/347/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/347/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12614469 BytesRefArrayWritable compareTo violates contract - Key: HIVE-5839 URL: https://issues.apache.org/jira/browse/HIVE-5839 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.11.0, 0.12.0 Reporter: Ian Robertson Assignee: Xuefu Zhang Attachments: HIVE-5839.patch BytesRefArrayWritable's compareTo violates the compareTo contract from java.lang.Comparable. Specifically: * The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y. The compareTo implementation on BytesRefArrayWritable does a proper comparison of the sizes of the two instances. However, if the sizes are the same, it proceeds to check whether both arrays have the same contents. If not, it returns 1. This means that if x and y are two BytesRefArrayWritable instances with the same size but different contents, then x.compareTo(y) == 1 and y.compareTo(x) == 1. Additionally, the comparison of contents is order agnostic. This seems wrong, since the order of entries should matter. It is also very inefficient, running at O(n^2), where n is the number of entries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5843) Transaction manager for Hive
[ https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-5843: - Attachment: HiveTransactionManagerDetailedDesign (1).pdf Design doc Transaction manager for Hive Key: HIVE-5843 URL: https://issues.apache.org/jira/browse/HIVE-5843 Project: Hive Issue Type: Sub-task Reporter: Alan Gates Assignee: Alan Gates Attachments: HiveTransactionManagerDetailedDesign (1).pdf As part of the ACID work proposed in HIVE-5317 a transaction manager is required. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5844) dynamic_partition_skip_default.q test fails on trunk
Prasanth J created HIVE-5844: Summary: dynamic_partition_skip_default.q test fails on trunk Key: HIVE-5844 URL: https://issues.apache.org/jira/browse/HIVE-5844 Project: Hive Issue Type: Bug Reporter: Prasanth J HIVE-5369 changes explain extended output to add statistics information. This breaks dynamic_partition_skip_default.q file on trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5844) dynamic_partition_skip_default.q test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5844: - Attachment: HIVE-5844.1.patch.txt Regenerated golden file for dynamic_partition_skip_default.q file. dynamic_partition_skip_default.q test fails on trunk Key: HIVE-5844 URL: https://issues.apache.org/jira/browse/HIVE-5844 Project: Hive Issue Type: Bug Reporter: Prasanth J Attachments: HIVE-5844.1.patch.txt HIVE-5369 changes explain extended output to add statistics information. This breaks dynamic_partition_skip_default.q file on trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5400) Allow admins to disable compile and other commands
[ https://issues.apache.org/jira/browse/HIVE-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825887#comment-13825887 ] Thejas M Nair commented on HIVE-5400: - [~le...@hortonworks.com] Yes, that section sounds good. It belongs to the subsection Hive Client Security. Thanks for updating the docs! Allow admins to disable compile and other commands -- Key: HIVE-5400 URL: https://issues.apache.org/jira/browse/HIVE-5400 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-5400.patch, HIVE-5400.patch, HIVE-5400.patch From here: https://issues.apache.org/jira/browse/HIVE-5253?focusedCommentId=13782220page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13782220 I think we should afford admins who want to disable this functionality the ability to do so. Since such admins might want to disable other commands such as add or dfs, it wouldn't be much trouble to allow them to do this as well. For example we could have a configuration option hive.available.commands (or similar) which specified add,set,delete,reset, etc by default. Then check this value in CommandProcessorFactory. It would probably make sense to add this property to the restrict list. -- This message was sent by Atlassian JIRA (v6.1#6144)
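The check proposed in the description above is simple to picture: read the comma-separated property, normalize it into a set, and refuse to serve any command not in the set. The sketch below is plain Java with a hypothetical AvailableCommands helper; it is not the code that went into CommandProcessorFactory, just the shape of the check against a property like hive.available.commands.
{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of gating commands on a comma-separated allow-list such as the
// proposed hive.available.commands.  Hypothetical helper, not the actual
// CommandProcessorFactory change.
public final class AvailableCommands {

  private final Set<String> allowed;

  public AvailableCommands(String commaSeparated) {
    allowed = Arrays.stream(commaSeparated.split(","))
        .map(s -> s.trim().toLowerCase(Locale.ROOT))
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toCollection(HashSet::new));
  }

  // Throws if an admin has not allowed the command; a caller would surface
  // this as a normal error to the client.
  public void check(String command) {
    if (!allowed.contains(command.toLowerCase(Locale.ROOT))) {
      throw new IllegalArgumentException(
          "Command '" + command + "' is disabled by the administrator");
    }
  }

  public static void main(String[] args) {
    AvailableCommands cmds = new AvailableCommands("set,reset,dfs,add,list,delete");
    cmds.check("set");                       // allowed, no error
    try {
      cmds.check("compile");                 // not on the allow-list
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}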
[jira] [Created] (HIVE-5845) CTAS failed on vectorized code path
Ashutosh Chauhan created HIVE-5845: -- Summary: CTAS failed on vectorized code path Key: HIVE-5845 URL: https://issues.apache.org/jira/browse/HIVE-5845 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5515) Writing to an HBase table throws IllegalArgumentException, failing job submission
[ https://issues.apache.org/jira/browse/HIVE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825923#comment-13825923 ] Sushanth Sowmyan commented on HIVE-5515: Hi Viraj, I've done some more tests, and am +1 with the solution. A couple of nitpicks though, apart from my request for tests. a) Please do not use conf directly, stick with getConf() b) Please try to limit the length of individual lines, having lines with 220 chars make for more unreadable code - I believe hive actually has a stylecheck rule for length, but I forget what's the limit (checkstyle.xml says 2000, but that's ridiculous) - I would try to keep it within about 80 if possible, Writing to an HBase table throws IllegalArgumentException, failing job submission - Key: HIVE-5515 URL: https://issues.apache.org/jira/browse/HIVE-5515 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Environment: Hadoop2, Hive 0.12.0, HBase-0.96RC Reporter: Nick Dimiduk Assignee: Viraj Bhat Labels: hbase Fix For: 0.13.0 Attachments: HIVE-5515.patch Inserting data into HBase table via hive query fails with the following message: {noformat} $ hive -e FROM pgc INSERT OVERWRITE TABLE pagecounts_hbase SELECT pgc.* WHERE rowkey LIKE 'en/q%' LIMIT 10; ... Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number java.lang.IllegalArgumentException: Property value must not be null at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.hadoop.conf.Configuration.set(Configuration.java:810) at org.apache.hadoop.conf.Configuration.set(Configuration.java:792) at org.apache.hadoop.hive.ql.exec.Utilities.copyTableJobPropertiesToConf(Utilities.java:2002) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:947) at org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(HiveOutputFormatImpl.java:67) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at 
org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:731) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at
[jira] [Commented] (HIVE-5845) CTAS failed on vectorized code path
[ https://issues.apache.org/jira/browse/HIVE-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825924#comment-13825924 ] Ashutosh Chauhan commented on HIVE-5845: Stack-trace: {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to [Ljava.lang.Object; at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldData(StandardStructObjectInspector.java:173) at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1349) at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:1962) at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:78) at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:159) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) {code} CTAS failed on vectorized code path --- Key: HIVE-5845 URL: https://issues.apache.org/jira/browse/HIVE-5845 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan Following query fails: create table store_sales_2 stored as orc as select * from alltypesorc; -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5846) Analyze command fails with vectorization on
Ashutosh Chauhan created HIVE-5846: -- Summary: Analyze command fails with vectorization on Key: HIVE-5846 URL: https://issues.apache.org/jira/browse/HIVE-5846 Project: Hive Issue Type: Bug Reporter: Ashutosh Chauhan analyze table alltypesorc compute statistics; fails -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5847) DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
Jason Dere created HIVE-5847: Summary: DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal Key: HIVE-5847 URL: https://issues.apache.org/jira/browse/HIVE-5847 Project: Hive Issue Type: Bug Components: JDBC Reporter: Jason Dere Assignee: Jason Dere column_size, decimal_digits, num_prec_radix should be set appropriately based on the type qualifiers. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5848) Hive's type promotion isn't correct
Xuefu Zhang created HIVE-5848: - Summary: Hive's type promotion isn't correct Key: HIVE-5848 URL: https://issues.apache.org/jira/browse/HIVE-5848 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0, 0.11.0, 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang When dealing with union all, arithmetic operators, and other places where type promotion is needed or a common type is determined, Hive promotes non-exact data types (float and double) to HiveDecimal. However, HiveDecimal is an exact type. Promoting a non-exact type to an exact type gives the user a false impression that the data is exact. For instance, the expression 3.14 + 3.14BD produces a HiveDecimal number 6.28. However, the two operands are not equivalent, as the left operand is not exact. MySQL in this case produces a double 6.28, which is more reasonable. The problem was discovered in HIVE-3976. HIVE-5356 solves the problem for arithmetic operators, but there are more places where the problem exists. For instance, HIVE-5825 manifested the same issue. The purpose of this JIRA is to revisit type casting and type promotion to make Hive's behavior more in line with the standard and other major database implementations. -- This message was sent by Atlassian JIRA (v6.1#6144)
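The "false impression of exactness" is easy to demonstrate outside Hive with plain Java: the double literal 3.14 is already a binary approximation, so wrapping it in an exact decimal type afterwards does not recover an exact 3.14. The snippet below illustrates that underlying numeric point; it is not Hive code and does not use HiveDecimal.
{code}
import java.math.BigDecimal;

// Plain-Java illustration of why promoting an inexact double to an exact
// decimal type is misleading: the approximation has already happened.
public final class FalseExactnessDemo {
  public static void main(String[] args) {
    double d = 3.14;                                 // binary approximation of 3.14
    System.out.println(new BigDecimal(d));           // 3.14000000000000012490009027033...
    System.out.println(BigDecimal.valueOf(d));       // 3.14 (re-rounded via toString)
    System.out.println(d + 3.14);                    // prints 6.28, but still inexact underneath
    System.out.println(new BigDecimal("3.14").add(new BigDecimal("3.14"))); // exact 6.28
  }
}
{code}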
[jira] [Commented] (HIVE-5515) Writing to an HBase table throws IllegalArgumentException, failing job submission
[ https://issues.apache.org/jira/browse/HIVE-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825963#comment-13825963 ] Viraj Bhat commented on HIVE-5515: -- Hi Sushanth, Thanks for your comments about the unit test case. I think the current test cases do not exercise the path where they read the metadata from the metastore. Also about fixing the patch. Let me use conf and also limit the individual lines to 80. I will repost it as soon as possible. Viraj Writing to an HBase table throws IllegalArgumentException, failing job submission - Key: HIVE-5515 URL: https://issues.apache.org/jira/browse/HIVE-5515 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Environment: Hadoop2, Hive 0.12.0, HBase-0.96RC Reporter: Nick Dimiduk Assignee: Viraj Bhat Labels: hbase Fix For: 0.13.0 Attachments: HIVE-5515.patch Inserting data into HBase table via hive query fails with the following message: {noformat} $ hive -e FROM pgc INSERT OVERWRITE TABLE pagecounts_hbase SELECT pgc.* WHERE rowkey LIKE 'en/q%' LIMIT 10; ... Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number java.lang.IllegalArgumentException: Property value must not be null at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.hadoop.conf.Configuration.set(Configuration.java:810) at org.apache.hadoop.conf.Configuration.set(Configuration.java:792) at org.apache.hadoop.hive.ql.exec.Utilities.copyTableJobPropertiesToConf(Utilities.java:2002) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:947) at org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(HiveOutputFormatImpl.java:67) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020) at 
org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:731) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at
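The stack trace above ends in Configuration.set() rejecting a null value via Preconditions, reached from Utilities.copyTableJobPropertiesToConf(). The sketch below only illustrates the null-tolerant copy idea; the class and method names are invented for illustration and this is not the actual HIVE-5515 fix.
{code}
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

public class TablePropsCopy {
  // Illustrative helper: copy table-level job properties into the job conf,
  // skipping null keys/values so Configuration.set() never sees a null value.
  public static void copyNonNullProps(Map<String, String> tableJobProperties, Configuration conf) {
    if (tableJobProperties == null) {
      return;
    }
    for (Map.Entry<String, String> e : tableJobProperties.entrySet()) {
      if (e.getKey() != null && e.getValue() != null) {
        conf.set(e.getKey(), e.getValue());
      }
    }
  }
}
{code}
Whether the real fix should skip null values or prevent them from being written into the table job properties in the first place is a separate design question.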
Re: Review Request 15649: HIVE-5842 - Fix issues with new paths to jar in hcatalog
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15649/#review29087 --- +1 (with minor comments) hcatalog/bin/hcat https://reviews.apache.org/r/15649/#comment56209 Same as above: Create a new variable and use it. hcatalog/bin/hcat.py https://reviews.apache.org/r/15649/#comment56208 If possible, creating a variable and using it would make this much better. - Mohammad Islam On Nov. 18, 2013, 4:49 p.m., Brock Noland wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15649/ --- (Updated Nov. 18, 2013, 4:49 p.m.) Review request for hive. Bugs: HIVE-5842 https://issues.apache.org/jira/browse/HIVE-5842 Repository: hive-git Description --- Fixes path issues with hcatalog in maven tarball post mavenization. Also removes a comical amount of trailing whitespace in hcat scripts. Diffs - hcatalog/bin/hcat b4d4226 hcatalog/bin/hcat.py 53fc387 hcatalog/bin/hcat_server.py 51a11e6 hcatalog/bin/hcat_server.sh bf3c3f1 hcatalog/bin/hcatcfg.py 47a56d8 hcatalog/webhcat/svr/src/main/bin/webhcat_config.sh 6b0b578 hcatalog/webhcat/svr/src/main/bin/webhcat_server.sh 600c16d Diff: https://reviews.apache.org/r/15649/diff/ Testing --- Tested hcat scripts manually Thanks, Brock Noland
[jira] [Updated] (HIVE-5844) dynamic_partition_skip_default.q test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5844: - Priority: Trivial (was: Major) dynamic_partition_skip_default.q test fails on trunk Key: HIVE-5844 URL: https://issues.apache.org/jira/browse/HIVE-5844 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Priority: Trivial Attachments: HIVE-5844.1.patch.txt HIVE-5369 changes explain extended output to add statistics information. This breaks dynamic_partition_skip_default.q file on trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5844) dynamic_partition_skip_default.q test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826008#comment-13826008 ] Prasanth J commented on HIVE-5844: -- I just tested locally on Mac OSX. It seems to run fine after this patch. [~ashutoshc] is it failing on Mac or other OS? dynamic_partition_skip_default.q test fails on trunk Key: HIVE-5844 URL: https://issues.apache.org/jira/browse/HIVE-5844 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Attachments: HIVE-5844.1.patch.txt HIVE-5369 changes explain extended output to add statistics information. This breaks dynamic_partition_skip_default.q file on trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HIVE-5844) dynamic_partition_skip_default.q test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J reassigned HIVE-5844: Assignee: Prasanth J dynamic_partition_skip_default.q test fails on trunk Key: HIVE-5844 URL: https://issues.apache.org/jira/browse/HIVE-5844 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Trivial Attachments: HIVE-5844.1.patch.txt HIVE-5369 changes explain extended output to add statistics information. This breaks dynamic_partition_skip_default.q file on trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826009#comment-13826009 ] Owen O'Malley commented on HIVE-5317: - Ed, If you don't use the insert, update, and delete commands, they won't impact your use of Hive. On the other hand, there are a wide number of users who need ACID and updates. Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: InsertUpdatesinHive.pdf Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the form of the queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (eg. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day a small set (up to 100k rows) of records need to be deleted for regulatory compliance. * Once an hour a log of transactions is exported from a RDBS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5807) Implement vectorization support for IF conditional expression for string inputs
[ https://issues.apache.org/jira/browse/HIVE-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-5807: -- Attachment: hive5807.string-IF-and-fixes.patch.txt Adds support for IF on strings, and related tests. Implement vectorization support for IF conditional expression for string inputs --- Key: HIVE-5807 URL: https://issues.apache.org/jira/browse/HIVE-5807 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: hive5807.string-IF-and-fixes.patch.txt -- This message was sent by Atlassian JIRA (v6.1#6144)
Review Request 15663: Hive should be able to skip header and footer rows when reading data file for a table
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15663/ --- Review request for hive and Thejas Nair. Repository: hive-git Description --- Hive should be able to skip header and footer rows when reading data file for a table (I am uploading this on behalf of Shuaishuai Nie since he's not in the office) Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 32ab3d8 data/files/header_footer_table_1/0001.txt PRE-CREATION data/files/header_footer_table_1/0002.txt PRE-CREATION data/files/header_footer_table_1/0003.txt PRE-CREATION data/files/header_footer_table_2/2012/01/01/0001.txt PRE-CREATION data/files/header_footer_table_2/2012/01/02/0002.txt PRE-CREATION data/files/header_footer_table_2/2012/01/03/0003.txt PRE-CREATION itests/qtest/pom.xml a453d8a ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 5abcfc1 ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java dd5cb6b ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 0ec6e63 ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java 85dd975 ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 0686d9b ql/src/test/queries/clientpositive/file_with_header_footer.q PRE-CREATION ql/src/test/results/clientpositive/file_with_header_footer.q.out PRE-CREATION Diff: https://reviews.apache.org/r/15663/diff/ Testing --- Thanks, Eric Hanson
Re: Review Request 15663: Hive should be able to skip header and footer rows when reading data file for a table
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15663/ --- (Updated Nov. 19, 2013, 1:31 a.m.) Review request for hive and Thejas Nair. Bugs: HIVE-5795 https://issues.apache.org/jira/browse/HIVE-5795 Repository: hive-git Description --- Hive should be able to skip header and footer rows when reading data file for a table (I am uploading this on behalf of Shuaishuai Nie since he's not in the office) Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 32ab3d8 data/files/header_footer_table_1/0001.txt PRE-CREATION data/files/header_footer_table_1/0002.txt PRE-CREATION data/files/header_footer_table_1/0003.txt PRE-CREATION data/files/header_footer_table_2/2012/01/01/0001.txt PRE-CREATION data/files/header_footer_table_2/2012/01/02/0002.txt PRE-CREATION data/files/header_footer_table_2/2012/01/03/0003.txt PRE-CREATION itests/qtest/pom.xml a453d8a ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 5abcfc1 ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java dd5cb6b ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 0ec6e63 ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java 85dd975 ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 0686d9b ql/src/test/queries/clientpositive/file_with_header_footer.q PRE-CREATION ql/src/test/results/clientpositive/file_with_header_footer.q.out PRE-CREATION Diff: https://reviews.apache.org/r/15663/diff/ Testing --- Thanks, Eric Hanson
[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826054#comment-13826054 ] Eric Hanson commented on HIVE-5795: --- Code review at: https://reviews.apache.org/r/15663/ Shuaishuai is not here today so I'm uploading it for him. Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5795.1.patch Hive should be able to skip header and footer lines when reading a data file for a table. In this way, users don't need to preprocess data generated by another application with a header or footer and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties (skip.header.number=1, skip.footer.number=2); {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
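To make the header/footer-skipping behavior in the description concrete, here is a plain-Java sketch that drops the first N lines of a stream and withholds the last M; it only illustrates the buffering idea, not the code in HIVE-5795.1.patch, and the class name is invented.
{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;

public class HeaderFooterSkippingReader {
  private final BufferedReader in;
  private final Deque<String> footerBuffer;   // holds up to footerCount pending lines
  private final int footerCount;

  public HeaderFooterSkippingReader(BufferedReader in, int headerCount, int footerCount)
      throws IOException {
    this.in = in;
    this.footerCount = footerCount;
    this.footerBuffer = new ArrayDeque<String>(footerCount + 1);
    for (int i = 0; i < headerCount; i++) {
      in.readLine();                           // drop header lines up front
    }
  }

  // Returns the next data line, or null at end of input; the last footerCount
  // lines never leave the buffer, so they are effectively skipped.
  public String readLine() throws IOException {
    String line;
    while ((line = in.readLine()) != null) {
      footerBuffer.addLast(line);
      if (footerBuffer.size() > footerCount) {
        return footerBuffer.removeFirst();
      }
    }
    return null;
  }

  public static void main(String[] args) throws IOException {
    String data = "header\nrow1\nrow2\nfooter1\nfooter2\n";
    HeaderFooterSkippingReader r =
        new HeaderFooterSkippingReader(new BufferedReader(new StringReader(data)), 1, 2);
    for (String l = r.readLine(); l != null; l = r.readLine()) {
      System.out.println(l);                   // prints row1, row2
    }
  }
}
{code}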
[jira] [Updated] (HIVE-5635) WebHCatJTShim23 ignores security/user context
[ https://issues.apache.org/jira/browse/HIVE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5635: - Status: Open (was: Patch Available) WebHCatJTShim23 ignores security/user context - Key: HIVE-5635 URL: https://issues.apache.org/jira/browse/HIVE-5635 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5635.2.patch, HIVE-5635.patch WebHCatJTShim23 takes a UserGroupInformation object as an argument (which represents the user making the call to WebHCat, or the doAs user) but ignores it. WebHCatJTShim20S uses the UserGroupInformation. This is inconsistent and may be a security hole, because with Hadoop 2 the methods on WebHCatJTShim are likely running with 'hcat' as the user context. -- This message was sent by Atlassian JIRA (v6.1#6144)
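The doAs pattern the report alludes to, in rough form; the helper below is hypothetical and only shows how the caller's UserGroupInformation would wrap an operation so it does not run as the 'hcat' process user.
{code}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

public class DoAsExample {
  // Illustrative only: run an operation under the caller's UGI instead of the
  // process user, which is the gap described for WebHCatJTShim23.
  public static <T> T runAsCaller(UserGroupInformation callerUgi,
                                  final PrivilegedExceptionAction<T> action)
      throws Exception {
    if (callerUgi == null) {
      // Hypothetical fallback; the real shim would decide this differently.
      callerUgi = UserGroupInformation.getCurrentUser();
    }
    return callerUgi.doAs(action);
  }
}
{code}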
[jira] [Updated] (HIVE-5635) WebHCatJTShim23 ignores security/user context
[ https://issues.apache.org/jira/browse/HIVE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5635: - Status: Patch Available (was: Open) WebHCatJTShim23 ignores security/user context - Key: HIVE-5635 URL: https://issues.apache.org/jira/browse/HIVE-5635 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5635.2.patch, HIVE-5635.patch WebHCatJTShim23 takes a UserGroupInformation object as an argument (which represents the user making the call to WebHCat, or the doAs user) but ignores it. WebHCatJTShim20S uses the UserGroupInformation. This is inconsistent and may be a security hole, because with Hadoop 2 the methods on WebHCatJTShim are likely running with 'hcat' as the user context. -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Hive 0.13 SNAPSHOT build fail
Hi Jin Jie, Thanks for your reply. mvn install -DskipTests run successfully. Then mvn clean package -DskipTests -Pdist also successfully. Thanks, Jack [INFO] Reactor Summary: [INFO] [INFO] Hive .. SUCCESS [4.293s] [INFO] Hive Ant Utilities SUCCESS [1.981s] [INFO] Hive Shims Common . SUCCESS [1.328s] [INFO] Hive Shims 0.20 ... SUCCESS [1.414s] [INFO] Hive Shims Secure Common .. SUCCESS [0.866s] [INFO] Hive Shims 0.20S .. SUCCESS [0.201s] [INFO] Hive Shims 0.23 ... SUCCESS [0.774s] [INFO] Hive Shims SUCCESS [1.235s] [INFO] Hive Common ... SUCCESS [9.325s] [INFO] Hive Serde SUCCESS [0.530s] [INFO] Hive Metastore SUCCESS [3.954s] [INFO] Hive Query Language ... SUCCESS [6.819s] [INFO] Hive Service .. SUCCESS [0.209s] [INFO] Hive JDBC . SUCCESS [0.105s] [INFO] Hive Beeline .. SUCCESS [0.172s] [INFO] Hive CLI .. SUCCESS [0.325s] [INFO] Hive Contrib .. SUCCESS [0.284s] [INFO] Hive HBase Handler SUCCESS [0.531s] [INFO] Hive HCatalog . SUCCESS [0.170s] [INFO] Hive HCatalog Core SUCCESS [0.187s] [INFO] Hive HCatalog Pig Adapter . SUCCESS [0.140s] [INFO] Hive HCatalog Server Extensions ... SUCCESS [0.191s] [INFO] Hive HCatalog Webhcat Java Client . SUCCESS [0.202s] [INFO] Hive HCatalog Webhcat . SUCCESS [5.702s] [INFO] Hive HCatalog HBase Storage Handler ... SUCCESS [0.237s] [INFO] Hive HWI .. SUCCESS [0.159s] [INFO] Hive ODBC . SUCCESS [0.073s] [INFO] Hive Shims Aggregator . SUCCESS [0.047s] [INFO] Hive TestUtils SUCCESS [0.097s] [INFO] Hive Packaging SUCCESS [0.123s] [INFO] [INFO] BUILD SUCCESS 2013/11/18 Jie Jin hellojin...@gmail.com try mvn install -DskipTests first Best Regards Jin Jie Sent from my mobile device. On Nov 18, 2013 8:08 PM, Meng QingPing mqingp...@gmail.com wrote: Hi, I want to inetegrate Hive with HBase 0.96 and Hadoop 2.2. I found Hive 0.13 support them. So I checkout the 0.13 snapshot just now, but get bellow error when build. All sub components built success, but fail at Hive task. Can someone help resolve it? Thanks. [mqingping@LDEV-D042 hive]$ mvn clean -e package assembly:assembly -DskipTests .. [INFO] org/apache/hadoop/hive/shims/Hadoop23Shims$2.class already added, skipping [INFO] org/apache/hadoop/hive/shims/Jetty23Shims$1.class already added, skipping [INFO] org/apache/hadoop/mapred/WebHCatJTShim23.class already added, skipping [INFO] META-INF/maven/org.apache.hive.shims/hive-shims-0.23/pom.xml already added, skipping [INFO] META-INF/maven/org.apache.hive.shims/hive-shims-0.23/pom.properties already added, skipping [INFO] [INFO] Reactor Summary: [INFO] [INFO] Hive .. FAILURE [1.170s] [INFO] Hive Ant Utilities SUCCESS [1.857s] [INFO] Hive Shims Common . SUCCESS [0.577s] [INFO] Hive Shims 0.20 ... SUCCESS [0.353s] [INFO] Hive Shims Secure Common .. SUCCESS [0.582s] [INFO] Hive Shims 0.20S .. SUCCESS [0.700s] [INFO] Hive Shims 0.23 ... SUCCESS [0.615s] [INFO] Hive Shims SUCCESS [1.709s] [INFO] Hive Common ... SUCCESS [3.335s] [INFO] Hive Serde SUCCESS [2.588s] [INFO] Hive Metastore SUCCESS [8.542s] [INFO] Hive Query Language ... SUCCESS [17.326s] [INFO] Hive Service .. SUCCESS [1.511s] [INFO] Hive JDBC . SUCCESS [0.306s] [INFO] Hive Beeline .. SUCCESS [0.283s]
Re: Review Request 15663: Hive should be able to skip header and footer rows when reading data file for a table
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15663/#review29093 --- common/src/java/org/apache/hadoop/hive/conf/HiveConf.java https://reviews.apache.org/r/15663/#comment56220 What does this mean exactly? Is this lines of footer or actual total number of footers? If it is number of footers, should say max number of footers ... itests/qtest/pom.xml https://reviews.apache.org/r/15663/#comment56221 is this really supposed to be in the patch? ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java https://reviews.apache.org/r/15663/#comment56228 Please put a paragraph of explanation of the header/footer skipping feature right in the code. Including what it is and how to use it. Also, please create web documentation for the new feature. Check with Lefty L. about where to put it. You could start by putting a first draft under https://cwiki.apache.org/confluence/display/Hive/DesignDocs. You could delete the design doc from there once the design becomes part of the Hive documentation. ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java https://reviews.apache.org/r/15663/#comment56222 Hive coding style guidelines say to put a blank line before all comments. Please check all your comments for this. ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java https://reviews.apache.org/r/15663/#comment56227 I recommend using skip.header.line.count instead of skip.header.number to make it explicit that you are skipping lines. Also, use skip.footer.line.count instead of skip.footer.number. ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java https://reviews.apache.org/r/15663/#comment56223 put blank after // before first word ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java https://reviews.apache.org/r/15663/#comment56225 Please put a comment before the is class explaining what it is for. ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java https://reviews.apache.org/r/15663/#comment56224 use camel case (footerCur) ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java https://reviews.apache.org/r/15663/#comment56226 Please run checkstyle. E.g. there should be a blank between ){ - Eric Hanson On Nov. 19, 2013, 1:31 a.m., Eric Hanson wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15663/ --- (Updated Nov. 19, 2013, 1:31 a.m.) Review request for hive and Thejas Nair. 
Bugs: HIVE-5795 https://issues.apache.org/jira/browse/HIVE-5795 Repository: hive-git Description --- Hive should be able to skip header and footer rows when reading data file for a table (I am uploading this on behalf of Shuaishuai Nie since he's not in the office) Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 32ab3d8 data/files/header_footer_table_1/0001.txt PRE-CREATION data/files/header_footer_table_1/0002.txt PRE-CREATION data/files/header_footer_table_1/0003.txt PRE-CREATION data/files/header_footer_table_2/2012/01/01/0001.txt PRE-CREATION data/files/header_footer_table_2/2012/01/02/0002.txt PRE-CREATION data/files/header_footer_table_2/2012/01/03/0003.txt PRE-CREATION itests/qtest/pom.xml a453d8a ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 5abcfc1 ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java dd5cb6b ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 0ec6e63 ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java 85dd975 ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 0686d9b ql/src/test/queries/clientpositive/file_with_header_footer.q PRE-CREATION ql/src/test/results/clientpositive/file_with_header_footer.q.out PRE-CREATION Diff: https://reviews.apache.org/r/15663/diff/ Testing --- Thanks, Eric Hanson
[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826075#comment-13826075 ] Eric Hanson commented on HIVE-5795: --- Shuaishuai -- please see my comments on ReviewBoard Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5795.1.patch Hive should be able to skip header and footer lines when reading a data file for a table. In this way, users don't need to preprocess data generated by another application with a header or footer and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties (skip.header.number=1, skip.footer.number=2); {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5635) WebHCatJTShim23 ignores security/user context
[ https://issues.apache.org/jira/browse/HIVE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-5635: - Attachment: HIVE-5635.3.patch WebHCatJTShim23 ignores security/user context - Key: HIVE-5635 URL: https://issues.apache.org/jira/browse/HIVE-5635 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-5635.2.patch, HIVE-5635.3.patch, HIVE-5635.patch WebHCatJTShim23 takes a UserGroupInformation object as an argument (which represents the user making the call to WebHCat, or the doAs user) but ignores it. WebHCatJTShim20S uses the UserGroupInformation. This is inconsistent and may be a security hole, because with Hadoop 2 the methods on WebHCatJTShim are likely running with 'hcat' as the user context. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826076#comment-13826076 ] Brock Noland commented on HIVE-5795: The patch has ArrayList on the LHS. It should be List or Collection. Hive should be able to skip header and footer rows when reading data file for a table - Key: HIVE-5795 URL: https://issues.apache.org/jira/browse/HIVE-5795 Project: Hive Issue Type: Bug Reporter: Shuaishuai Nie Assignee: Shuaishuai Nie Attachments: HIVE-5795.1.patch Hive should be able to skip header and footer lines when reading a data file for a table. In this way, users don't need to preprocess data generated by another application with a header or footer and can use the file directly for table operations. To implement this, the idea is to add new properties to the table description that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer: {code} Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties (skip.header.number=1, skip.footer.number=2); {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
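Brock's "ArrayList on the LHS" comment is the usual program-to-the-interface point; a tiny example of the preferred declaration style (variable names are arbitrary):
{code}
import java.util.ArrayList;
import java.util.List;

public class LhsTypeExample {
  public static void main(String[] args) {
    // Discouraged: the concrete class on the left-hand side ties callers to ArrayList.
    ArrayList<String> concrete = new ArrayList<String>();
    concrete.add("tied to ArrayList");

    // Preferred: declare against the interface so the implementation can change freely.
    List<String> lines = new ArrayList<String>();
    lines.add("programs to the List interface");

    System.out.println(concrete + " " + lines);
  }
}
{code}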
Re: [ANNOUNCE] New Hive Committer and PMC Member - Lefty Leverenz
Congratulations, Lefty! --Xuefu On Mon, Nov 18, 2013 at 3:52 PM, Vikram Dixit vik...@hortonworks.comwrote: Congrats Lefty! This is awesome. On Sun, Nov 17, 2013 at 7:53 AM, Jarek Jarcec Cecho jar...@apache.org wrote: Congratulations Lefty! Jarcec On Sat, Nov 16, 2013 at 09:20:00PM -0800, Carl Steinbach wrote: The Apache Hive PMC has voted to make Lefty Leverenz a committer and PMC member on the Apache Hive Project. Please join me in congratulating Lefty! Thanks. Carl -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Commented] (HIVE-5844) dynamic_partition_skip_default.q test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826100#comment-13826100 ] Harish Butani commented on HIVE-5844: - First of all, I am really sorry, I missed your .10.patch. But with this patch I get a diff too. I tested on Mac OSX. dynamic_partition_skip_default.q test fails on trunk Key: HIVE-5844 URL: https://issues.apache.org/jira/browse/HIVE-5844 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Trivial Attachments: HIVE-5844.1.patch.txt HIVE-5369 changes explain extended output to add statistics information. This breaks dynamic_partition_skip_default.q file on trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5844) dynamic_partition_skip_default.q test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5844: - Attachment: HIVE-5844.2.patch.txt Refreshed the trunk again and regenerated the golden file. dynamic_partition_skip_default.q test fails on trunk Key: HIVE-5844 URL: https://issues.apache.org/jira/browse/HIVE-5844 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Trivial Attachments: HIVE-5844.1.patch.txt, HIVE-5844.2.patch.txt HIVE-5369 changes explain extended output to add statistics information. This breaks dynamic_partition_skip_default.q file on trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5844) dynamic_partition_skip_default.q test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-5844: - Status: Patch Available (was: Open) marking it as patch available for HIVE QA to run tests. dynamic_partition_skip_default.q test fails on trunk Key: HIVE-5844 URL: https://issues.apache.org/jira/browse/HIVE-5844 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Trivial Attachments: HIVE-5844.1.patch.txt, HIVE-5844.2.patch.txt HIVE-5369 changes explain extended output to add statistics information. This breaks dynamic_partition_skip_default.q file on trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5356) Move arithmatic UDFs to generic UDF implementations
[ https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826130#comment-13826130 ] Hive QA commented on HIVE-5356: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614514/HIVE-5356.12.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4665 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/348/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/348/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12614514 Move arithmatic UDFs to generic UDF implementations --- Key: HIVE-5356 URL: https://issues.apache.org/jira/browse/HIVE-5356 Project: Hive Issue Type: Task Components: UDF Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5356.1.patch, HIVE-5356.10.patch, HIVE-5356.11.patch, HIVE-5356.12.patch, HIVE-5356.2.patch, HIVE-5356.3.patch, HIVE-5356.4.patch, HIVE-5356.5.patch, HIVE-5356.6.patch, HIVE-5356.7.patch, HIVE-5356.8.patch, HIVE-5356.9.patch Currently, all of the arithmetic operators, such as add/sub/mult/div, are implemented as old-style UDFs and java reflection is used to determine the return type TypeInfos/ObjectInspectors, based on the return type of the evaluate() method chosen for the expression. This works fine for types that don't have type params. Hive decimal type participates in these operations just like int or double. Different from double or int, however, decimal has precision and scale, which cannot be determined by just looking at the return type (decimal) of the UDF evaluate() method, even though the operands have certain precision/scale. With the default of decimal without precision/scale, then (10, 0) will be the type params. This is certainly not desirable. To solve this problem, all of the arithmetic operators would need to be implemented as GenericUDFs, which allow returning ObjectInspector during the initialize() method. The object inspectors returned can carry type params, from which the exact return type can be determined. It's worth mentioning that, for user UDF implemented in non-generic way, if the return type of the chosen evaluate() method is decimal, the return type actually has (10,0) as precision/scale, which might not be desirable. This needs to be documented. This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit the scope of review. The remaining ones will be covered under HIVE-5706. -- This message was sent by Atlassian JIRA (v6.1#6144)
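As background for why initialize() must compute the result ObjectInspector in HIVE-5356, the precision/scale of a decimal arithmetic result depends on the operands' type parameters. The sketch below shows one common derivation for addition in plain Java; Hive's exact rules, and the 38-digit cap assumed here, may differ.
{code}
public class DecimalAddTypeSketch {
  static final int MAX_PRECISION = 38;   // assumed cap, for illustration only

  // One common derivation for the (precision, scale) of d1 + d2;
  // Hive's actual formula may differ.
  public static int[] addResultType(int p1, int s1, int p2, int s2) {
    int scale = Math.max(s1, s2);
    int intDigits = Math.max(p1 - s1, p2 - s2) + 1;     // +1 for a possible carry
    int precision = Math.min(MAX_PRECISION, intDigits + scale);
    return new int[] { precision, scale };
  }

  public static void main(String[] args) {
    int[] t = addResultType(10, 2, 7, 4);               // decimal(10,2) + decimal(7,4)
    System.out.println("decimal(" + t[0] + "," + t[1] + ")");  // prints decimal(13,4)
  }
}
{code}
An old-style UDF whose evaluate() simply returns a decimal cannot convey this (13,4) result type, which is exactly the limitation the generic UDF rewrite addresses.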
Review Request 15666: HIVE-5847 DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15666/ --- Review request for hive and Thejas Nair. Bugs: HIVE-5847 https://issues.apache.org/jira/browse/HIVE-5847 Repository: hive-git Description --- - getColumns(): column_size, decimal_digits, num_prec_radix should use the proper type info for char/varchar/decimal - getColumns(): column_size set to 29 for timestamp, to match JDBC ResultSetMetadata - getColumns() and ResultSetMetadata should return same scale for timestamp (9). - Changed radix to 10 for all numeric types; was previously set to 2 for float/double Diffs - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 7b1c9da jdbc/src/java/org/apache/hive/jdbc/JdbcColumn.java 42ec32a service/src/java/org/apache/hive/service/cli/Type.java 9329392 service/src/java/org/apache/hive/service/cli/TypeDescriptor.java fb0236c service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java af87a90 service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 2daa9cd Diff: https://reviews.apache.org/r/15666/diff/ Testing --- Thanks, Jason Dere
[jira] [Commented] (HIVE-5847) DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
[ https://issues.apache.org/jira/browse/HIVE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826139#comment-13826139 ] Jason Dere commented on HIVE-5847: -- https://reviews.apache.org/r/15666/ DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal --- Key: HIVE-5847 URL: https://issues.apache.org/jira/browse/HIVE-5847 Project: Hive Issue Type: Bug Components: JDBC Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5847.1.patch column_size, decimal_digits, num_prec_radix should be set appropriately based on the type qualifiers. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826137#comment-13826137 ] Thejas M Nair commented on HIVE-5317: - Ed, For the data re-processing use case, this approach is not what is recommended. This approach is meant to be used for use cases where your changes to a partition are small fraction of the existing number of rows. Even with this approach, it still would make sense to partition your data by time for 'fact tables'. Your dimension table has *new* records being added periodically, making it more like the 'fact table' use case. This approach will also work with tables partitioned by time. Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: InsertUpdatesinHive.pdf Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the form of the queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (eg. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day a small set (up to 100k rows) of records need to be deleted for regulatory compliance. * Once an hour a log of transactions is exported from a RDBS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5847) DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
[ https://issues.apache.org/jira/browse/HIVE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5847: - Attachment: HIVE-5847.1.patch patch v1: - getColumns(): column_size, decimal_digits, num_prec_radix should use the proper type info for char/varchar/decimal - getColumns(): column_size set to 29 for timestamp, to match JDBC ResultSetMetadata - getColumns() and ResultSetMetadata should return same scale for timestamp (9). - Changed radix to 10 for all numeric types; was previously set to 2 for float/double DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal --- Key: HIVE-5847 URL: https://issues.apache.org/jira/browse/HIVE-5847 Project: Hive Issue Type: Bug Components: JDBC Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5847.1.patch column_size, decimal_digits, num_prec_radix should be set appropriately based on the type qualifiers. -- This message was sent by Atlassian JIRA (v6.1#6144)
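A quick client-side way to observe the fields this patch changes is to read them back through JDBC DatabaseMetaData.getColumns(); the connection URL, database, and table name below are placeholders.
{code}
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class GetColumnsCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder URL/credentials; adjust for your HiveServer2 instance.
    Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "");
    DatabaseMetaData meta = conn.getMetaData();
    ResultSet rs = meta.getColumns(null, "default", "testtable", "%");
    while (rs.next()) {
      // For char/varchar/decimal these should reflect the declared type qualifiers.
      System.out.println(rs.getString("COLUMN_NAME") + " " + rs.getString("TYPE_NAME")
          + " size=" + rs.getInt("COLUMN_SIZE")
          + " digits=" + rs.getInt("DECIMAL_DIGITS")
          + " radix=" + rs.getInt("NUM_PREC_RADIX"));
    }
    rs.close();
    conn.close();
  }
}
{code}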
[jira] [Updated] (HIVE-5847) DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal
[ https://issues.apache.org/jira/browse/HIVE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5847: - Status: Patch Available (was: Open) DatabaseMetadata.getColumns() doesn't show correct column size for char/varchar/decimal --- Key: HIVE-5847 URL: https://issues.apache.org/jira/browse/HIVE-5847 Project: Hive Issue Type: Bug Components: JDBC Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-5847.1.patch column_size, decimal_digits, num_prec_radix should be set appropriately based on the type qualifiers. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5356) Move arithmatic UDFs to generic UDF implementations
[ https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826140#comment-13826140 ] Xuefu Zhang commented on HIVE-5356: --- The above test failure was due to HIVE-5844. Move arithmatic UDFs to generic UDF implementations --- Key: HIVE-5356 URL: https://issues.apache.org/jira/browse/HIVE-5356 Project: Hive Issue Type: Task Components: UDF Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5356.1.patch, HIVE-5356.10.patch, HIVE-5356.11.patch, HIVE-5356.12.patch, HIVE-5356.2.patch, HIVE-5356.3.patch, HIVE-5356.4.patch, HIVE-5356.5.patch, HIVE-5356.6.patch, HIVE-5356.7.patch, HIVE-5356.8.patch, HIVE-5356.9.patch Currently, all of the arithmetic operators, such as add/sub/mult/div, are implemented as old-style UDFs and java reflection is used to determine the return type TypeInfos/ObjectInspectors, based on the return type of the evaluate() method chosen for the expression. This works fine for types that don't have type params. Hive decimal type participates in these operations just like int or double. Different from double or int, however, decimal has precision and scale, which cannot be determined by just looking at the return type (decimal) of the UDF evaluate() method, even though the operands have certain precision/scale. With the default of decimal without precision/scale, then (10, 0) will be the type params. This is certainly not desirable. To solve this problem, all of the arithmetic operators would need to be implemented as GenericUDFs, which allow returning ObjectInspector during the initialize() method. The object inspectors returned can carry type params, from which the exact return type can be determined. It's worth mentioning that, for user UDF implemented in non-generic way, if the return type of the chosen evaluate() method is decimal, the return type actually has (10,0) as precision/scale, which might not be desirable. This needs to be documented. This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit the scope of review. The remaining ones will be covered under HIVE-5706. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5849) Improve the stats of operators based on heuristics in the absence of any column statistics
Prasanth J created HIVE-5849: Summary: Improve the stats of operators based on heuristics in the absence of any column statistics Key: HIVE-5849 URL: https://issues.apache.org/jira/browse/HIVE-5849 Project: Hive Issue Type: Sub-task Reporter: Prasanth J Assignee: Prasanth J In the absence of any column statistics, operators will simply use the statistics from their parents. It is useful to apply some heuristics to update basic statistics (number of rows and data size) in the absence of any column statistics. This will be the worst-case scenario. -- This message was sent by Atlassian JIRA (v6.1#6144)
Hive-trunk-h0.21 - Build # 2458 - Still Failing
Changes for Build #2392 [brock] HIVE-5445 - PTest2 should use testonly target [hashutosh] HIVE-5490 : SUBSTR(col, 1, 0) returns wrong result in vectorized mode (Teddy Choi via Ashutosh Chauhan) [hashutosh] HIVE-4846 : Implement Vectorized Limit Operator (Sarvesh Sakalanaga via Ashutosh Chauhan) Changes for Build #2393 Changes for Build #2394 [hashutosh] HIVE-5494 : Vectorization throws exception with nested UDF. (Jitendra Nath Pandey via Ashutosh Chauhan) Changes for Build #2395 [brock] HIVE-5513 - Set the short version directly via build script (Prasad Mujumdar via Brock Noland) [brock] HIVE-5252 - Add ql syntax for inline java code creation (Edward Capriolo via Brock Noland) Changes for Build #2396 Changes for Build #2397 [hashutosh] HIVE-5512 : metastore filter pushdown should support between (Sergey Shelukhin via Ashutosh Chauhan) Changes for Build #2398 [hashutosh] HIVE-5479 : SBAP restricts hcat -e show databases (Sushanth Sowmyan via Ashutosh Chauhan) [hashutosh] HIVE-5485 : SBAP errors on null partition being passed into partition level authorization (Sushanth Sowmyan via Ashutosh Chauhan) [hashutosh] HIVE-5496 : hcat -e drop database if exists fails on authorizing non-existent null db (Sushanth Sowmyan via Ashutosh Chauhan) [hashutosh] HIVE-5474 : drop table hangs when concurrency=true (Jason Dere via Ashutosh Chauhan) Changes for Build #2399 [hashutosh] HIVE-5220 : Use factory methods to instantiate HiveDecimal instead of constructors (Xuefu Zhang via Ashutosh Chauhan) Changes for Build #2400 Changes for Build #2401 [ecapriolo] An explode function that includes the item's position in the array (Niko Stahl via egc) [brock] HIVE-5423 - Speed up testing of scalar UDFS (Edward Capriolo via Brock Noland) [thejas] HIVE-5508 : [WebHCat] ignore log collector e2e tests for Hadoop 2 (Daniel Dai via Thejas Nair) [thejas] HIVE-5535 : [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 (Daniel Dai via Thejas Nair) [brock] HIVE-5526 - NPE in ConstantVectorExpression.evaluate(vrg) (Remus Rusanu via Brock Noland) [thejas] HIVE-5509 : [WebHCat] TestDriverCurl to use string comparison for jobid (Daniel Dai via Thejas Nair) [thejas] HIVE-5507: [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness (Daniel Dai via Thejas Nair) [daijy] HIVE-5448: webhcat duplicate test TestMapReduce_2 should be removed (Thejas M Nair via Daniel Dai) [daijy] HIVE-5453 : jobsubmission2.conf should use 'timeout' property (Eugene Koifman via Daniel Dai) Changes for Build #2402 Changes for Build #2403 [thejas] HIVE-5531: Hiverserver2 doesn't honor command line argument when initializing log4j (Shuaishuai Nie via Thejas Nair) [hashutosh] HIVE-4821 : Implement vectorized type casting for all types (Eric Hanson via Ashutosh Chauhan) [brock] HIVE-5492 - Explain query fails with NPE if a client doesn't call getResultSetSchema() (Xuefu Zhang via Brock Noland) Changes for Build #2404 [hashutosh] HIVE-5546 : A change in ORCInputFormat made by HIVE4113 was reverted by HIVE5391 (Yin Huai via Ashutosh Chauhan) Changes for Build #2405 [brock] HIVE-5435 - Milestone 5: PTest2 maven support Changes for Build #2406 [thejas] Updating release notes with 0.12 release [hashutosh] HIVE-5517 : Implement end-to-end tests for vectorized string and math functions, and casts (Eric Hanson via Ashutosh Chauhan) Changes for Build #2407 [hashutosh] HIVE-4850 : Implement vectorized JOIN operators (Remus Rusanu via Ashutosh Chauhan) [brock] HIVE-5575: ZooKeeper connection closed when unlock 
with retry (Chun Chen via Brock Noland) [brock] HIVE-5548: Tests under common directory don't run as part of 'ant test' (Xuefu Zhang via Brock Noland) [gunther] HIVE-5525: Vectorized query failing for partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner) Changes for Build #2408 [daijy] HIVE-5133: webhcat jobs that need to access metastore fails in secure mode (Eugene Koifman via Daniel Dai) Changes for Build #2409 Changes for Build #2410 Changes for Build #2411 [hashutosh] HIVE-5411 : Migrate expression serialization to Kryo (Ashutosh Chauhan via Thejas Nair) Changes for Build #2412 [brock] HIVE-5578 - hcat script doesn't include jars from HIVE_AUX_JARS_PATH (Mohammad Kamrul Islam via Brock Noland) [brock] HIVE-5070 - Implement listLocatedStatus() in ProxyFileSystem for 0.23 shim (shanyu zhao via Brock Noland) [hashutosh] HIVE-5574 : Unnecessary newline at the end of message of ParserException (Navis via Ashutosh Chauhan) [navis] HIVE-5572 : Fails of non-sql command are not propagated to jdbc2 client (Navis reviewed by Brock Noland) [hashutosh] HIVE-5559 : Stats publisher fails for list bucketing when IDs are too long (Jason Dere via Ashutosh Chauhan) Changes for Build #2413 [brock] HIVE-5132 - Can't access to hwi due to 'No Java compiler available' (Bing Li via Edward Capriolo) [brock] HIVE-4957 - Restrict
[jira] [Assigned] (HIVE-5805) Support for operators like PTF, Script, Extract etc. in statistics annotation.
[ https://issues.apache.org/jira/browse/HIVE-5805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J reassigned HIVE-5805: Assignee: Prasanth J Support for operators like PTF, Script, Extract etc. in statistics annotation. -- Key: HIVE-5805 URL: https://issues.apache.org/jira/browse/HIVE-5805 Project: Hive Issue Type: Sub-task Components: Query Processor, Statistics Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.13.0 Statistics annotation (HIVE-5369) only supports table scan, select, filter, limit, union, groupby, join. This sub task is to add support for remaining operators. -- This message was sent by Atlassian JIRA (v6.1#6144)
Hive-trunk-hadoop2 - Build # 558 - Still Failing
Changes for Build #495 [brock] HIVE-5445 - PTest2 should use testonly target [hashutosh] HIVE-5490 : SUBSTR(col, 1, 0) returns wrong result in vectorized mode (Teddy Choi via Ashutosh Chauhan) [hashutosh] HIVE-4846 : Implement Vectorized Limit Operator (Sarvesh Sakalanaga via Ashutosh Chauhan) Changes for Build #496 Changes for Build #497 [hashutosh] HIVE-5494 : Vectorization throws exception with nested UDF. (Jitendra Nath Pandey via Ashutosh Chauhan) Changes for Build #498 [hashutosh] HIVE-5512 : metastore filter pushdown should support between (Sergey Shelukhin via Ashutosh Chauhan) [brock] HIVE-5513 - Set the short version directly via build script (Prasad Mujumdar via Brock Noland) [brock] HIVE-5252 - Add ql syntax for inline java code creation (Edward Capriolo via Brock Noland) Changes for Build #499 Changes for Build #500 [hashutosh] HIVE-5479 : SBAP restricts hcat -e show databases (Sushanth Sowmyan via Ashutosh Chauhan) [hashutosh] HIVE-5485 : SBAP errors on null partition being passed into partition level authorization (Sushanth Sowmyan via Ashutosh Chauhan) [hashutosh] HIVE-5496 : hcat -e drop database if exists fails on authorizing non-existent null db (Sushanth Sowmyan via Ashutosh Chauhan) [hashutosh] HIVE-5474 : drop table hangs when concurrency=true (Jason Dere via Ashutosh Chauhan) Changes for Build #501 [hashutosh] HIVE-5520 : Use factory methods to instantiate HiveDecimal instead of constructors (Xuefu Zhang via Ashutosh Chauhan) Changes for Build #502 [ecapriolo] An explode function that includes the item's position in the array (Niko Stahl via egc) [brock] HIVE-5423 - Speed up testing of scalar UDFS (Edward Capriolo via Brock Noland) [thejas] HIVE-5508 : [WebHCat] ignore log collector e2e tests for Hadoop 2 (Daniel Dai via Thejas Nair) [thejas] HIVE-5535 : [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 (Daniel Dai via Thejas Nair) [brock] HIVE-5526 - NPE in ConstantVectorExpression.evaluate(vrg) (Remus Rusanu via Brock Noland) [thejas] HIVE-5509 : [WebHCat] TestDriverCurl to use string comparison for jobid (Daniel Dai via Thejas Nair) [thejas] HIVE-5507: [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness (Daniel Dai via Thejas Nair) [daijy] HIVE-5448: webhcat duplicate test TestMapReduce_2 should be removed (Thejas M Nair via Daniel Dai) [daijy] HIVE-5453 : jobsubmission2.conf should use 'timeout' property (Eugene Koifman via Daniel Dai) Changes for Build #503 Changes for Build #504 [brock] HIVE-5492 - Explain query fails with NPE if a client doesn't call getResultSetSchema() (Xuefu Zhang via Brock Noland) Changes for Build #505 [hashutosh] HIVE-4821 : Implement vectorized type casting for all types (Eric Hanson via Ashutosh Chauhan) Changes for Build #506 [thejas] HIVE-5531: Hiverserver2 doesn't honor command line argument when initializing log4j (Shuaishuai Nie via Thejas Nair) Changes for Build #507 [hashutosh] HIVE-5546 : A change in ORCInputFormat made by HIVE4113 was reverted by HIVE5391 (Yin Huai via Ashutosh Chauhan) Changes for Build #508 [brock] HIVE-5435 - Milestone 5: PTest2 maven support Changes for Build #509 [thejas] Updating release notes with 0.12 release [hashutosh] HIVE-5517 : Implement end-to-end tests for vectorized string and math functions, and casts (Eric Hanson via Ashutosh Chauhan) Changes for Build #510 [hashutosh] HIVE-4850 : Implement vectorized JOIN operators (Remus Rusanu via Ashutosh Chauhan) [brock] HIVE-5575: ZooKeeper connection closed when unlock with retry (Chun 
Chen via Brock Noland) [brock] HIVE-5548: Tests under common directory don't run as part of 'ant test' (Xuefu Zhang via Brock Noland) [gunther] HIVE-5525: Vectorized query failing for partitioned tables. (Jitendra Nath Pandey via Gunther Hagleitner) Changes for Build #511 [daijy] HIVE-5133: webhcat jobs that need to access metastore fails in secure mode (Eugene Koifman via Daniel Dai) Changes for Build #512 Changes for Build #513 Changes for Build #514 [navis] HIVE-5572 : Fails of non-sql command are not propagated to jdbc2 client (Navis reviewed by Brock Noland) [hashutosh] HIVE-5559 : Stats publisher fails for list bucketing when IDs are too long (Jason Dere via Ashutosh Chauhan) [hashutosh] HIVE-5411 : Migrate expression serialization to Kryo (Ashutosh Chauhan via Thejas Nair) Changes for Build #515 [brock] HIVE-5132 - Can't access to hwi due to 'No Java compiler available' (Bing Li via Edward Capriolo) [brock] HIVE-4957 - Restrict number of bit vectors, to prevent out of Java heap memory (Shreepadma Venugopalan via Brock Noland) [brock] HIVE-5578 - hcat script doesn't include jars from HIVE_AUX_JARS_PATH (Mohammad Kamrul Islam via Brock Noland) [brock] HIVE-5070 - Implement listLocatedStatus() in ProxyFileSystem for 0.23 shim (shanyu zhao via Brock Noland) [hashutosh] HIVE-5574 : Unnecessary newline at the end of message
[jira] [Commented] (HIVE-5356) Move arithmatic UDFs to generic UDF implementations
[ https://issues.apache.org/jira/browse/HIVE-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826159#comment-13826159 ] Brock Noland commented on HIVE-5356: +1 Move arithmatic UDFs to generic UDF implementations --- Key: HIVE-5356 URL: https://issues.apache.org/jira/browse/HIVE-5356 Project: Hive Issue Type: Task Components: UDF Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.13.0 Attachments: HIVE-5356.1.patch, HIVE-5356.10.patch, HIVE-5356.11.patch, HIVE-5356.12.patch, HIVE-5356.2.patch, HIVE-5356.3.patch, HIVE-5356.4.patch, HIVE-5356.5.patch, HIVE-5356.6.patch, HIVE-5356.7.patch, HIVE-5356.8.patch, HIVE-5356.9.patch Currently, all of the arithmetic operators, such as add/sub/mult/div, are implemented as old-style UDFs and java reflection is used to determine the return type TypeInfos/ObjectInspectors, based on the return type of the evaluate() method chosen for the expression. This works fine for types that don't have type params. Hive decimal type participates in these operations just like int or double. Different from double or int, however, decimal has precision and scale, which cannot be determined by just looking at the return type (decimal) of the UDF evaluate() method, even though the operands have certain precision/scale. With the default of decimal without precision/scale, then (10, 0) will be the type params. This is certainly not desirable. To solve this problem, all of the arithmetic operators would need to be implemented as GenericUDFs, which allow returning ObjectInspector during the initialize() method. The object inspectors returned can carry type params, from which the exact return type can be determined. It's worth mentioning that, for user UDF implemented in non-generic way, if the return type of the chosen evaluate() method is decimal, the return type actually has (10,0) as precision/scale, which might not be desirable. This needs to be documented. This JIRA will cover minus, plus, divide, multiply, mod, and pmod, to limit the scope of review. The remaining ones will be covered under HIVE-5706. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5829) Rewrite Trim and Pad UDFs based on GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826167#comment-13826167 ] Hive QA commented on HIVE-5829: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12614483/HIVE-5829.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4622 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/349/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/349/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12614483 Rewrite Trim and Pad UDFs based on GenericUDF - Key: HIVE-5829 URL: https://issues.apache.org/jira/browse/HIVE-5829 Project: Hive Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-5829.1.patch This JIRA includes following UDFs: 1. trim() 2. ltrim() 3. rtrim() 4. lpad() 5. rpad() -- This message was sent by Atlassian JIRA (v6.1#6144)