[jira] [Updated] (HIVE-4645) Stat information like numFiles and totalSize is not correct when sub-directory is exists
[ https://issues.apache.org/jira/browse/HIVE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4645:
------------------------------
Attachment: HIVE-4645.D11037.1.patch

navis requested code review of "HIVE-4645 [jira] Stat information like numFiles and totalSize is not correct when sub-directory is exists".

Reviewers: JIRA

The test infer_bucket_sort_list_bucket.q returns 4096 as totalSize, but that is the size of the parent directory, not the sum of the file sizes.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11037

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
  ql/src/test/results/clientpositive/infer_bucket_sort_list_bucket.q.out

To: JIRA, navis

Stat information like numFiles and totalSize is not correct when sub-directory is exists
-----------------------------------------------------------------------------------------
Key: HIVE-4645
URL: https://issues.apache.org/jira/browse/HIVE-4645
Project: Hive
Issue Type: Test
Components: Statistics
Reporter: Navis
Assignee: Navis
Priority: Trivial
Attachments: HIVE-4645.D11037.1.patch

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
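The difference between "size of the parent directory" and "sum of file sizes" can be illustrated with a small sketch. This is not the code from the attached patch; it uses plain java.nio.file rather than Hadoop's FileSystem API, and the class name DirStats and its demo() helper are hypothetical names introduced only for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical illustration only; not taken from StatsTask.java.
final class DirStats {
    final long numFiles;
    final long totalSize;

    private DirStats(long numFiles, long totalSize) {
        this.numFiles = numFiles;
        this.totalSize = totalSize;
    }

    /** Walks root recursively and counts only regular files, never directory entries. */
    static DirStats collect(Path root) {
        long files = 0;
        long bytes = 0;
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p)) {
                    files++;
                    bytes += Files.size(p);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new DirStats(files, bytes);
    }

    /** Builds a throwaway tree (one file at the top, one in a sub-directory) and measures it. */
    static DirStats demo() {
        try {
            Path root = Files.createTempDirectory("stats-demo");
            Path sub = Files.createDirectory(root.resolve("sub"));
            Files.write(root.resolve("a.dat"), new byte[10]);
            Files.write(sub.resolve("b.dat"), new byte[32]);
            return collect(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the sub-directory itself contributes nothing to either statistic; only the two files inside the tree are counted, whatever size the local file system reports for the directory entries.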
Re: Review Request: HIVE-4620: MR temp directory conflicts in case of parallel execution mode
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11464/
-----------------------------------------------------------

(Updated June 3, 2013, 6:17 a.m.)

Review request for hive, Ashutosh Chauhan and Navis Ryu.

Changes
-------

Updated patch per review comments:
- Renamed taskID to taskRunnerID.
- Removed the extra call that set the per-thread runner id. It is already handled in the thread local's overridden initialValue().

Description
-------

MR temp directories conflict in Hive parallel execution mode. The patch adds a per-thread task counter to the MR scratch directory path set by Hive.

This addresses bug HIVE-4620.
    https://issues.apache.org/jira/browse/HIVE-4620

Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/Context.java 6466275
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java 56c2be6

Diff: https://reviews.apache.org/r/11464/diff/

Testing
-------

Manual testing, full unit test run.

Thanks,

Prasad Mujumdar
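The idea of a per-thread id supplied through the thread local's initialValue() can be sketched as follows. This is a minimal illustration, not the patch itself: the class TaskRunnerIds, the helper idOnNewThread() and the path suffix format are all assumptions made up for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the code in Context.java or TaskRunner.java.
final class TaskRunnerIds {
    private static final AtomicLong NEXT_ID = new AtomicLong(0);

    // Each thread receives its id lazily, on first access, through initialValue(),
    // so no explicit "set" call is needed when a runner thread starts.
    private static final ThreadLocal<Long> RUNNER_ID = new ThreadLocal<Long>() {
        @Override
        protected Long initialValue() {
            return NEXT_ID.incrementAndGet();
        }
    };

    static long currentId() {
        return RUNNER_ID.get();
    }

    /** Appends the calling thread's id to a base scratch path (suffix format is made up). */
    static String scratchDir(String base) {
        return base + "_" + currentId();
    }

    /** Returns the id that a freshly started thread observes. */
    static long idOnNewThread() {
        final long[] out = new long[1];
        Thread t = new Thread(() -> out[0] = currentId());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out[0];
    }
}
```

Because the id is part of the path, two tasks running on different threads would resolve different scratch directories in this sketch, while repeated calls on one thread keep returning the same directory.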
[jira] [Updated] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode
[ https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-4620:
----------------------------------
Attachment: HIVE-4620-3.patch

[~navis] Thanks for the comments. The original review request at https://reviews.apache.org/r/11464/ is updated with the new patch.

MR temp directory conflicts in case of parallel execution mode
---------------------------------------------------------------
Key: HIVE-4620
URL: https://issues.apache.org/jira/browse/HIVE-4620
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Fix For: 0.12.0
Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch

In parallel query execution mode, all tasks running in parallel end up sharing the same temp/scratch directory. This can lead to file conflicts and to temp files being deleted before the job completes.
[jira] [Created] (HIVE-4646) skewjoin.q is failing in hadoop2
Navis created HIVE-4646:
------------------------
Summary: skewjoin.q is failing in hadoop2
Key: HIVE-4646
URL: https://issues.apache.org/jira/browse/HIVE-4646
Project: Hive
Issue Type: Test
Components: Query Processor
Reporter: Navis
Assignee: Navis

HDFS-538 (https://issues.apache.org/jira/browse/HDFS-538) changed the behavior to throw an exception instead of returning null for a non-existent path, but the skew join resolver depends on the old behavior.
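One way for calling code to tolerate both behaviors (null for a missing path, or an exception) is a small wrapper that normalizes them to an empty result. The sketch below is an assumption-laden illustration, not the fix attached to this issue: Lister is a hypothetical stand-in interface, not a Hadoop API, and FileNotFoundException is assumed here as the exception type.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; Lister is not a Hadoop interface.
final class SafeListing {
    /** Stand-in for a file-system listing call that may return null or throw. */
    interface Lister {
        String[] list(String path) throws IOException;
    }

    /** Returns an empty array for a missing path under either the old or the new behavior. */
    static String[] listOrEmpty(Lister fs, String path) {
        try {
            String[] entries = fs.list(path);
            // Old behavior: null signals a non-existent path.
            return entries == null ? new String[0] : entries;
        } catch (FileNotFoundException e) {
            // New behavior: a non-existent path raises an exception.
            return new String[0];
        } catch (IOException e) {
            // Any other I/O failure is still surfaced to the caller.
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller written against such a wrapper only ever checks the array length, so it does not depend on which of the two behaviors the underlying file system implements.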
[jira] [Updated] (HIVE-4646) skewjoin.q is failing in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4646:
------------------------------
Attachment: HIVE-4646.D11043.1.patch

navis requested code review of "HIVE-4646 [jira] skewjoin.q is failing in hadoop2".

Reviewers: JIRA

HDFS-538 (https://issues.apache.org/jira/browse/HDFS-538) changed the behavior to throw an exception instead of returning null for a non-existent path, but the skew join resolver depends on the old behavior.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11043

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java

To: JIRA, navis

skewjoin.q is failing in hadoop2
--------------------------------
Key: HIVE-4646
URL: https://issues.apache.org/jira/browse/HIVE-4646
Project: Hive
Issue Type: Test
Components: Query Processor
Reporter: Navis
Assignee: Navis
Attachments: HIVE-4646.D11043.1.patch
[jira] [Commented] (HIVE-3949) Some test failures in hadoop 23
[ https://issues.apache.org/jira/browse/HIVE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13672883#comment-13672883 ]

Navis commented on HIVE-3949:
-----------------------------

I'm looking into the tests in TestCliDriver. Currently:
{noformat}
archive_excludeHadoop20.q
archive_multi.q
auto_join14.q : update result (changed default)
combine2.q
ctas_colname.q : non-deterministic
groupby_grouping_sets4.q: non-deterministic
infer_bucket_sort_list_bucket.q : HIVE-4645
input12.q : update result (added input hook)
input39.q : update result (added input hook)
join32_lessSize.q : non-deterministic
join_1to1.q
join_vc.q : HIVE-4626
list_bucket_query_oneskew_1.q : non-deterministic
list_bucket_query_oneskew_2.q : non-deterministic
list_bucket_query_oneskew_3.q : non-deterministic
multi_insert_lateral_view.q : non-deterministic
orc_diff_part_cols.q: non-deterministic
ptf_npath.q
recursive_dir.q : update result (added input hook)
sample_islocalmode_hook.q : update result (added input hook)
skewjoin.q : HIVE-4646
skewjoin_union_remove_1.q : update result (seemed not applied HIVE-948)
skewjoin_union_remove_2.q : update result (seemed not applied HIVE-948)
stats_partscan_1.q
truncate_column.q : non-deterministic
truncate_column_merge.q : non-deterministic
udaf_percentile_approx.q
{noformat}

Some test failures in hadoop 23
-------------------------------
Key: HIVE-3949
URL: https://issues.apache.org/jira/browse/HIVE-3949
Project: Hive
Issue Type: Bug
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

This is a follow-up to HIVE-3873. We have fixed some test failures in HIVE-3873 and a few other JIRA issues. We will use this JIRA to track the remaining failures: https://builds.apache.org/job/Hive-trunk-hadoop2/
[jira] [Updated] (HIVE-2615) CTAS with literal NULL creates VOID type
[ https://issues.apache.org/jira/browse/HIVE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhuoluo (Clark) Yang updated HIVE-2615:
---------------------------------------
Assignee: Zhuoluo (Clark) Yang

CTAS with literal NULL creates VOID type
----------------------------------------
Key: HIVE-2615
URL: https://issues.apache.org/jira/browse/HIVE-2615
Project: Hive
Issue Type: Bug
Reporter: David Phillips
Assignee: Zhuoluo (Clark) Yang

Create the table with a column that always contains NULL:
{quote}
hive> create table bad as select 1 x, null z from dual;
{quote}
Because there's no type, Hive gives it the VOID type:
{quote}
hive> describe bad;
OK
x int
z void
{quote}
This seems weird, because AFAIK, there is no normal way to create a column of type VOID. The problem is that the table can't be queried:
{quote}
hive> select * from bad;
OK
Failed with exception java.io.IOException:java.lang.RuntimeException: Internal error: no LazyObject for VOID
{quote}
Worse, even if you don't select that field, the query fails at runtime:
{quote}
hive> select x from bad;
...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
{quote}
Re: Review Request: HIVE-4546: Hive CLI leaves behind the per session resource directory on non-interactive invocation
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11083/
-----------------------------------------------------------

(Updated June 3, 2013, 8:29 a.m.)

Review request for hive, Owen O'Malley and Gunther Hagleitner.

Changes
-------

Thanks for the review comments. Updated the patch with better error handling for CliDriver.run().

Regarding the session id: the existing format is not very readable anyway; it is (userid + vmName + timestamp) plus the proposed counter. We can still run into edge conditions with multiple Hive CLIs or multiple Hive servers (e.g. for HA purposes) on different nodes. Using a UUID as the handle takes care of such cases.

Description
-------

Hive CLI leaves behind the per-session resource directory on non-interactive invocation. The patch executes the session state close() at the end of a non-interactive invocation. It also changes the session id format to a UUID. This is to avoid a possible resource directory path conflict when there are multiple HiveServer2 sessions from the same user at the same time.

This addresses bug HIVE-4546.
    https://issues.apache.org/jira/browse/HIVE-4546

Diffs (updated)
-----

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 4239392
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 8e6e24a

Diff: https://reviews.apache.org/r/11083/diff/

Testing
-------

Thanks,

Prasad Mujumdar
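As a rough sketch of the UUID idea: a random UUID carries no user name, host or timestamp, so two sessions started by the same user at the same moment still get different handles. The class and method names below are hypothetical, and the /tmp/..._resources layout simply mirrors the path quoted in the HIVE-4546 issue description; none of this is the patched SessionState code.

```java
import java.util.UUID;

// Hypothetical illustration only; not the code in SessionState.java.
final class SessionIds {
    /** Generates a session handle that does not depend on user, host or clock. */
    static String newSessionId() {
        return UUID.randomUUID().toString();
    }

    /** Derives a per-session resource directory in the style /tmp/<session id>_resources. */
    static String resourceDir(String sessionId) {
        return "/tmp/" + sessionId + "_resources";
    }
}
```

Two calls to newSessionId() made back to back on the same machine are expected to differ, which is the property a (userid + vmName + timestamp) handle cannot guarantee.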
[jira] [Updated] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation
[ https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-4546:
----------------------------------
Status: Patch Available (was: Open)

Thanks Ashutosh! Responded to the review comments and updated the patch.

Hive CLI leaves behind the per session resource directory on non-interactive invocation
----------------------------------------------------------------------------------------
Key: HIVE-4546
URL: https://issues.apache.org/jira/browse/HIVE-4546
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch

As part of HIVE-4505, the resource directory is set to /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The CLI fails to remove it when invoked with -f or -e (non-interactive mode).
[jira] [Updated] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation
[ https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-4546:
----------------------------------
Attachment: HIVE-4546-2.patch

Key: HIVE-4546
URL: https://issues.apache.org/jira/browse/HIVE-4546
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch
[jira] [Created] (HIVE-4647) RetryingHMSHandler logs too many error messages
Navis created HIVE-4647:
------------------------
Summary: RetryingHMSHandler logs too many error messages
Key: HIVE-4647
URL: https://issues.apache.org/jira/browse/HIVE-4647
Project: Hive
Issue Type: Improvement
Reporter: Navis
Assignee: Navis
Priority: Trivial

A NoSuchObjectException thrown by methods like getTable/getPartition need not be logged as an error, because it can be a normal outcome.
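The proposal amounts to classifying an exception before logging it. The sketch below shows one way such a check could look; it is only an assumption about the shape of a fix, and the nested NoSuchObjectException is a local stand-in for the metastore exception of the same name, defined here so the example is self-contained.

```java
// Hypothetical illustration only; not the code in RetryingHMSHandler.
final class QuietLookup {
    /** Local stand-in for the metastore's NoSuchObjectException. */
    static final class NoSuchObjectException extends Exception {
        NoSuchObjectException(String message) {
            super(message);
        }
    }

    /**
     * Returns false when the failure is, or wraps, a "no such object" condition,
     * which callers may treat as an ordinary lookup miss rather than an error.
     */
    static boolean shouldLogAsError(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof NoSuchObjectException) {
                return false;
            }
        }
        return true;
    }
}
```

Walking the cause chain matters when the call goes through a reflective proxy, since the original exception then arrives wrapped in another one.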
0.9.1 branch.
Could someone point me to hive 0.9.1 branch? Thanks in advance. Regards
error in running the hive test cases
Hi,

When I run the Hive test cases, I keep getting the following error:

[echo] Project: serde
[javac] Compiling 36 source files to /home/john/dev/hive-0.9.0-Intel/src/build/serde/test/classes
[javac] TestAvroSerdeUtils.java:24: cannot find symbol
[javac] symbol : class MiniDFSCluster
[javac] location: package org.apache.hadoop.hdfs
[javac] import org.apache.hadoop.hdfs.MiniDFSCluster;
[javac] ^
[javac] TestAvroSerdeUtils.java:184: cannot find symbol
[javac] symbol : class MiniDFSCluster
[javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
[javac] MiniDFSCluster miniDfs = null;
[javac] ^
[javac] TestAvroSerdeUtils.java:187: cannot find symbol
[javac] symbol : class MiniDFSCluster
[javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
[javac] miniDfs = new MiniDFSCluster(new Configuration(), 1, true, null);
[javac] ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

I am building Hive 0.9 and running the tests using "ant package test". Could you help?

Thanks
Re: 0.9.1 branch.
The 0.9.x branch is branch-0.9. There isn't a 0.9.1 release, and one is very unlikely.

http://svn.apache.org/repos/asf/hive/

-- Owen

On Jun 3, 2013, at 11:51, ur lops urlop...@gmail.com wrote:

> Could someone point me to hive 0.9.1 branch? Thanks in advance. Regards
Re: 0.9.1 branch.
Thanks Owen for the quick response. I am looking for HIVE-895, which claims that it was merged in 0.9.1 ( https://issues.apache.org/jira/browse/HIVE-895 ). How do I get that particular commit?

Regards

On Mon, Jun 3, 2013 at 2:59 AM, Owen O'Malley owen.omal...@gmail.com wrote:

> The 0.9.x branch is branch-0.9. There isn't a 0.9.1 release and it is very unlikely.
>
> http://svn.apache.org/repos/asf/hive/
>
> -- Owen
>
> On Jun 3, 2013, at 11:51, ur lops urlop...@gmail.com wrote:
>
>> Could someone point me to hive 0.9.1 branch? Thanks in advance. Regards
Re: 0.9.1 branch.
https://github.com/apache/hive/commit/e42ec89b31ae056e51d8db25d4ecc1a8a51212e0

On Mon, Jun 3, 2013 at 12:10 PM, ur lops urlop...@gmail.com wrote:

> Thanks Owen for the quick response. I am looking for hive-895, which claims that it is merged in 0.9.1. ( https://issues.apache.org/jira/browse/HIVE-895 ) How to get that particular commit?
>
> Regards
>
> On Mon, Jun 3, 2013 at 2:59 AM, Owen O'Malley owen.omal...@gmail.com wrote:
>
>> The 0.9.x branch is branch-0.9. There isn't a 0.9.1 release and it is very unlikely.
>>
>> http://svn.apache.org/repos/asf/hive/
>>
>> -- Owen
>>
>> On Jun 3, 2013, at 11:51, ur lops urlop...@gmail.com wrote:
>>
>>> Could someone point me to hive 0.9.1 branch? Thanks in advance. Regards
[jira] [Updated] (HIVE-4612) Vectorized aggregates do not emit proper rows in presence of GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-4612:
-------------------------------
Attachment: HIVE-4612.1.patch.txt

Add support for all types

Vectorized aggregates do not emit proper rows in presence of GROUP BY
----------------------------------------------------------------------
Key: HIVE-4612
URL: https://issues.apache.org/jira/browse/HIVE-4612
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Fix For: vectorization-branch
Attachments: HIVE-4612.0.patch.txt, HIVE-4612.1.patch.txt

I discovered this while testing the fix for HIVE-4451 and HIVE-4452. The VGBy emits the appropriate number of rows, but the row-mode ReduceSinkOperator logs only one row and the final result is incomplete. Investigating. Related to HIVE-4599.
Re: Review Request: HIVE-4612 Fix vector aggregates int type key output
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11427/
-----------------------------------------------------------

(Updated June 3, 2013, 2:08 p.m.)

Review request for hive.

Changes
-------

Added support for all currently supported types (tinyint, smallint, int, bigint, boolean, timestamp, string, float, double).

Description
-------

The VectorHashKeyValue output for the int key type was broken: M/R expects the emitted type to match the reduced type. Because a BinaryWriter was used with a LongWritable instead of an IntWritable, the value was effectively corrupted.

This addresses bug HIVE-4612.
    https://issues.apache.org/jira/browse/HIVE-4612

Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/VectorHashKeyWrapperBatch.java cd57151
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/TimestampUtils.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 91366dd
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorReduceSinkOperator.java f61fcb6
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 6bb5618
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java ffd7ef2
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java aeff313
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFAvgDouble.java 54102a4
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFAvgLong.java 8c6844b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdPopDouble.java a4084b0
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdPopLong.java 28fdb36
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdSampDouble.java 4fa52ff
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdSampLong.java 551ae8a
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFSumDouble.java a2e8fb3
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFSumLong.java 71b2e3d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarPopDouble.java 2dfbfa3
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarPopLong.java de4811d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarSampDouble.java 5a21f44
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarSampLong.java 7b88c4f
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFAvg.txt d85346d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFVar.txt daae57b
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/FakeVectorRowBatchFromObjectIterables.java 6824ee7
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorGroupByOperator.java 6fc230f

Diff: https://reviews.apache.org/r/11427/diff/

Testing
-------

manual test query

Thanks,

Remus Rusanu
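Why writing a key as a long and reading it back as an int corrupts the value can be shown with a small, Hadoop-free analogy. The sketch below uses java.nio.ByteBuffer (big-endian by default) in place of the Writable classes; the class and method names are hypothetical, and the actual serialization path in the patch may differ.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; an analogy, not Hive's serialization code.
final class WidthMismatch {
    /** Writer emits 8 bytes, reader consumes only the first 4 of them. */
    static int writeLongReadInt(long value) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        buf.putLong(value);
        buf.flip();
        // For small positive values the first four big-endian bytes are all zero.
        return buf.getInt();
    }

    /** Writer and reader agree on the 4-byte width, so the value survives. */
    static int writeIntReadInt(int value) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES);
        buf.putInt(value);
        buf.flip();
        return buf.getInt();
    }
}
```

In this analogy writeLongReadInt(42L) yields 0 while writeIntReadInt(42) yields 42, which mirrors the requirement that the emitted type match the type the reduce side expects.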
[jira] [Updated] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[ https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4403:
-----------------------------------
Affects Version/s: 0.11.0

Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
-----------------------------------------------------------------------------------------
Key: HIVE-4403
URL: https://issues.apache.org/jira/browse/HIVE-4403
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0
Reporter: Mark Grover
Assignee: Chu Tong
Fix For: 0.12.0
Attachments: HIVE-4403.patch, HIVE-4403.patch

While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings related to overriding final parameters in job.conf. This was on a pseudo distributed cluster. FWIW, I didn't see this happen on a fully-distributed cluster. Perhaps, Hive's job.conf is overriding some final parameters it shouldn't.

Here is what the warnings looked like:
{code}
2013-04-19 14:20:32,304 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2013-04-19 14:20:32,367 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
{code}
To reproduce, run a query like:
{code}
CREATE TABLE u_data (
  userid INT,
  movieid INT,
  rating INT,
  unixtime STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
{code}
Load some data into u_data, here is some sample data: https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data

Run a simple query on that data (on YARN/MR2)
{code}
INSERT OVERWRITE DIRECTORY '/tmp/count' SELECT COUNT(1) FROM u_data
{code}
[jira] [Resolved] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[ https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan resolved HIVE-4403.
------------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0

Committed to trunk. Thanks, Chu!

Key: HIVE-4403
URL: https://issues.apache.org/jira/browse/HIVE-4403
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mark Grover
Assignee: Chu Tong
Fix For: 0.12.0
Attachments: HIVE-4403.patch, HIVE-4403.patch
[jira] [Commented] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[ https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673166#comment-13673166 ]

Chu Tong commented on HIVE-4403:
--------------------------------

no problem, thank you for reviewing it [~ashutoshgupt...@gmail.com]

Key: HIVE-4403
URL: https://issues.apache.org/jira/browse/HIVE-4403
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0
Reporter: Mark Grover
Assignee: Chu Tong
Fix For: 0.12.0
Attachments: HIVE-4403.patch, HIVE-4403.patch
[jira] [Updated] (HIVE-3846) alter view rename NPEs with authorization on.
[ https://issues.apache.org/jira/browse/HIVE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-3846:
-----------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Teddy!

alter view rename NPEs with authorization on.
---------------------------------------------
Key: HIVE-3846
URL: https://issues.apache.org/jira/browse/HIVE-3846
Project: Hive
Issue Type: Bug
Components: Authorization
Affects Versions: 0.10.0, 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Teddy Choi
Fix For: 0.12.0
Attachments: HIVE-3846.1.patch.txt, HIVE-3846.2.patch.txt
[jira] [Updated] (HIVE-4615) Invalid column names allowed when created dynamically by a SerDe
[ https://issues.apache.org/jira/browse/HIVE-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4615:
-----------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gabriel!

Invalid column names allowed when created dynamically by a SerDe
-----------------------------------------------------------------
Key: HIVE-4615
URL: https://issues.apache.org/jira/browse/HIVE-4615
Project: Hive
Issue Type: Bug
Reporter: Gabriel Reid
Assignee: Gabriel Reid
Fix For: 0.12.0
Attachments: HIVE-4615.1.patch.txt

When a SerDe creates columns dynamically during table creation, the validity of the created column names is not checked. This makes it possible to create a table containing columns that can't be queried, which leads to issues when trying to query the created table. The same column name validation should be performed for dynamically-created columns as for other column names.

This behavior can be easily tested using the TestSerDe, by including a column name that contains an invalid identifier character (e.g. a period) in the list of columns to create.
[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode
[ https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673184#comment-13673184 ]

Navis commented on HIVE-4620:
-----------------------------

+1, running test.

Key: HIVE-4620
URL: https://issues.apache.org/jira/browse/HIVE-4620
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Fix For: 0.12.0
Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch
Hive-trunk-h0.21 - Build # 2125 - Still Failing
Changes for Build #2095

Changes for Build #2096

Changes for Build #2097
[cws] HIVE-4530. Enforce minmum ant version required in build script (Arup Malakar via cws)
[omalley] Preparing RELEASE_NOTES for Hive 0.11.0rc2.

Changes for Build #2098
[omalley] Update release notes for 0.11.0rc2
[omalley] HIVE-4527 Fix eclipse project template (Carl Steinbach via omalley)
[omalley] HIVE-4505 Hive can't load transforms with remote scripts. (Prasad Majumdar and Gunther Hagleitner via omalley)
[omalley] HIVE-4498 TestBeeLineWithArgs.testPositiveScriptFile fails (Thejas Nair via omalley)

Changes for Build #2099

Changes for Build #2100

Changes for Build #2101

Changes for Build #2102

Changes for Build #2103
[daijy] PIG-2955: Fix bunch of Pig e2e tests on Windows

Changes for Build #2104
[daijy] PIG-3069: Native Windows Compatibility for Pig E2E Tests and Harness

Changes for Build #2105
[omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley)
[omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley)

Changes for Build #2106

Changes for Build #2107
[omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley)

Changes for Build #2108

Changes for Build #2109

Changes for Build #2110

Changes for Build #2111
[omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Guther Hagleitner via omalley)
[omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley)

Changes for Build #2112

Changes for Build #2113
[gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates)

Changes for Build #2114
[gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table formatting (gates)

Changes for Build #2115

Changes for Build #2116
[navis] JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL (Richard Ding via Navis)

Changes for Build #2117

Changes for Build #2118

Changes for Build #2119

Changes for Build #2120

Changes for Build #2121
[navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to un-selected join keys in columnExprMap (Yin Huai via Navis)
[navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when mapjoin.mapreduce=true (Gunther Hagleitner via Navis)

Changes for Build #2122

Changes for Build #2123

Changes for Build #2124
[gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty Leverenz via gates)

Changes for Build #2125
[daijy] PIG-3337: Fix remaining Window e2e tests

All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2125)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2125/ to view the results.
[jira] [Created] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2
Hari Sekhon created HIVE-4648: - Summary: Add ability to set hadoop conf overrides in JDBC for HiveServer2 Key: HIVE-4648 URL: https://issues.apache.org/jira/browse/HIVE-4648 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.10.0 Reporter: Hari Sekhon It's possible in BeeLine to specify set command overrides of hadoop config variables, but I haven't seen any example code of how to do this in JDBC with HiveServer2. We need an ability to specify hadoop conf overrides on a per-session basis or even halfway through the session. See this Hive ticket for some background: https://issues.apache.org/jira/browse/HIVE-4644
[jira] [Updated] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-4648: -- Component/s: HiveServer2 Add ability to set hadoop conf overrides in JDBC for HiveServer2 Key: HIVE-4648 URL: https://issues.apache.org/jira/browse/HIVE-4648 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Hari Sekhon It's possible in BeeLine to specify set command overrides of hadoop config variables, but I haven't seen any example code of how to do this in JDBC with HiveServer2. We need an ability to specify hadoop conf overrides on a per-session basis or even halfway through the session. See this Hive ticket for some background: https://issues.apache.org/jira/browse/HIVE-4644
[jira] [Commented] (HIVE-4585) Remove unused MR Temp file localization from Tasks
[ https://issues.apache.org/jira/browse/HIVE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673324#comment-13673324 ] Ashutosh Chauhan commented on HIVE-4585: I think it makes sense to remove this piece of code. Executing a query locally (instead of on a cluster) isn't the common use case for Hive. So, unless anyone is really interested in optimizing that code path, it's better to get rid of it to lessen our technical debt. +1 Remove unused MR Temp file localization from Tasks -- Key: HIVE-4585 URL: https://issues.apache.org/jira/browse/HIVE-4585 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4585.1.patch HIVE-1408 introduced code that is currently commented out (i.e.: dead code), with a comment saying it needs further development (HIVE-1484). It's been like this for close to 3 years. I suggest removing the code until such time that someone picks up that work. At that time they can decide if they want to use this code or pursue another route (FS shim?).
Fwd: error in running the hive test cases
Hi, When I run the Hive test cases, I keep getting the following error: [echo] Project: serde [javac] Compiling 36 source files to /home/john/dev/hive-0.9.0-Intel/src/build/serde/test/classes [javac] TestAvroSerdeUtils.java:24: cannot find symbol [javac] symbol : class MiniDFSCluster [javac] location: package org.apache.hadoop.hdfs [javac] import org.apache.hadoop.hdfs.MiniDFSCluster; [javac] ^ [javac] TestAvroSerdeUtils.java:184: cannot find symbol [javac] symbol : class MiniDFSCluster [javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils [javac] MiniDFSCluster miniDfs = null; [javac] ^ [javac] TestAvroSerdeUtils.java:187: cannot find symbol [javac] symbol : class MiniDFSCluster [javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils [javac] miniDfs = new MiniDFSCluster(new Configuration(), 1, true, null); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. I am building Hive 0.9 and running the tests using ant package test. Can someone give me a pointer as to which jar is missing from the classpath and how to resolve it? Thanks
[jira] [Commented] (HIVE-4418) TestNegativeCliDriver failure message if cmd succeeds is misleading
[ https://issues.apache.org/jira/browse/HIVE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673348#comment-13673348 ] Ashutosh Chauhan commented on HIVE-4418: +1 TestNegativeCliDriver failure message if cmd succeeds is misleading --- Key: HIVE-4418 URL: https://issues.apache.org/jira/browse/HIVE-4418 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.10.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4418.1.patch If the .q test ends up succeeding (exit code == 0), then the test failure message is misleading. From the error it seems as if the command actually failed - {code} [junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 0 [junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. [junit] at junit.framework.Assert.fail(Assert.java:47) [junit] at org.apache.hadoop.hive.cli.TestNegativeCliDriver.runTest(TestNegativeCliDriver.java:121) [junit] at org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_desc_tab(TestNegativeCliDriver.java:102) {code}
Re: Review Request: HIVE-4547: A complex create view statement fails with new Antlr 3.4
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11084/#review21332 --- Ship it! LGTM. +1 (non-binding). - Shreepadma Venugopalan On May 13, 2013, 9:06 a.m., Prasad Mujumdar wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11084/ --- (Updated May 13, 2013, 9:06 a.m.) Review request for hive and Ashutosh Chauhan. Description --- The parser has a translation map where it's possible to replace all the text with the appropriate escaped version in case of a view creation. This holds all individual translations and where they apply in the view definition. The newer antlr version seems to be more restrictive and throws an assertion if there are overlaps in these escape positions. The original patch for the antlr upgrade added a check to take care of some of the simpler overlap cases found by unit tests. There are a few more scenarios, like the one in the customer case, which are not covered. The patch traverses the list of translations in a loop and looks for all the possible overlaps. This addresses bug HIVE-4547. https://issues.apache.org/jira/browse/HIVE-4547 Diffs - data/files/v1.txt PRE-CREATION data/files/v2.txt PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/UnparseTranslator.java ec2c088 ql/src/test/queries/clientpositive/view_cast.q PRE-CREATION ql/src/test/results/clientpositive/view_cast.q.out PRE-CREATION Diff: https://reviews.apache.org/r/11084/diff/ Testing --- Ran full test suite. Added new test. Thanks, Prasad Mujumdar
[jira] [Commented] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673367#comment-13673367 ] Shreepadma Venugopalan commented on HIVE-4648: -- [~harisekhon]: Any config variable that can be set/unset through the command line can also be set/unset through JDBC. To do so, you'd need to execute a statement of the form set config.var = value. To set the scratch dir, you can do the following in JDBC, {noformat} statement.execute("set hive.exec.scratchdir = /tmp/mydir"); {noformat} Note that this property is set only for that particular JDBC connection. Add ability to set hadoop conf overrides in JDBC for HiveServer2 Key: HIVE-4648 URL: https://issues.apache.org/jira/browse/HIVE-4648 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Hari Sekhon It's possible in BeeLine to specify set command overrides of hadoop config variables, but I haven't seen any example code of how to do this in JDBC with HiveServer2. We need an ability to specify hadoop conf overrides on a per-session basis or even halfway through the session. See this Hive ticket for some background: https://issues.apache.org/jira/browse/HIVE-4644
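The approach described in this comment can be sketched as a small helper that issues one set statement per override on an existing JDBC connection. This is an illustration under assumptions, not code from Hive: the class and method names (SessionOverrides, setCommand, apply) are hypothetical, and only the standard java.sql interfaces are used.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Map;

// Hypothetical helper; names are chosen for this example only.
final class SessionOverrides {
    // Builds the text of a set command, e.g. "set hive.exec.scratchdir=/tmp/mydir".
    static String setCommand(String key, String value) {
        return "set " + key + "=" + value;
    }

    // Executes one set command per entry. As the comment above notes, the
    // values apply only to the connection the statements are run on.
    static void apply(Connection conn, Map<String, String> overrides) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            for (Map.Entry<String, String> e : overrides.entrySet()) {
                stmt.execute(setCommand(e.getKey(), e.getValue()));
            }
        }
    }
}
```

The Connection would come from DriverManager.getConnection with a HiveServer2 JDBC URL; the host, port, and credentials depend on the deployment and are left out here. Because the helper can be called at any time on an open connection, it also covers changing a value partway through a session.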
[jira] [Commented] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673369#comment-13673369 ] Shreepadma Venugopalan commented on HIVE-4648: -- Please note that setting hive.exec.scratchdir is just an example of doing sets through JDBC. Add ability to set hadoop conf overrides in JDBC for HiveServer2 Key: HIVE-4648 URL: https://issues.apache.org/jira/browse/HIVE-4648 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Hari Sekhon It's possible in BeeLine to specify set command overrides of hadoop config variables, but I haven't seen any example code of how to do this in JDBC with HiveServer2. We need an ability to specify hadoop conf overrides on a per-session basis or even halfway through the session. See this Hive ticket for some background: https://issues.apache.org/jira/browse/HIVE-4644
[jira] [Updated] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated HIVE-2206: --- Affects Version/s: (was: 0.10.0) 0.12.0 Status: In Progress (was: Patch Available) add a new optimizer for query correlation discovery and optimization Key: HIVE-2206 URL: https://issues.apache.org/jira/browse/HIVE-2206 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.12.0 Reporter: He Yongqiang Assignee: Yin Huai Attachments: HIVE-2206.10-r1384442.patch.txt, HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, HIVE-2206.15-r1392491.patch.txt, HIVE-2206.16-r1399936.patch.txt, HIVE-2206.17-r1404933.patch.txt, HIVE-2206.18-r1407720.patch.txt, HIVE-2206.19-r1410581.patch.txt, HIVE-2206.1.patch.txt, HIVE-2206.20-r1434012.patch.txt, HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, HIVE-2206.8-r1237253.patch.txt, testQueries.2.q, YSmartPatchForHive.patch This issue proposes a new logical optimizer called Correlation Optimizer, which is used to merge correlated MapReduce jobs (MR jobs) into a single MR job. The idea is based on YSmart (http://ysmart.cse.ohio-state.edu/). The paper and slides of YSmart are linked at the bottom. Since Hive translates queries in a sentence-by-sentence fashion, for every operation which may need to shuffle the data (e.g. join and aggregation operations), Hive will generate a MapReduce job for that operation. However, such operations may be correlated in the ways explained below, in which case they can be executed in a single MR job. 
# Input Correlation: Multiple MR jobs have input correlation (IC) if their input relation sets are not disjoint; # Transit Correlation: Multiple MR jobs have transit correlation (TC) if they have not only input correlation, but also the same partition key; # Job Flow Correlation: An MR job has job flow correlation (JFC) with one of its child nodes if it has the same partition key as that child node. The current implementation of the correlation optimizer only detects correlations among MR jobs for reduce-side join operators and reduce-side aggregation operators (not map-only aggregation). A query will be optimized if it satisfies the following conditions. # There exists an MR job for a reduce-side join operator or reduce-side aggregation operator which has JFC with all of its parent MR jobs (TCs will also be exploited if JFC exists); # All input tables of those correlated MR jobs are original input tables (not intermediate tables generated by sub-queries); and # No self join is involved in those correlated MR jobs. The correlation optimizer is implemented as a logical optimizer. The main reasons are that it only needs to manipulate the query plan tree and it can leverage the existing component for generating MR jobs. The current implementation can serve as a framework for correlation-related optimizations. I think that it is better than adding individual optimizers. Several things can be done in the future to improve this optimizer. Here are three examples. # Support queries that only involve TC; # Support queries in which the input tables of correlated MR jobs include intermediate tables; and # Optimize queries involving self joins. References: Paper and presentation of YSmart. Paper: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf Slides: http://sdrv.ms/UpwJJc
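The three correlation definitions in this description can be written down as simple predicates. The sketch below is only an illustration of those definitions: the Job class and the method names are hypothetical, and nothing here is taken from the Correlation Optimizer patch itself, which works on Hive's operator tree rather than on a toy job model.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the definitions above; not Hive code.
final class CorrelationSketch {
    // A toy MR job: the relations it reads and the key it partitions on.
    static final class Job {
        final Set<String> inputs;
        final List<String> partitionKey;

        Job(List<String> inputs, List<String> partitionKey) {
            this.inputs = new HashSet<>(inputs);
            this.partitionKey = new ArrayList<>(partitionKey);
        }
    }

    // Input correlation (IC): the input relation sets are not disjoint.
    static boolean inputCorrelated(Job a, Job b) {
        return !Collections.disjoint(a.inputs, b.inputs);
    }

    // Transit correlation (TC): IC plus the same partition key.
    static boolean transitCorrelated(Job a, Job b) {
        return inputCorrelated(a, b) && a.partitionKey.equals(b.partitionKey);
    }

    // Job flow correlation (JFC): a job and one of its child nodes share the
    // partition key. The caller is assumed to pass an actual job/child pair.
    static boolean jobFlowCorrelated(Job job, Job child) {
        return job.partitionKey.equals(child.partitionKey);
    }
}
```

For example, two jobs that both read a table t2 and both partition on the same column are transit correlated, while two jobs that share an input but partition on different columns have input correlation only.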
[jira] [Updated] (HIVE-4451) Add support for string column type vector aggregates: COUNT, MIN and MAX
[ https://issues.apache.org/jira/browse/HIVE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4451: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch. Thanks, Remus! Add support for string column type vector aggregates: COUNT, MIN and MAX Key: HIVE-4451 URL: https://issues.apache.org/jira/browse/HIVE-4451 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Remus Rusanu Assignee: Remus Rusanu Fix For: vectorization-branch Attachments: HIVE-4451.0.patch.txt Extend the vector aggregates operations to support string types. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4592) fix failure to set output isNull to true and other NULL propagation issues; update arithmetic tests
[ https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4592: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch. Thanks, Eric! fix failure to set output isNull to true and other NULL propagation issues; update arithmetic tests --- Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4592.1.patch, HIVE-4592.3.patch, HIVE-4592.4.patch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch.
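The bug described here comes from reusing an output column whose noNulls flag still holds a value from an earlier batch. A minimal sketch of the intended behaviour follows, assuming a much simplified column type: LongColumn and NullPropagationSketch are hypothetical names, and the real vectorization code also handles cases such as repeating values and selection vectors that are omitted here.

```java
// Hypothetical, simplified column type; not Hive's actual vector classes.
final class LongColumn {
    final long[] vector;
    final boolean[] isNull;
    boolean noNulls = true;

    LongColumn(int size) {
        vector = new long[size];
        isNull = new boolean[size];
    }
}

final class NullPropagationSketch {
    // Adds the first n entries of a and b into out, propagating NULLs.
    static void add(LongColumn a, LongColumn b, LongColumn out, int n) {
        // Always recompute the flag: out may be reused from a previous batch
        // in which it was left as false.
        out.noNulls = a.noNulls && b.noNulls;
        for (int i = 0; i < n; i++) {
            // isNull entries are only meaningful when noNulls is false.
            boolean aNull = !a.noNulls && a.isNull[i];
            boolean bNull = !b.noNulls && b.isNull[i];
            out.isNull[i] = aNull || bNull;
            out.vector[i] = out.isNull[i] ? 0L : a.vector[i] + b.vector[i];
        }
    }
}
```

The key line is the unconditional assignment to out.noNulls; setting the flag only when a NULL is seen would leave a stale false in place when both inputs are NULL-free.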
[jira] [Commented] (HIVE-4612) Vectorized aggregates do not emit proper rows in presence of GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673403#comment-13673403 ] Ashutosh Chauhan commented on HIVE-4612: Not specific to this patch, but VectorHashKeyWrapperBatch.java should be in the vector package (instead of exec). Can you file a follow-up jira to move that file? Vectorized aggregates do not emit proper rows in presence of GROUP BY - Key: HIVE-4612 URL: https://issues.apache.org/jira/browse/HIVE-4612 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Remus Rusanu Assignee: Remus Rusanu Fix For: vectorization-branch Attachments: HIVE-4612.0.patch.txt, HIVE-4612.1.patch.txt I discovered this while testing the fix for HIVE-4451 and HIVE-4452. The VGBy is emitting the appropriate number of rows, but the row-mode ReduceSinkOperator only logs one row and the final result is incomplete. Investigating. Related to HIVE-4599.
[jira] [Updated] (HIVE-4612) Vectorized aggregates do not emit proper rows in presence of GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4612: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch. Thanks, Remus! Vectorized aggregates do not emit proper rows in presence of GROUP BY - Key: HIVE-4612 URL: https://issues.apache.org/jira/browse/HIVE-4612 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Remus Rusanu Assignee: Remus Rusanu Fix For: vectorization-branch Attachments: HIVE-4612.0.patch.txt, HIVE-4612.1.patch.txt I discovered this while testing the fix for HIVE-4451 and HIVE-4452. The VGBy is emitting the appropriate number of rows, but the row-mode ReduceSinkOperator only logs one row and the final result is incomplete. Investigating. Related to HIVE-4599.
[jira] [Commented] (HIVE-4608) Vectorized UDFs for Timestamp in nanoseconds
[ https://issues.apache.org/jira/browse/HIVE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673413#comment-13673413 ] Ashutosh Chauhan commented on HIVE-4608: [~gopalv] Patch is not applying cleanly on top of svn branch with patch (tried both -p0 and -p1). Can you regenerate so that it applies cleanly on svn vectorization branch? Vectorized UDFs for Timestamp in nanoseconds Key: HIVE-4608 URL: https://issues.apache.org/jira/browse/HIVE-4608 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Gopal V Assignee: Gopal V Priority: Minor Labels: vectorization Attachments: 0001-Vectorized-UDFs-for-timestamp-functions-which-accept.patch, 0002-Update-patch-to-the-review-comments-in-https-reviews.patch Vectorized UDFs for timestamp functions which accept long vectors VectorUDFYearLong VectorUDFMonthLong VectorUDFWeekOfYearLong VectorUDFDayOfMonthLong VectorUDFHourLong VectorUDFMinuteLong VectorUDFSecondLong VectorUDFUnixTimeStampLong and tests for them against their non-vectorized implementation.
[jira] [Resolved] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-4648. -- Resolution: Not A Problem Add ability to set hadoop conf overrides in JDBC for HiveServer2 Key: HIVE-4648 URL: https://issues.apache.org/jira/browse/HIVE-4648 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Hari Sekhon It's possible in BeeLine to specify set command overrides of hadoop config variables, but I haven't seen any example code of how to do this in JDBC with HiveServer2. We need an ability to specify hadoop conf overrides on a per-session basis or even halfway through the session. See this Hive ticket for some background: https://issues.apache.org/jira/browse/HIVE-4644
[jira] [Commented] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673473#comment-13673473 ] Hari Sekhon commented on HIVE-4648: --- Thanks. Is there a particular doc that I missed? Add ability to set hadoop conf overrides in JDBC for HiveServer2 Key: HIVE-4648 URL: https://issues.apache.org/jira/browse/HIVE-4648 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 0.10.0 Reporter: Hari Sekhon It's possible in BeeLine to specify set command overrides of hadoop config variables, but I haven't seen any example code of how to do this in JDBC with HiveServer2. We need an ability to specify hadoop conf overrides on a per-session basis or even halfway through the session. See this Hive ticket for some background: https://issues.apache.org/jira/browse/HIVE-4644
[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673508#comment-13673508 ] Shreepadma Venugopalan commented on HIVE-4629: -- [~cwsteinbach]: Can you look at this? Thanks! HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client.
Re: Review Request: HIVE-4513 - disable hivehistory logs by default
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11029/#review21352 --- data/conf/hive-site.xml https://reviews.apache.org/r/11029/#comment44263 Is there a reason for this to be set to true for tests? Unless there is, we should set config in tests to the default values, since we should test default configs. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java https://reviews.apache.org/r/11029/#comment44264 Doesn't read right. I guess you wanted ... statistics into a file. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java https://reviews.apache.org/r/11029/#comment44266 This is an existing comment which doesn't read right. But since we are doing major surgery on HiveHistory, it will be good to update it to make it more sensible. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java https://reviews.apache.org/r/11029/#comment44268 I think the word job is not required in this comment. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java https://reviews.apache.org/r/11029/#comment44269 I think query is a better word than job here. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java https://reviews.apache.org/r/11029/#comment44270 Better worded as Called at the end of query. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java https://reviews.apache.org/r/11029/#comment44271 Again, use of the word job is confusing; we should use query here as well. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java https://reviews.apache.org/r/11029/#comment44272 Incorrect comment. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java https://reviews.apache.org/r/11029/#comment44274 Function name is IdtoTable, but comment says table to id. One of these needs to be corrected. 
ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44275 Similar comment as in HiveHistory.java ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44277 Should this be hive.ql.exec.HiveHistoryImpl to avoid confusion? ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44278 and instead of an ? ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44280 In case of an incorrect config, should this throw an exception instead of silently returning? Otherwise there will be errors later when something tries to write to the history file. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44281 Same comment as above. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44283 This should be a static class variable, otherwise nextInt() will return the same value for each invocation. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44284 Instead of / we should use File.separator ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44287 Consider using File.createNewFile here. ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44288 Use System.getProperty("line.separator") instead of \n ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java https://reviews.apache.org/r/11029/#comment44289 start of query ? 
ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java https://reviews.apache.org/r/11029/#comment44291 Missing apache header ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java https://reviews.apache.org/r/11029/#comment44292 HiveHistoryViewer.class - Ashutosh Chauhan On May 13, 2013, 10:12 p.m., Thejas Nair wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11029/ --- (Updated May 13, 2013, 10:12 p.m.) Review request for hive. Description --- HiveHistory log files (hive_job_log_hive_*.txt files) store information about a Hive query, such as the query string, plan, counters and MR job progress information. There is no mechanism to delete these files, and as a result they accumulate over time, using up a lot of disk space. I don't think this is used by most people, so I think it would be better to turn this off by default. Jobtracker logs already capture most of this information, though it is not as structured as history logs.
[jira] [Created] (HIVE-4649) Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation
Jitendra Nath Pandey created HIVE-4649: -- Summary: Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation Key: HIVE-4649 URL: https://issues.apache.org/jira/browse/HIVE-4649 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey The test fails due to a bug in ColumnCompareScalar.txt
[jira] [Commented] (HIVE-4513) disable hivehistory logs by default
[ https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673559#comment-13673559 ] Ashutosh Chauhan commented on HIVE-4513: [~thejas] Left some comments on RB. disable hivehistory logs by default --- Key: HIVE-4513 URL: https://issues.apache.org/jira/browse/HIVE-4513 Project: Hive Issue Type: Bug Components: Configuration, Logging Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, HIVE-4513.4.patch HiveHistory log files (hive_job_log_hive_*.txt files) store information about a Hive query, such as the query string, plan, counters and MR job progress information. There is no mechanism to delete these files, and as a result they accumulate over time, using up a lot of disk space. I don't think this is used by most people, so I think it would be better to turn this off by default. Jobtracker logs already capture most of this information, though it is not as structured as history logs.
[jira] [Updated] (HIVE-4513) disable hivehistory logs by default
[ https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4513: --- Status: Open (was: Patch Available) Canceling patch for now. disable hivehistory logs by default --- Key: HIVE-4513 URL: https://issues.apache.org/jira/browse/HIVE-4513 Project: Hive Issue Type: Bug Components: Configuration, Logging Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, HIVE-4513.4.patch HiveHistory log files (hive_job_log_hive_*.txt files) store information about a Hive query, such as the query string, plan, counters and MR job progress information. There is no mechanism to delete these files, and as a result they accumulate over time, using up a lot of disk space. I don't think this is used by most people, so I think it would be better to turn this off by default. Jobtracker logs already capture most of this information, though it is not as structured as history logs.
[jira] [Updated] (HIVE-4343) HS2 with kerberos- local task for map join fails
[ https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4343:

Status: Open (was: Patch Available)

This will be redundant if we get in HIVE-4470.

HS2 with kerberos- local task for map join fails

Key: HIVE-4343
URL: https://issues.apache.org/jira/browse/HIVE-4343
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Attachments: HIVE-4343.1.patch

With HiveServer2 configured with Kerberos security, when a (map) join query is run, it results in failure with GSSException: No valid credentials provided
[jira] [Updated] (HIVE-4171) Current database in metastore.Hive is not consistent with SessionState
[ https://issues.apache.org/jira/browse/HIVE-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4171:

Status: Open (was: Patch Available)

Canceling patch for now.

Current database in metastore.Hive is not consistent with SessionState

Key: HIVE-4171
URL: https://issues.apache.org/jira/browse/HIVE-4171
Project: Hive
Issue Type: Bug
Components: CLI
Reporter: Navis
Assignee: Thejas M Nair
Labels: HiveServer2
Attachments: HIVE-4171.3.patch, HIVE-4171.4.patch, HIVE-4171.D9399.1.patch, HIVE-4171.D9399.2.patch

metastore.Hive is a thread-local instance, so its state can differ from that of SessionState. Currently the only state held in metastore.Hive is the name of the database in use.
[jira] [Commented] (HIVE-4611) SMB joins fail based on bigtable selection policy.
[ https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673573#comment-13673573 ]

Ashutosh Chauhan commented on HIVE-4611:

[~vikram.dixit] Is this ready for review? If so, can you create a Phabricator or RB link and mark it Patch Available?

SMB joins fail based on bigtable selection policy.

Key: HIVE-4611
URL: https://issues.apache.org/jira/browse/HIVE-4611
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Fix For: 0.11.1
Attachments: HIVE-4611.patch

The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with the largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables.
[jira] [Updated] (HIVE-4502) NPE - subquery smb joins fails
[ https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4502:

Assignee: Vikram Dixit K (was: Navis)

NPE - subquery smb joins fails

Key: HIVE-4502
URL: https://issues.apache.org/jira/browse/HIVE-4502
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, smb_mapjoin_25.q

Found this issue while running some SMB joins. Attaching test case that causes this error.
[jira] [Commented] (HIVE-4502) NPE - subquery smb joins fails
[ https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673578#comment-13673578 ]

Ashutosh Chauhan commented on HIVE-4502:

[~navis] Would you like to take a look at Vikram's patch? I think it's better if we can retain SMB joins instead of converting them to reduce-side joins.

NPE - subquery smb joins fails

Key: HIVE-4502
URL: https://issues.apache.org/jira/browse/HIVE-4502
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, smb_mapjoin_25.q

Found this issue while running some SMB joins. Attaching test case that causes this error.
[jira] [Updated] (HIVE-4649) Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation
[ https://issues.apache.org/jira/browse/HIVE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-4649:

Attachment: HIVE-4649.1.patch

Attached patch fixes the issue.

Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation

Key: HIVE-4649
URL: https://issues.apache.org/jira/browse/HIVE-4649
Project: Hive
Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-4649.1.patch

The test fails due to a bug in ColumnCompareScalar.txt
[jira] [Commented] (HIVE-3876) call resetValid instead of ensureCapacity in the constructor of BytesRefArrayWritable
[ https://issues.apache.org/jira/browse/HIVE-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673590#comment-13673590 ]

Ashutosh Chauhan commented on HIVE-3876:

[~yhuai] Are you still working on this? Or shall we close this as Not A Problem?

call resetValid instead of ensureCapacity in the constructor of BytesRefArrayWritable

Key: HIVE-3876
URL: https://issues.apache.org/jira/browse/HIVE-3876
Project: Hive
Issue Type: Improvement
Components: Serializers/Deserializers
Affects Versions: 0.10.0
Reporter: Yin Huai
Assignee: Yin Huai
Priority: Minor
Attachments: HIVE-3876.1.patch.txt

In the constructor of BytesRefArrayWritable, ensureCapacity(capacity) is called, but valid has not been adjusted accordingly. After a new BytesRefArrayWritable has been created with an initial capacity of x, if resetValid() has not been called explicitly, the size returned is still 0.
[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673595#comment-13673595 ]

Ashutosh Chauhan commented on HIVE-4435:

Sorry for the delay. +1. Will commit if tests pass.

Column stats: Distinct value estimator should use hash functions that are pairwise independent

Key: HIVE-4435
URL: https://issues.apache.org/jira/browse/HIVE-4435
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Shreepadma Venugopalan
Attachments: chart_1(1).png, HIVE-4435.1.patch

The current implementation of the Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence.
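For readers unfamiliar with the term, a pairwise independent hash family can be sketched as below. This is only an illustration of the textbook construction h(x) = (a*x + b) mod p with a prime p; it is not taken from the HIVE-4435 patch, and the class name, the choice of p, and the helper methods are all assumptions made for the example.

```java
import java.util.Random;

/** Illustrative only: the (a*x + b) mod p family, NOT Hive's estimator code. */
final class PairwiseHashSketch {
    // Assumed modulus: the prime 2^31 - 1. Keys are reduced into [0, P) first,
    // so the independence guarantee only covers keys that differ modulo P.
    static final long P = 2147483647L;

    private final long a;
    private final long b;

    PairwiseHashSketch(long a, long b) {
        if (a < 0 || a >= P || b < 0 || b >= P) {
            throw new IllegalArgumentException("a and b must lie in [0, P)");
        }
        this.a = a;
        this.b = b;
    }

    /** Draws a and b uniformly from [0, P), which selects one member of the family. */
    static PairwiseHashSketch random(Random rnd) {
        return new PairwiseHashSketch(rnd.nextInt((int) P), rnd.nextInt((int) P));
    }

    long hash(long x) {
        long r = Math.floorMod(x, P); // non-negative residue in [0, P)
        return (a * r + b) % P;       // a * r + b stays below 2^63, so no overflow
    }

    /** Position of the lowest set bit, the quantity a Flajolet-Martin sketch records. */
    static int trailingZeros(long h) {
        return h == 0 ? 31 : Long.numberOfTrailingZeros(h);
    }

    public static void main(String[] args) {
        PairwiseHashSketch h = random(new Random());
        // A monotonically increasing key sequence, as in the primary key case above.
        for (long key = 1; key <= 5; key++) {
            System.out.println(key + " -> " + h.hash(key));
        }
    }
}
```

A Flajolet-Martin style estimator would typically keep several such functions with independently drawn (a, b) and combine the per-function observations.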
[jira] [Resolved] (HIVE-4649) Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation
[ https://issues.apache.org/jira/browse/HIVE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan resolved HIVE-4649.

Resolution: Fixed
Fix Version/s: vectorization-branch

Committed to branch. Thanks, Jitendra!

Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation

Key: HIVE-4649
URL: https://issues.apache.org/jira/browse/HIVE-4649
Project: Hive
Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Fix For: vectorization-branch
Attachments: HIVE-4649.1.patch

The test fails due to a bug in ColumnCompareScalar.txt
[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673599#comment-13673599 ]

Shreepadma Venugopalan commented on HIVE-4435:

Thanks Ashutosh!

Column stats: Distinct value estimator should use hash functions that are pairwise independent

Key: HIVE-4435
URL: https://issues.apache.org/jira/browse/HIVE-4435
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Shreepadma Venugopalan
Attachments: chart_1(1).png, HIVE-4435.1.patch

The current implementation of the Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence.
[jira] [Commented] (HIVE-4535) hive build fails with hadoop 0.20
[ https://issues.apache.org/jira/browse/HIVE-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673610#comment-13673610 ]

Hudson commented on HIVE-4535:

Integrated in Hive-trunk-hadoop2 #223 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via Ashutosh Chauhan) (Revision 1488739)

Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488739
Files :
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java

hive build fails with hadoop 0.20

Key: HIVE-4535
URL: https://issues.apache.org/jira/browse/HIVE-4535
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Fix For: 0.12.0
Attachments: HIVE-4535.1.patch, HIVE-4535.2.patch

ant package -Dhadoop.mr.rev=20 leads to -
{code}
[javac] /Users/thejas/hive_thejas_git/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java:382: cannot find symbol
[javac] symbol : method join(java.lang.String,java.util.List<java.lang.String>)
[javac] location: class org.apache.hadoop.util.StringUtils
[javac] StringUtils.join(",", incompatibleCols)
{code}
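The compile error above says that the join(String, List<String>) overload is not found on org.apache.hadoop.util.StringUtils when building against Hadoop 0.20. One generic way to avoid depending on a method that only some library versions provide is a small local helper. The sketch below is an illustration of that idea only; it is not the content of HIVE-4535.2.patch, and the class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a version-independent join helper, not the actual Hive fix. */
final class JoinSketch {
    /** Concatenates the parts, placing the separator between consecutive elements. */
    static String join(String separator, Iterable<String> parts) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String part : parts) {
            if (!first) {
                sb.append(separator);
            }
            sb.append(part);
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical column names standing in for incompatibleCols.
        List<String> incompatibleCols = Arrays.asList("col_a", "col_b");
        System.out.println(join(",", incompatibleCols));
    }
}
```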
[jira] [Commented] (HIVE-4562) HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
[ https://issues.apache.org/jira/browse/HIVE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673609#comment-13673609 ]

Hudson commented on HIVE-4562:

Integrated in Hive-trunk-hadoop2 #223 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan) (Revision 1488744)

Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488744
Files :
* /hive/trunk/ql/build.xml

HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar

Key: HIVE-4562
URL: https://issues.apache.org/jira/browse/HIVE-4562
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Affects Versions: 0.10.0, 0.11.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Fix For: 0.12.0
Attachments: HIVE-4562-1.patch, HIVE-4562-2.patch

Some Hive jars are required not only by the client but also by the server (every Hadoop slave). Although we could use the 'add jar' command to add all the jars to the distributed cache, the common approach is to add these jars to $HADOOP_HOME/lib/ on every slave of the Hadoop cluster, which requires restarting all the TaskTrackers.

For example: when using Hive stats with MySQL as the temporary stats DB, every slave of the Hadoop cluster should contain mysql-connector-java-.jar in $HADOOP_HOME/lib/.

For column stats, $HADOOP_HOME/lib/ on all slaves should contain:
jackson-core-asl-1.8.8.jar
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.8.8.jar
jackson-xc-1.8.8.jar

These jars should be separated from the other common client-side jars.
[jira] [Commented] (HIVE-4510) HS2 doesn't nest exceptions properly (fun debug times)
[ https://issues.apache.org/jira/browse/HIVE-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673608#comment-13673608 ]

Hudson commented on HIVE-4510:

Integrated in Hive-trunk-hadoop2 #223 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas Nair via Ashutosh Chauhan) (Revision 1488740)

Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488740
Files :
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
* /hive/trunk/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
* /hive/trunk/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java

HS2 doesn't nest exceptions properly (fun debug times)

Key: HIVE-4510
URL: https://issues.apache.org/jira/browse/HIVE-4510
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Reporter: Gunther Hagleitner
Assignee: Thejas M Nair
Fix For: 0.12.0
Attachments: HIVE-4510.1.patch, HIVE-4510.2.patch

In SQLOperation.java lines 97 + 113 for instance, we catch errors and throw a new HiveSQLException, but we don't wrap the original exception.
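As a generic illustration of what "wrapping the original exception" means in Java, the sketch below passes the caught exception as the cause of the new one, so the original stack trace survives. SketchSQLException is a hypothetical stand-in defined here for the example; it is not Hive's HiveSQLException, and this is not the code from the HIVE-4510 patch.

```java
import java.sql.SQLException;

/** Illustrative only: exception chaining with a made-up exception type. */
final class NestedExceptionSketch {
    /** Hypothetical stand-in for a server-side SQL exception class. */
    static class SketchSQLException extends SQLException {
        SketchSQLException(String reason, Throwable cause) {
            super(reason, cause); // the cause is kept and printed as "Caused by: ..."
        }
    }

    static void runQuery() throws SketchSQLException {
        try {
            // Stand-in for whatever fails deep inside query execution.
            throw new IllegalStateException("low-level failure");
        } catch (RuntimeException e) {
            // Pass e along as the cause instead of dropping it.
            throw new SketchSQLException("Error running query: " + e, e);
        }
    }

    public static void main(String[] args) {
        try {
            runQuery();
        } catch (SQLException e) {
            e.printStackTrace(); // includes the nested IllegalStateException
        }
    }
}
```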
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673612#comment-13673612 ]

Ashutosh Chauhan commented on HIVE-4561:

[~shreepadma] Since you wrote this originally, would you like to review this as well?

Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

Key: HIVE-4561
URL: https://issues.apache.org/jira/browse/HIVE-4561
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: Zhuoluo (Clark) Yang
Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch

If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.

hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;

mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0. # Wrong Result ! Expected is 1.
DOUBLE_HIGH_VALUE: 3.
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
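The symptom reported above (a minimum stuck at 0.0 for all-positive data, or a maximum stuck at 0.0 for all-negative data) is what a running min/max produces when its accumulator starts at 0.0. The sketch below illustrates that failure mode and one conventional remedy. It is an assumption about the kind of bug involved, written for illustration; it is not the Hive column stats code or the attached patch.

```java
/** Illustrative only: how a zero-initialised accumulator distorts min/max. */
final class MinMaxSketch {
    /** Buggy variant: starting at 0.0 means positive inputs can never lower it. */
    static double lowBuggy(double[] values) {
        double low = 0.0;
        for (double v : values) {
            low = Math.min(low, v);
        }
        return low;
    }

    /** Starting at +infinity lets the first real value take over. */
    static double low(double[] values) {
        double low = Double.POSITIVE_INFINITY;
        for (double v : values) {
            low = Math.min(low, v);
        }
        return low;
    }

    /** Symmetric case for the maximum, starting at -infinity. */
    static double high(double[] values) {
        double high = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            high = Math.max(high, v);
        }
        return high;
    }

    public static void main(String[] args) {
        double[] prices = {1.0, 2.0, 3.0}; // the values loaded into src_test above
        System.out.println("buggy low = " + lowBuggy(prices));
        System.out.println("low       = " + low(prices));
        System.out.println("high      = " + high(prices));
    }
}
```

For an empty input this sketch returns the infinities unchanged; a real implementation would have to decide how to report "no values" separately.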
[jira] [Commented] (HIVE-4610) HCatalog checkstyle violation after HIVE-4578
[ https://issues.apache.org/jira/browse/HIVE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673605#comment-13673605 ]

Hudson commented on HIVE-4610:

Integrated in Hive-trunk-hadoop2 #223 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock Noland via Ashutosh Chauhan) (Revision 1488825)

Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488825
Files :
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/default.res
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/windows.res

HCatalog checkstyle violation after HIVE-4578

Key: HIVE-4610
URL: https://issues.apache.org/jira/browse/HIVE-4610
Project: Hive
Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
Fix For: 0.12.0
Attachments: HIVE-4610-0.patch

{noformat}
checkstyle:
[echo] hcatalog
[checkstyle] Running Checkstyle 5.5 on 413 files
[checkstyle] /home/brock/workspaces/hive-apache/hive/hcatalog/src/test/e2e/hcatalog/resource/default.res:1: Missing a header - not enough lines in file.
[checkstyle] /home/brock/workspaces/hive-apache/hive/hcatalog/src/test/e2e/hcatalog/resource/windows.res:1: Missing a header - not enough lines in file.
[for] hcatalog: The following error occurred while executing this line:
[for] /home/brock/workspaces/hive-apache/hive/build.xml:310: The following error occurred while executing this line:
[for] /home/brock/workspaces/hive-apache/hive/hcatalog/build.xml:109: The following error occurred while executing this line:
[for] /home/brock/workspaces/hive-apache/hive/hcatalog/build-support/ant/checkstyle.xml:32: Got 2 errors and 0 warnings.
{noformat}
[jira] [Commented] (HIVE-4636) Failing on TestSemanticAnalysis.testAddReplaceCols in trunk
[ https://issues.apache.org/jira/browse/HIVE-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673607#comment-13673607 ]

Hudson commented on HIVE-4636:

Integrated in Hive-trunk-hadoop2 #223 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk (Navis via Ashutosh Chauhan) (Revision 1488824)

Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488824
Files :
* /hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestSemanticAnalysis.java

Failing on TestSemanticAnalysis.testAddReplaceCols in trunk

Key: HIVE-4636
URL: https://issues.apache.org/jira/browse/HIVE-4636
Project: Hive
Issue Type: Test
Components: Tests
Affects Versions: 0.12.0
Reporter: Navis
Assignee: Navis
Priority: Trivial
Fix For: 0.12.0
Attachments: HIVE-4636.D11013.1.patch

Seems to be a regression from HIVE-4475.
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673616#comment-13673616 ]

Shreepadma Venugopalan commented on HIVE-4561:

[~ashutoshc]: Sure, I'll take a look at this today.

Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

Key: HIVE-4561
URL: https://issues.apache.org/jira/browse/HIVE-4561
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: Zhuoluo (Clark) Yang
Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch

If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.

hive (default)> create table src_test (price double);
hive (default)> load data local inpath './test.txt' into table src_test;
hive (default)> select * from src_test;
OK
1.0
2.0
3.0
Time taken: 0.313 seconds, Fetched: 3 row(s)
hive (default)> analyze table src_test compute statistics for columns price;

mysql> select * from TAB_COL_STATS \G;
CS_ID: 16
DB_NAME: default
TABLE_NAME: src_test
COLUMN_NAME: price
COLUMN_TYPE: double
TBL_ID: 2586
LONG_LOW_VALUE: 0
LONG_HIGH_VALUE: 0
DOUBLE_LOW_VALUE: 0. # Wrong Result ! Expected is 1.
DOUBLE_HIGH_VALUE: 3.
BIG_DECIMAL_LOW_VALUE: NULL
BIG_DECIMAL_HIGH_VALUE: NULL
NUM_NULLS: 0
NUM_DISTINCTS: 1
AVG_COL_LEN: 0.
MAX_COL_LEN: 0
NUM_TRUES: 0
NUM_FALSES: 0
LAST_ANALYZED: 1368596151
2 rows in set (0.00 sec)
[jira] [Commented] (HIVE-4547) A complex create view statement fails with new Antlr 3.4
[ https://issues.apache.org/jira/browse/HIVE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673619#comment-13673619 ]

Ashutosh Chauhan commented on HIVE-4547:

[~thiruvel] Would you like to review this, since you wrote this piece originally?

A complex create view statement fails with new Antlr 3.4

Key: HIVE-4547
URL: https://issues.apache.org/jira/browse/HIVE-4547
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Fix For: 0.12.0
Attachments: HIVE-4547-1.patch, HIVE-4547-repro.tar

A complex create view statement with CAST in the join condition fails with an IllegalArgumentException. This is exposed by the Antlr 3.4 upgrade (HIVE-2439). The same statement works fine with Hive 0.9.
[jira] [Resolved] (HIVE-3876) call resetValid instead of ensureCapacity in the constructor of BytesRefArrayWritable
[ https://issues.apache.org/jira/browse/HIVE-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai resolved HIVE-3876.

Resolution: Not A Problem

Sorry for not looking at it for a long time. I just took a look at the code. BytesRefArrayWritable is used by first calling ensureCapacity and then setting valid through resetValid or set. If we used resetValid in the constructor, callers could get elements that are not valid, which should not be allowed. Let's close it as Not A Problem.

call resetValid instead of ensureCapacity in the constructor of BytesRefArrayWritable

Key: HIVE-3876
URL: https://issues.apache.org/jira/browse/HIVE-3876
Project: Hive
Issue Type: Improvement
Components: Serializers/Deserializers
Affects Versions: 0.10.0
Reporter: Yin Huai
Assignee: Yin Huai
Priority: Minor
Attachments: HIVE-3876.1.patch.txt

In the constructor of BytesRefArrayWritable, ensureCapacity(capacity) is called, but valid has not been adjusted accordingly. After a new BytesRefArrayWritable has been created with an initial capacity of x, if resetValid() has not been called explicitly, the size returned is still 0.
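The distinction discussed in this issue, between allocated capacity and the number of valid elements, can be sketched with a toy container. The class below is a made-up illustration that borrows the method names from the discussion (ensureCapacity, resetValid, set, size); it is not the real BytesRefArrayWritable and makes no claim about its internals.

```java
import java.util.Arrays;

/** Illustrative only: a toy container separating capacity from valid size. */
final class CapacityVsValidSketch {
    private byte[][] slots = new byte[0][];
    private int valid = 0; // number of elements a caller may read

    CapacityVsValidSketch(int capacity) {
        ensureCapacity(capacity); // allocates slots but leaves valid at 0
    }

    /** Grows the backing array if needed; never changes the valid count. */
    void ensureCapacity(int newCapacity) {
        if (newCapacity > slots.length) {
            slots = Arrays.copyOf(slots, newCapacity);
        }
    }

    /** Declares the first newValid slots readable, growing storage if required. */
    void resetValid(int newValid) {
        ensureCapacity(newValid);
        valid = newValid;
    }

    /** Stores a value and extends the valid range to cover its index. */
    void set(int index, byte[] value) {
        ensureCapacity(index + 1);
        slots[index] = value;
        if (valid <= index) {
            valid = index + 1;
        }
    }

    int size() {
        return valid;
    }

    byte[] get(int index) {
        if (index < 0 || index >= valid) {
            throw new IndexOutOfBoundsException("index " + index + ", valid " + valid);
        }
        return slots[index];
    }

    public static void main(String[] args) {
        CapacityVsValidSketch arr = new CapacityVsValidSketch(10);
        System.out.println("size after construction: " + arr.size());
        arr.set(2, new byte[] {42});
        System.out.println("size after set(2, ...): " + arr.size());
    }
}
```

In this toy version, having the constructor mark all preallocated slots as valid would expose slots that were never written, which matches the concern raised in the resolution comment.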
[jira] [Commented] (HIVE-4245) Implement numeric dictionaries in ORC
[ https://issues.apache.org/jira/browse/HIVE-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673643#comment-13673643 ]

Owen O'Malley commented on HIVE-4245:

Pam, have you had a chance to work on this?

Implement numeric dictionaries in ORC

Key: HIVE-4245
URL: https://issues.apache.org/jira/browse/HIVE-4245
Project: Hive
Issue Type: New Feature
Components: File Formats
Reporter: Owen O'Malley
Assignee: Pamela Vagata

For many applications, especially in de-normalized data, there is a lot of redundancy in the numeric columns. Therefore, it would make sense to adaptively use dictionary encodings for numeric columns in addition to string columns.
[jira] [Commented] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation
[ https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673644#comment-13673644 ]

Ashutosh Chauhan commented on HIVE-4546:

+1

Hive CLI leaves behind the per session resource directory on non-interactive invocation

Key: HIVE-4546
URL: https://issues.apache.org/jira/browse/HIVE-4546
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch

As part of HIVE-4505, the resource directory is set to /tmp/${hive.session.id}_resources and is supposed to be removed at the end of the session. The CLI fails to remove it when invoked using -f or -e (non-interactive mode).
Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11335/

(Updated June 3, 2013, 10:18 p.m.)

Review request for hive and Ashutosh Chauhan.

Changes
-------
1. Added the data input file for the new test case, which was missing from the previous patch.
2. Please note that Review Board doesn't show the added data file name correctly because of the space in it. However, the patch applies to the code base without issue.

Description
-------
Patch includes fix and new test case.

This addresses bug HIVE-4554.
https://issues.apache.org/jira/browse/HIVE-4554

Diffs (updated)
-------
data/files/person PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java bd8d252
ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q PRE-CREATION
ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/11335/diff/

Testing
-------

Thanks,
Xuefu Zhang
[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-4554:

Attachment: HIVE-4554.patch.3

HIVE-4554.patch.3 is the same as HIVE-4554.patch.2 except that it includes the data input file for the new test case, which was missing. All test cases passed. RB request is here: https://reviews.apache.org/r/11335/

Failed to create a table from existing file if file path has spaces

Key: HIVE-4554
URL: https://issues.apache.org/jira/browse/HIVE-4554
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3

To reproduce the problem,
1. Create a table, say, person_age (name STRING, age INT).
2. Create a file whose name has a space in it, say, data set.txt.
3. Try to load the data in the file to the table.

The following error can be seen in the console:
hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
Loading data to table default.person_age
Failed with exception Wrong file format. Please check the file's format.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

Note: the error message is confusing.
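The report does not state the root cause, so the following is only background on one well-known way that spaces in paths cause trouble in Java code: the single-argument java.net.URI constructor rejects an unescaped space, while the multi-argument constructors percent-encode it. Whether this is what happens inside Hive's load path is an assumption; the class and method names below are invented for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustrative only: how java.net.URI treats a path that contains a space. */
final class PathWithSpaceSketch {
    /** True if the raw string is accepted by the single-argument URI constructor. */
    static boolean parsesAsUri(String raw) {
        try {
            new URI(raw);
            return true;
        } catch (URISyntaxException e) {
            return false; // an unescaped space is an illegal character here
        }
    }

    /** Builds a URI from a plain path; this constructor quotes illegal characters. */
    static URI quoted(String path) {
        try {
            return new URI(null, null, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/home/xzhang/temp/data set.txt"; // path from the report above
        System.out.println("parses unescaped: " + parsesAsUri(path));
        URI uri = quoted(path);
        System.out.println("raw path:     " + uri.getRawPath());
        System.out.println("decoded path: " + uri.getPath());
    }
}
```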
[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-4554:

Status: Patch Available (was: Open)

Failed to create a table from existing file if file path has spaces

Key: HIVE-4554
URL: https://issues.apache.org/jira/browse/HIVE-4554
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3

To reproduce the problem,
1. Create a table, say, person_age (name STRING, age INT).
2. Create a file whose name has a space in it, say, data set.txt.
3. Try to load the data in the file to the table.

The following error can be seen in the console:
hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
Loading data to table default.person_age
Failed with exception Wrong file format. Please check the file's format.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

Note: the error message is confusing.
Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #165
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/165/

[...truncated 42329 lines...]
[junit] Hadoop job information for null: number of mappers: 0; number of reducers: 0
[junit] 2013-06-03 15:40:56,227 null map = 100%, reduce = 100%
[junit] Ended Job = job_local_0001
[junit] Execution completed successfully
[junit] Mapred Local Task Succeeded . Convert the Join into MapJoin
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-06-03_15-40-53_015_7476949543543173382/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201306031540_2137645541.txt
[junit] Copying file: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt' into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] Table default.testhivedrivertable stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0]
[junit] POSTHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt' into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-06-03_15-40-57_529_6053755246777448667/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-06-03_15-40-57_529_6053755246777448667/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201306031540_79631070.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[jira] [Commented] (HIVE-4547) A complex create view statement fails with new Antlr 3.4
[ https://issues.apache.org/jira/browse/HIVE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673739#comment-13673739 ] Thiruvel Thirumoolan commented on HIVE-4547: Sure, will take a look. A complex create view statement fails with new Antlr 3.4 Key: HIVE-4547 URL: https://issues.apache.org/jira/browse/HIVE-4547 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-4547-1.patch, HIVE-4547-repro.tar A complex create view statement with CAST in join condition fails with IllegalArgumentException error. This is exposed by the Antlr 3.4 upgrade (HIVE-2439). The same statement works fine with Hive 0.9 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11335/#review21366 --- Patch looks good, apart from one comment. ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java https://reviews.apache.org/r/11335/#comment44301 Apart from this change, all other changes are contained within the if(isLocal) block. Because of this, it seems possible that it might be triggered for non-local paths as well. Can you test it with an hdfs:// path that has spaces? If it's easy, it would be good to add it to the test; otherwise a manual test is fine as well. - Ashutosh Chauhan On June 3, 2013, 10:18 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11335/ --- (Updated June 3, 2013, 10:18 p.m.) Review request for hive and Ashutosh Chauhan. Description --- Patch includes fix and new test case. This addresses bug HIVE-4554. https://issues.apache.org/jira/browse/HIVE-4554 Diffs - data/files/person PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java bd8d252 ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q PRE-CREATION ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out PRE-CREATION Diff: https://reviews.apache.org/r/11335/diff/ Testing --- Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-4554) Failed to create a table from existing file if file path has spaces
[ https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673759#comment-13673759 ] Ashutosh Chauhan commented on HIVE-4554: Comment on RB. Failed to create a table from existing file if file path has spaces --- Key: HIVE-4554 URL: https://issues.apache.org/jira/browse/HIVE-4554 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, HIVE-4554.patch.3 To reproduce the problem: 1. Create a table, say, person_age (name STRING, age INT). 2. Create a file whose name has a space in it, say, data set.txt. 3. Try to load the data in the file into the table. The following error can be seen in the console: hive LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age; Loading data to table default.person_age Failed with exception Wrong file format. Please check the file's format. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask Note: the error message is confusing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4566) NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established
[ https://issues.apache.org/jira/browse/HIVE-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673785#comment-13673785 ] Ashutosh Chauhan commented on HIVE-4566: As in the original description, I like the idea of printing "No current connection" in such scenarios, but I don't think the current patch prints it. Can you modify your test to make sure that it indeed gets printed? NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established - Key: HIVE-4566 URL: https://issues.apache.org/jira/browse/HIVE-4566 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-4566.patch Before a DB connection is established, executing a command such as typeinfo or nativesql results in an NPE shown at the console: beeline !typeinfo java.lang.NullPointerException beeline !nativesql java.lang.NullPointerException Instead, a message such as "No current connection" should be given, as is the case for some other commands, such as dropall. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
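The behavior requested in this thread (a message instead of an NPE when no connection exists) can be sketched as a plain null guard. This is a minimal illustration under stated assumptions: ConnectionGuardDemo and its nested Connection interface are hypothetical stand-ins, not Beeline or JDBC classes, and the actual HIVE-4566 patch may be structured differently.

```java
// Hypothetical demo class; not part of Beeline.
final class ConnectionGuardDemo {

    /** Hypothetical stand-in for a database connection; not the JDBC or Beeline API. */
    interface Connection {
        String nativeSQL(String sql);
    }

    static final String NO_CONNECTION = "No current connection";

    /** Returns the translated SQL, or a readable message when there is no connection yet. */
    static String nativeSql(Connection conn, String sql) {
        if (conn == null) {
            // Guard first, so the caller sees a message rather than a NullPointerException.
            return NO_CONNECTION;
        }
        return conn.nativeSQL(sql);
    }
}
```

A test along the lines the reviewer asks for would call the command with no connection and assert on the returned (or printed) text, not merely on the absence of an exception.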
[jira] [Updated] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode
[ https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4620: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk, thanks Prasad! MR temp directory conflicts in case of parallel execution mode -- Key: HIVE-4620 URL: https://issues.apache.org/jira/browse/HIVE-4620 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch In parallel query execution mode, all the parallel running tasks end up sharing the same temp/scratch directory. This could lead to file conflicts and to temp files getting deleted before job completion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
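The review request for this issue describes the fix as adding a per-thread task counter to the MR scratch directory path, with the id assigned in a thread local's overridden initialValue(). A minimal sketch of that idea follows. ScratchDirDemo, the "runner-" path suffix, and the helper methods are hypothetical and are not the code in Context.java or TaskRunner.java.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class; not part of Hive.
final class ScratchDirDemo {

    // Shared counter that hands out one id per thread.
    private static final AtomicLong NEXT_ID = new AtomicLong(0);

    // Each thread gets its id lazily, the first time it calls get().
    private static final ThreadLocal<Long> RUNNER_ID = new ThreadLocal<Long>() {
        @Override
        protected Long initialValue() {
            return NEXT_ID.incrementAndGet();
        }
    };

    /** Scratch directory for the calling thread; the suffix format is an assumption. */
    static String mrScratchDir(String base) {
        return base + "/runner-" + RUNNER_ID.get();
    }

    /** Computes the scratch directory on a fresh thread, to show that it differs. */
    static String scratchDirOnNewThread(String base) {
        final String[] out = new String[1];
        Thread t = new Thread(() -> out[0] = mrScratchDir(base));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out[0];
    }
}
```

Repeated calls on one thread return the same directory, while a task running on another thread gets a different one, so parallel tasks no longer write to (or clean up) a shared path.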
[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode
[ https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673853#comment-13673853 ] Prasad Mujumdar commented on HIVE-4620: --- Thanks Navis! MR temp directory conflicts in case of parallel execution mode -- Key: HIVE-4620 URL: https://issues.apache.org/jira/browse/HIVE-4620 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.11.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.12.0 Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch In parallel query execution mode, all the parallel running tasks end up sharing the same temp/scratch directory. This could lead to file conflicts and to temp files getting deleted before job completion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4502) NPE - subquery smb joins fails
[ https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673856#comment-13673856 ] Navis commented on HIVE-4502: - I've not converted any SMBJoins to RS-joins and just changed creation order of those. The difference is that my patch adds a root task only when all of the join aliases are handled, which is contrary to trunk (add root whenever possible and remove if it's not afterwards). The patch I've attached seemed easier but it is just my call. NPE - subquery smb joins fails -- Key: HIVE-4502 URL: https://issues.apache.org/jira/browse/HIVE-4502 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, smb_mapjoin_25.q Found this issue while running some SMB joins. Attaching test case that causes this error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11172/#review21391 --- Ship it! Ship It! - Shreepadma Venugopalan On June 3, 2013, 4:46 a.m., Zhuoluo Yang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11172/ --- (Updated June 3, 2013, 4:46 a.m.) Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, and fangkun cao. Description --- An initialization error. Make double and long initialize correctly. Would you review that and assign the issue to me? This addresses bug HIVE-4561. https://issues.apache.org/jira/browse/HIVE-4561 Diffs - http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java 1488823 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out 1488823 Diff: https://reviews.apache.org/r/11172/diff/ Testing --- ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q done. Thanks, Zhuoluo Yang
[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)
[ https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673887#comment-13673887 ] Shreepadma Venugopalan commented on HIVE-4561: -- LGTM! +1 (non-binding). Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0) Key: HIVE-4561 URL: https://issues.apache.org/jira/browse/HIVE-4561 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.12.0 Reporter: caofangkun Assignee: Zhuoluo (Clark) Yang Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0. hive (default) create table src_test (price double); hive (default) load data local inpath './test.txt' into table src_test; hive (default) select * from src_test; OK 1.0 2.0 3.0 Time taken: 0.313 seconds, Fetched: 3 row(s) hive (default) analyze table src_test compute statistics for columns price; mysql select * from TAB_COL_STATS \G; CS_ID: 16 DB_NAME: default TABLE_NAME: src_test COLUMN_NAME: price COLUMN_TYPE: double TBL_ID: 2586 LONG_LOW_VALUE: 0 LONG_HIGH_VALUE: 0 DOUBLE_LOW_VALUE: 0. # Wrong Result ! Expected is 1. DOUBLE_HIGH_VALUE: 3. BIG_DECIMAL_LOW_VALUE: NULL BIG_DECIMAL_HIGH_VALUE: NULL NUM_NULLS: 0 NUM_DISTINCTS: 1 AVG_COL_LEN: 0. MAX_COL_LEN: 0 NUM_TRUES: 0 NUM_FALSES: 0 LAST_ANALYZED: 1368596151 2 rows in set (0.00 sec) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
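The symptom above is what a running minimum and maximum look like when both start at 0: a minimum seeded with 0 can never rise to 1.0, and a maximum seeded with 0 can never drop below it. The review request calls the cause "an initialization error", so the sketch below reproduces that class of bug and one conventional fix. MinMaxInitDemo is a hypothetical class, and the actual change in GenericUDAFComputeStats may differ.

```java
// Hypothetical demo class; not part of Hive.
final class MinMaxInitDemo {

    /** Buggy variant: both accumulators start at 0.0. Returns {min, max}. */
    static double[] buggyMinMax(double[] values) {
        double min = 0.0;
        double max = 0.0;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }

    /** Fixed variant: start from the identity elements of min and max. Returns {min, max}. */
    static double[] fixedMinMax(double[] values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }
}
```

For the three rows in the report (1.0, 2.0, 3.0), the buggy variant yields a low value of 0.0 and a high value of 3.0, matching the TAB_COL_STATS row, while the fixed variant yields 1.0 and 3.0. With empty input the fixed variant returns the two infinities, so a caller would need to handle that case separately.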
Re: ask for cube
Hi Guowp, The following wiki page may help you. Hive has supported cube since Hive 0.10.0: https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup Cheers, Zhuoluo (Clark) Yang 2013/5/30 guowp gu...@asiainfo-linkage.com Sorry, “cobe” should be “cube”; the mistake was due to my carelessness. From: guowp [mailto:gu...@asiainfo-linkage.com] Sent: May 30, 2013 11:43 To: 'dev@hive.apache.org' Subject: ask for cobe Hi sir, I am a developer at AsiainfoLinkage. Oracle accepts the SQL “select … … group by cube (colum1,colum2),colum3”, but in Hive we can only use the HiveQL “select … … group by colum1,colum2,colum3 with cube”. I want to know: 1. Will Hive support cube syntax like “group by cube (colum1, colum2),colum3”? 2. If Hive will support it, what is the plan, and in which version will it work? Thanks, with best wishes. Guowp, Asiainfo Linkage
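The two statements in the question are not equivalent. In standard SQL, "group by cube (colum1, colum2), colum3" is a partial cube: every subset of the cubed columns is combined with colum3, giving 4 grouping sets, whereas "group by colum1, colum2, colum3 with cube" cubes all three columns and gives 8. The sketch below enumerates those grouping sets; CubeGroupingSetsDemo is a hypothetical helper, not Hive code. Whether a given Hive release can express the partial cube through an explicit GROUPING SETS clause is something to confirm against the wiki page linked above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Hive.
final class CubeGroupingSetsDemo {

    /**
     * Enumerates the grouping sets of "cube(cubeCols), fixedCols":
     * every subset of cubeCols (kept in column order) followed by all fixedCols.
     */
    static List<List<String>> partialCube(List<String> cubeCols, List<String> fixedCols) {
        List<List<String>> sets = new ArrayList<>();
        int n = cubeCols.size();
        // Walk the bit masks from "all cubed columns" down to "none".
        for (int mask = (1 << n) - 1; mask >= 0; mask--) {
            List<String> set = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    set.add(cubeCols.get(i));
                }
            }
            set.addAll(fixedCols);
            sets.add(set);
        }
        return sets;
    }
}
```

Calling partialCube with cubed columns (colum1, colum2) and fixed column colum3 returns (colum1, colum2, colum3), (colum2, colum3), (colum1, colum3), and (colum3). Passing all three columns as cubed columns and no fixed columns returns the 8 sets of the full cube.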
[jira] [Updated] (HIVE-4032) Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException
[ https://issues.apache.org/jira/browse/HIVE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] caofangkun updated HIVE-4032: - Assignee: caofangkun Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException -- Key: HIVE-4032 URL: https://issues.apache.org/jira/browse/HIVE-4032 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Environment: Apache Hadoop 0.19.1 + Apache Hive 0.10.0 Reporter: caofangkun Assignee: caofangkun Priority: Minor Attachments: HIVE-4032-1.patch Inserting data into a Hive table from a query, when the query is "select * from a_partitioned_table", throws a SemanticException. It seems that * includes the virtual partition columns. drop table if exists zr_test; create table if not exists zr_test (key string, value string) partitioned by (dt string); drop table if exists zr_test_1; create table if not exists zr_test_1 (key string, value string) partitioned by (dt string); --Query One explain insert into table zr_test partition (dt='20130217') select key, value from zr_test_1 where dt='20130217'; --Query Two explain insert into table zr_test partition (dt='20130217') select * from zr_test_1 where dt='20130217'; Query One works well, but Query Two fails with the following information: FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target table because column number/types are different ''20130217'': Table insclause-0 has 2 columns, but query has 3 columns. P.S.: Query Two works well on Apache Hadoop 0.20.1 + Hive 0.10.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4502) NPE - subquery smb joins fails
[ https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-4502: - Assignee: Navis (was: Vikram Dixit K) NPE - subquery smb joins fails -- Key: HIVE-4502 URL: https://issues.apache.org/jira/browse/HIVE-4502 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Navis Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, smb_mapjoin_25.q Found this issue while running some SMB joins. Attaching test case that causes this error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4502) NPE - subquery smb joins fails
[ https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-4502: - Attachment: smb_mapjoin_25.q NPE - subquery smb joins fails -- Key: HIVE-4502 URL: https://issues.apache.org/jira/browse/HIVE-4502 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Navis Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, smb_mapjoin_25.q, smb_mapjoin_25.q Found this issue while running some SMB joins. Attaching test case that causes this error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4502) NPE - subquery smb joins fails
[ https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673946#comment-13673946 ] Vikram Dixit K commented on HIVE-4502: -- [~navis] I misread the results of the test case from your patch. I was going through your patch more meticulously and found that a few of the tests have different results. Particularly those in auto_sortmerge_join_6.q. The count results seem to have changed. HIVE-3891 converts SMB joins to map-joins when possible. Although that seems orthogonal to this change, any idea as to why the join is still SMB? Also attached a few more tests for this. The plans seem valid after applying your patch. I will continue to review the patch. NPE - subquery smb joins fails -- Key: HIVE-4502 URL: https://issues.apache.org/jira/browse/HIVE-4502 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Reporter: Vikram Dixit K Assignee: Navis Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, smb_mapjoin_25.q, smb_mapjoin_25.q Found this issue while running some SMB joins. Attaching test case that causes this error. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4650) Getting Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after upgrade to Hive-0.11.0.x from hive-0.10.0.x
Bruce Nelson created HIVE-4650: -- Summary: Getting Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after upgrade to Hive-0.11.0.x from hive-0.10.0.x Key: HIVE-4650 URL: https://issues.apache.org/jira/browse/HIVE-4650 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Environment: HortonWorks 1.3 distro on x86_64 Centos 6 Reporter: Bruce Nelson working from a simple table in Hive hive desc cmnt ; OK x1 int None x2 int None x3 int None x4 int None y double None hive select * from cmnt; OK 7 26 6 60 78.5 1 29 15 52 74.3 11 56 8 20 104.3 11 31 8 47 87.6 7 52 6 33 95.9 11 55 9 22 109.2 3 71 17 6 102.7 1 31 22 44 72.5 2 54 18 22 93.1 21 47 4 26 115.9 1 40 23 34 83.8 11 66 9 12 113.3 10 68 8 12 109.4 A query that joins and transforms against this table : select * from (select VAL001 x1,VAL002 x2,VAL003 x3,VAL004 x4,VAL005 y from ( select /*+ mapjoin(v2) */ (VAL001- mu1) * 1/(sd1) VAL001,(VAL002- mu2) * 1/(sd2) VAL002,(VAL003- mu3) * 1/(sd3) VAL003,(VAL004- mu4) * 1/(sd4) VAL004,(VAL005- mu5) * 1/(sd5) VAL005 from ( select * from ( select x1 VAL001,x2 VAL002,x3 VAL003,x4 VAL004,y VAL005 from cmnt ) obj1_3 ) v3 join (select count(*) c, avg(VAL001) mu1,avg(VAL002) mu2,avg(VAL003) mu3,avg(VAL004) mu4,avg(VAL005) mu5, stddev_pop(VAL001) sd1,stddev_pop(VAL002) sd2,stddev_pop(VAL003) sd3,stddev_pop(VAL004) sd4,stddev_pop(VAL005) sd5 from ( select * from ( select x1 VAL001,x2 VAL002,x3 VAL003,x4 VAL004,y VAL005 from cmnt ) obj1_3 ) v1) v2 ) obj1_7) obj1_6 ; Generates during Stage-3 : setting HADOOP_USER_NAMEtest Execution log at: /tmp/test/.log 2013-06-03 12:40:55 Starting to launch local task to process map join; maximum memory = 1065484288 2013-06-03 12:40:56 Processing rows:1 Hashtable size: 1 Memory usage: 7175528 rate: 0.007 2013-06-03 12:40:56 Dump the hashtable into file: 
file:/tmp/test/hive_2013-06-03_00-40-21_708_6820064283161196136/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable 2013-06-03 12:40:56 Upload 1 File to: file:/tmp/test/hive_2013-06-03_00-40-21_708_6820064283161196136/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable File size: 334 2013-06-03 12:40:56 End of local task; Time Taken: 0.726 sec. Execution completed successfully Mapred Local Task Succeeded . Convert the Join into MapJoin Mapred Local Task Succeeded . Convert the Join into MapJoin Launching Job 2 out of 2 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201306022123_0045, Tracking URL = http://sun1vm3:50030/jobdetails.jsp?jobid=job_201306022123_0045 Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job -kill job_201306022123_0045 Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0 2013-06-03 00:41:05,895 Stage-3 map = 0%, reduce = 0% 2013-06-03 00:41:40,687 Stage-3 map = 100%, reduce = 100% Ended Job = job_201306022123_0045 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://sun1vm3:50030/jobdetails.jsp?jobid=job_201306022123_0045 Examining task ID: task_201306022123_0045_m_02 (and more) from job job_201306022123_0045 Task with the most failures(4): - Task ID: task_201306022123_0045_m_00 URL: http://sun1vm3:50030/taskdetails.jsp?jobid=job_201306022123_0045tipid=task_201306022123_0045_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by:
[jira] [Commented] (HIVE-4650) Getting Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after upgrade to Hive-0.11.0.x from hive-0.10.0.x
[ https://issues.apache.org/jira/browse/HIVE-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673947#comment-13673947 ] Bruce Nelson commented on HIVE-4650: If hive.auto.convert.join = false is set, then all the query stages work OK. The same scenario worked OK in Hive-0.10.0.x and Hive-0.9.x with MapJoin working. Getting Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after upgrade to Hive-0.11.0.x from hive-0.10.0.x -- Key: HIVE-4650 URL: https://issues.apache.org/jira/browse/HIVE-4650 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0 Environment: HortonWorks 1.3 distro on x86_64 Centos 6 Reporter: Bruce Nelson working from a simple table in Hive hive desc cmnt ; OK x1 int None x2 int None x3 int None x4 int None y double None hive select * from cmnt; OK 7 26 6 60 78.5 1 29 15 52 74.3 11 56 8 20 104.3 11 31 8 47 87.6 7 52 6 33 95.9 11 55 9 22 109.2 3 71 17 6 102.7 1 31 22 44 72.5 2 54 18 22 93.1 21 47 4 26 115.9 1 40 23 34 83.8 11 66 9 12 113.3 10 68 8 12 109.4 A query that joins and transforms against this table : select * from (select VAL001 x1,VAL002 x2,VAL003 x3,VAL004 x4,VAL005 y from ( select /*+ mapjoin(v2) */ (VAL001- mu1) * 1/(sd1) VAL001,(VAL002- mu2) * 1/(sd2) VAL002,(VAL003- mu3) * 1/(sd3) VAL003,(VAL004- mu4) * 1/(sd4) VAL004,(VAL005- mu5) * 1/(sd5) VAL005 from ( select * from ( select x1 VAL001,x2 VAL002,x3 VAL003,x4 VAL004,y VAL005 from cmnt ) obj1_3 ) v3 join (select count(*) c, avg(VAL001) mu1,avg(VAL002) mu2,avg(VAL003) mu3,avg(VAL004) mu4,avg(VAL005) mu5, stddev_pop(VAL001) sd1,stddev_pop(VAL002) sd2,stddev_pop(VAL003) sd3,stddev_pop(VAL004) sd4,stddev_pop(VAL005) sd5 from ( select * from ( select x1 VAL001,x2 VAL002,x3 VAL003,x4 VAL004,y VAL005 from cmnt ) obj1_3 ) v1) v2 ) obj1_7) obj1_6 ; Generates during Stage-3 : setting HADOOP_USER_NAMEtest Execution log at: /tmp/test/.log 2013-06-03 12:40:55 Starting to 
launch local task to process map join; maximum memory = 1065484288 2013-06-03 12:40:56 Processing rows:1 Hashtable size: 1 Memory usage: 7175528 rate: 0.007 2013-06-03 12:40:56 Dump the hashtable into file: file:/tmp/test/hive_2013-06-03_00-40-21_708_6820064283161196136/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable 2013-06-03 12:40:56 Upload 1 File to: file:/tmp/test/hive_2013-06-03_00-40-21_708_6820064283161196136/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable File size: 334 2013-06-03 12:40:56 End of local task; Time Taken: 0.726 sec. Execution completed successfully Mapred Local Task Succeeded . Convert the Join into MapJoin Mapred Local Task Succeeded . Convert the Join into MapJoin Launching Job 2 out of 2 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201306022123_0045, Tracking URL = http://sun1vm3:50030/jobdetails.jsp?jobid=job_201306022123_0045 Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job -kill job_201306022123_0045 Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0 2013-06-03 00:41:05,895 Stage-3 map = 0%, reduce = 0% 2013-06-03 00:41:40,687 Stage-3 map = 100%, reduce = 100% Ended Job = job_201306022123_0045 with errors Error during job, obtaining debugging information... Job Tracking URL: http://sun1vm3:50030/jobdetails.jsp?jobid=job_201306022123_0045 Examining task ID: task_201306022123_0045_m_02 (and more) from job job_201306022123_0045 Task with the most failures(4): - Task ID: task_201306022123_0045_m_00 URL: http://sun1vm3:50030/taskdetails.jsp?jobid=job_201306022123_0045tipid=task_201306022123_0045_m_00 - Diagnostic Messages for this Task: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162) at
[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673986#comment-13673986 ] Ashutosh Chauhan commented on HIVE-4435: Following tests failed: * compute_stats_double.q * compute_stats_long.q * compute_stats_string.q I am assuming since we have better estimates now, we just need to update .q.out files for these. Can you verify and if so can you update the patch with it? Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
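A textbook pairwise independent family, for readers unfamiliar with the term, is h(x) = (a*x + b) mod p, with p prime and a, b drawn uniformly from [0, p). The sketch below shows that construction only. PairwiseHashDemo is a hypothetical class, the choice p = 2^31 - 1 is an assumption made to keep the arithmetic inside a long, and the hash functions in HIVE-4435.1.patch may be built differently.

```java
import java.util.Random;

// Hypothetical demo class; not part of Hive.
final class PairwiseHashDemo {

    /** 2^31 - 1, a Mersenne prime; chosen so that a * x + b cannot overflow a long. */
    static final long P = 2147483647L;

    private final long a;
    private final long b;

    PairwiseHashDemo(long a, long b) {
        if (a < 0 || a >= P || b < 0 || b >= P) {
            throw new IllegalArgumentException("a and b must lie in [0, P)");
        }
        this.a = a;
        this.b = b;
    }

    /** Draws a and b uniformly from [0, P). */
    static PairwiseHashDemo random(Random rnd) {
        // nextInt(Integer.MAX_VALUE) returns a value in [0, 2^31 - 1), which is [0, P).
        return new PairwiseHashDemo(rnd.nextInt(Integer.MAX_VALUE), rnd.nextInt(Integer.MAX_VALUE));
    }

    /** h(x) = (a * (x mod P) + b) mod P, with the result in [0, P). */
    long hash(long x) {
        long reduced = Math.floorMod(x, P); // also maps negative inputs into [0, P)
        return (a * reduced + b) % P;
    }
}
```

The point for a Flajolet-Martin style estimator is that a monotonically increasing key sequence, such as a primary key column, is scattered across [0, P) by a randomly drawn (a, b) instead of keeping its regular structure.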
[jira] [Updated] (HIVE-4418) TestNegativeCliDriver failure message if cmd succeeds is misleading
[ https://issues.apache.org/jira/browse/HIVE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4418: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Thejas! TestNegativeCliDriver failure message if cmd succeeds is misleading --- Key: HIVE-4418 URL: https://issues.apache.org/jira/browse/HIVE-4418 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.10.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.12.0 Attachments: HIVE-4418.1.patch If the .q test ends up succeeding (exit code == 0), then the test failure message is misleading. From the error it seems as if the command actually failed - {code} [junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 0 [junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. [junit] at junit.framework.Assert.fail(Assert.java:47) [junit] at org.apache.hadoop.hive.cli.TestNegativeCliDriver.runTest(TestNegativeCliDriver.java:121) [junit] at org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_desc_tab(TestNegativeCliDriver.java:102) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4585) Remove unused MR Temp file localization from Tasks
[ https://issues.apache.org/jira/browse/HIVE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4585:

Resolution: Fixed
Fix Version/s: 0.12.0
Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Gunther!

Remove unused MR Temp file localization from Tasks

Key: HIVE-4585
URL: https://issues.apache.org/jira/browse/HIVE-4585
Project: Hive
Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Fix For: 0.12.0
Attachments: HIVE-4585.1.patch

HIVE-1408 introduced code that is currently commented out (i.e., dead code), with a comment saying it needs further development (HIVE-1484). It's been like this for close to 3 years. I suggest removing the code until someone picks up that work. At that time they can decide whether to use this code or pursue another route (FS shim?).
[jira] [Commented] (HIVE-4646) skewjoin.q is failing in hadoop2
[ https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673998#comment-13673998 ]

Phabricator commented on HIVE-4646:

ashutoshc has accepted the revision "HIVE-4646 [jira] skewjoin.q is failing in hadoop2".

Stuffing this in a shim is probably cleaner, but feels like overkill. A utility method is fine too. +1

REVISION DETAIL
https://reviews.facebook.net/D11043

BRANCH
HIVE-4646

ARCANIST PROJECT
hive

To: JIRA, ashutoshc, navis

skewjoin.q is failing in hadoop2

Key: HIVE-4646
URL: https://issues.apache.org/jira/browse/HIVE-4646
Project: Hive
Issue Type: Test
Components: Query Processor
Reporter: Navis
Assignee: Navis
Attachments: HIVE-4646.D11043.1.patch

https://issues.apache.org/jira/browse/HDFS-538 changed the behavior to throw an exception instead of returning null for a non-existent path, but the skew join resolver depends on the old behavior.
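The "utility method" approach mentioned in the review comment above can be sketched as a small wrapper that restores the old "return null for a missing path" contract. This is a Python analogy using the local file system, not the Java code in HIVE-4646.D11043.1.patch; the function name is hypothetical.

```python
import os


def list_status_or_none(path):
    """List a directory, returning None (the legacy contract) if it does not exist.

    Callers written against the old behavior can keep their 'is None' checks
    while the underlying call is free to raise for missing paths.
    """
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # Hypothetical path, assumed not to exist on the machine running this.
    print(list_status_or_none("/no/such/skewjoin/dir"))
```

Keeping the compatibility logic in one helper means only that helper has to know which behavior the underlying file system implements.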
[jira] [Commented] (HIVE-4377) Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
[ https://issues.apache.org/jira/browse/HIVE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674000#comment-13674000 ]

Phabricator commented on HIVE-4377:

ashutoshc has accepted the revision "HIVE-4377 [jira] Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)".

+1

REVISION DETAIL
https://reviews.facebook.net/D10377

BRANCH
HIVE-4377

ARCANIST PROJECT
hive

To: JIRA, ashutoshc, navis
Cc: njain

Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)

Key: HIVE-4377
URL: https://issues.apache.org/jira/browse/HIVE-4377
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Gang Tim Liu
Assignee: Navis
Attachments: HIVE-4377.D10377.1.patch, HIVE-4377.D10377.2.patch, HIVE-4377.D10377.3.patch

Thanks a lot for addressing the optimization in HIVE-2340. Awesome!

Since we are developing at a very fast pace, it would be really useful to think about maintainability and testing of the large codebase. Highlights which are applicable for D1209:

1. Javadoc for all public/private functions, except for setters/getters. For any complex function, clear examples (input/output) would really help.
2. Especially for query optimizations, it might be a good idea to have a simple working query at the top, along with the expected changes. For example, the operator tree for that query at each step, or a detailed explanation at the top.
3. If possible, the test name (.q file) where the function is being invoked, or the query which would potentially test that scenario, if it is a query processor change.
4. Comments in each test (.q file) that include the JIRA number, what it is trying to test, and the assumptions about each query.
5. Reduce the output for each test. Whenever a query outputs more than 10 results, it should have a reason. Otherwise, each query result should be bounded by 10 rows.

Thanks a lot.
[jira] [Commented] (HIVE-4615) Invalid column names allowed when created dynamically by a SerDe
[ https://issues.apache.org/jira/browse/HIVE-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674016#comment-13674016 ]

Hudson commented on HIVE-4615:

Integrated in Hive-trunk-h0.21 #2126 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4615 : Invalid column names allowed when created dynamically by a SerDe (Gabriel Reid via Ashutosh Chauhan) (Revision 1489013)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489013
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
* /hive/trunk/ql/src/test/queries/clientnegative/invalid_columns.q
* /hive/trunk/ql/src/test/results/clientnegative/invalid_columns.q.out

Invalid column names allowed when created dynamically by a SerDe

Key: HIVE-4615
URL: https://issues.apache.org/jira/browse/HIVE-4615
Project: Hive
Issue Type: Bug
Reporter: Gabriel Reid
Assignee: Gabriel Reid
Fix For: 0.12.0
Attachments: HIVE-4615.1.patch.txt

When a SerDe creates columns dynamically during table creation, there is no checking done on the validity of the created column names. This means that it's possible to create a table that contains columns that can't be queried, and will lead to issues when trying to query the created table.

The same column name validation should be performed for dynamically-created columns as for other column names.

This behavior can be easily tested using the TestSerDe, and including a column name that includes an invalid identifier character (e.g. a period) in the list of columns to create.
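The kind of check described in the issue above can be sketched as follows. This is an illustrative Python sketch, not the validation in Table.java; the rule that a valid name consists only of ASCII letters, digits, and underscores is an assumption made for the example, and the function name is hypothetical.

```python
import re

# Assumed rule for this sketch: letters, digits, and underscores only.
_VALID_COLUMN_NAME = re.compile(r"[A-Za-z0-9_]+")


def invalid_column_names(names):
    """Return the names that fail the assumed identifier rule, in input order."""
    return [n for n in names if not _VALID_COLUMN_NAME.fullmatch(n)]


if __name__ == "__main__":
    # A column list such as a SerDe might generate; 'user.name' contains a period.
    print(invalid_column_names(["id", "user.name", "ts_1"]))
```

Running the same check on SerDe-generated columns as on user-declared ones would reject the table at creation time rather than at query time.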
[jira] [Commented] (HIVE-4636) Failing on TestSemanticAnalysis.testAddReplaceCols in trunk
[ https://issues.apache.org/jira/browse/HIVE-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674010#comment-13674010 ]

Hudson commented on HIVE-4636:

Integrated in Hive-trunk-h0.21 #2126 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk (Navis via Ashutosh Chauhan) (Revision 1488824)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488824
Files :
* /hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestSemanticAnalysis.java

Failing on TestSemanticAnalysis.testAddReplaceCols in trunk

Key: HIVE-4636
URL: https://issues.apache.org/jira/browse/HIVE-4636
Project: Hive
Issue Type: Test
Components: Tests
Affects Versions: 0.12.0
Reporter: Navis
Assignee: Navis
Priority: Trivial
Fix For: 0.12.0
Attachments: HIVE-4636.D11013.1.patch

Seems to be a regression from HIVE-4475.
[jira] [Commented] (HIVE-4610) HCatalog checkstyle violation after HIVE-4578
[ https://issues.apache.org/jira/browse/HIVE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674008#comment-13674008 ]

Hudson commented on HIVE-4610:

Integrated in Hive-trunk-h0.21 #2126 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock Noland via Ashutosh Chauhan) (Revision 1488825)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488825
Files :
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/default.res
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/windows.res

HCatalog checkstyle violation after HIVE-4578

Key: HIVE-4610
URL: https://issues.apache.org/jira/browse/HIVE-4610
Project: Hive
Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
Fix For: 0.12.0
Attachments: HIVE-4610-0.patch

{noformat}
checkstyle:
[echo] hcatalog
[checkstyle] Running Checkstyle 5.5 on 413 files
[checkstyle] /home/brock/workspaces/hive-apache/hive/hcatalog/src/test/e2e/hcatalog/resource/default.res:1: Missing a header - not enough lines in file.
[checkstyle] /home/brock/workspaces/hive-apache/hive/hcatalog/src/test/e2e/hcatalog/resource/windows.res:1: Missing a header - not enough lines in file.
[for] hcatalog: The following error occurred while executing this line:
[for] /home/brock/workspaces/hive-apache/hive/build.xml:310: The following error occurred while executing this line:
[for] /home/brock/workspaces/hive-apache/hive/hcatalog/build.xml:109: The following error occurred while executing this line:
[for] /home/brock/workspaces/hive-apache/hive/hcatalog/build-support/ant/checkstyle.xml:32: Got 2 errors and 0 warnings.
{noformat}
[jira] [Commented] (HIVE-3846) alter view rename NPEs with authorization on.
[ https://issues.apache.org/jira/browse/HIVE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674017#comment-13674017 ]

Hudson commented on HIVE-3846:

Integrated in Hive-trunk-h0.21 #2126 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-3846 : alter view rename NPEs with authorization on. (Teddy Choi via Ashutosh Chauhan) (Revision 1489009)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489009
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java
* /hive/trunk/ql/src/test/queries/clientpositive/authorization_8.q
* /hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter_view_rename.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_8.q.out

alter view rename NPEs with authorization on.

Key: HIVE-3846
URL: https://issues.apache.org/jira/browse/HIVE-3846
Project: Hive
Issue Type: Bug
Components: Authorization
Affects Versions: 0.10.0, 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Teddy Choi
Fix For: 0.12.0
Attachments: HIVE-3846.1.patch.txt, HIVE-3846.2.patch.txt
[jira] [Commented] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[ https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674011#comment-13674011 ]

Hudson commented on HIVE-4403:

Integrated in Hive-trunk-h0.21 #2126 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4403 : Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters (Chu Tong via Ashutosh Chauhan) (Revision 1489008)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489008
Files :
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java

Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters

Key: HIVE-4403
URL: https://issues.apache.org/jira/browse/HIVE-4403
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0
Reporter: Mark Grover
Assignee: Chu Tong
Fix For: 0.12.0
Attachments: HIVE-4403.patch, HIVE-4403.patch

While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings related to overriding final parameters in job.conf. This was on a pseudo-distributed cluster. FWIW, I didn't see this happen on a fully-distributed cluster. Perhaps Hive's job.conf is overriding some final parameters it shouldn't.

Here is what the warnings looked like:

{code}
2013-04-19 14:20:32,304 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2013-04-19 14:20:32,367 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
{code}

To reproduce, create a table like:

{code}
CREATE TABLE u_data (
  userid INT,
  movieid INT,
  rating INT,
  unixtime STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
{code}

Load some data into u_data; here is some sample data:
https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data

Run a simple query on that data (on YARN/MR2):

{code}
INSERT OVERWRITE DIRECTORY '/tmp/count'
SELECT COUNT(1) FROM u_data;
{code}
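As background for the warnings quoted above, the following Python sketch models how a configuration loader might treat parameters marked final: once a name is final, later resources cannot change it and the attempt is reported. This is a simplified model written for illustration, not Hadoop's Configuration code; the function name, the data layout, and the sample values are assumptions.

```python
def apply_resource(conf, final_names, resource):
    """Merge one config resource into conf, honoring 'final' markers.

    resource maps a parameter name to a (value, is_final) tuple.
    Returns the list of warning strings for ignored overrides.
    """
    warnings = []
    for name, (value, is_final) in resource.items():
        if name in final_names:
            # An earlier resource declared this parameter final: keep the old value.
            warnings.append(
                "an attempt to override final parameter: %s; Ignoring." % name
            )
            continue
        conf[name] = value
        if is_final:
            final_names.add(name)
    return warnings


if __name__ == "__main__":
    conf, finals = {}, set()
    # Hypothetical values; only the parameter names come from the log and Hive.
    site = {"mapreduce.job.end-notification.max.attempts": ("5", True)}
    job = {
        "mapreduce.job.end-notification.max.attempts": ("0", False),
        "hive.exec.parallel": ("true", False),
    }
    apply_resource(conf, finals, site)
    for line in apply_resource(conf, finals, job):
        print(line)
```

In this model the warning appears whenever a later resource, such as a serialized job configuration, repeats a parameter that an earlier resource marked final, even if the job never intended to change it.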
[jira] [Commented] (HIVE-4562) HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar
[ https://issues.apache.org/jira/browse/HIVE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674013#comment-13674013 ]

Hudson commented on HIVE-4562:

Integrated in Hive-trunk-h0.21 #2126 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan) (Revision 1488744)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488744
Files :
* /hive/trunk/ql/build.xml

HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar

Key: HIVE-4562
URL: https://issues.apache.org/jira/browse/HIVE-4562
Project: Hive
Issue Type: Bug
Components: Build Infrastructure
Affects Versions: 0.10.0, 0.11.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
Fix For: 0.12.0
Attachments: HIVE-4562-1.patch, HIVE-4562-2.patch

Some Hive jars are required not only by the client but also by the server side (every Hadoop slave). Although we could use the 'add jar' command to add all the jars to the distributed cache, the common practice is to add these jars to $HADOOP_HOME/lib/ on every slave of the Hadoop cluster, which requires restarting all the tasktrackers.

For example, when using Hive stats with MySQL as the temporary stats DB, every slave of the Hadoop cluster should contain mysql-connector-java-.jar in $HADOOP_HOME/lib/.

For column stats, $HADOOP_HOME/lib/ on all slaves should contain:
jackson-core-asl-1.8.8.jar
jackson-jaxrs-1.8.8.jar
jackson-mapper-asl-1.8.8.jar
jackson-xc-1.8.8.jar

These jars should be separated from the other common client-side jars.
[jira] [Commented] (HIVE-4510) HS2 doesn't nest exceptions properly (fun debug times)
[ https://issues.apache.org/jira/browse/HIVE-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674012#comment-13674012 ]

Hudson commented on HIVE-4510:

Integrated in Hive-trunk-h0.21 #2126 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas Nair via Ashutosh Chauhan) (Revision 1488740)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488740
Files :
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
* /hive/trunk/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
* /hive/trunk/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java

HS2 doesn't nest exceptions properly (fun debug times)

Key: HIVE-4510
URL: https://issues.apache.org/jira/browse/HIVE-4510
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Reporter: Gunther Hagleitner
Assignee: Thejas M Nair
Fix For: 0.12.0
Attachments: HIVE-4510.1.patch, HIVE-4510.2.patch

In SQLOperation.java lines 97 + 113 for instance, we catch errors and throw a new HiveSQLException, but we don't wrap the original exception.
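The difference between discarding and wrapping the original exception, as described in the issue above, can be illustrated with a short Python sketch. It is an analogy only: the actual fix is in the Java files listed in the commit, and the class and function names here are hypothetical stand-ins.

```python
class HiveSQLError(Exception):
    """Hypothetical stand-in for a service-level SQL exception."""


def run_statement(execute):
    """Run a callable and rethrow failures with the original exception attached."""
    try:
        return execute()
    except Exception as exc:
        # 'from exc' keeps the root cause, so its stack trace is not lost.
        raise HiveSQLError("Error running query: %s" % exc) from exc


if __name__ == "__main__":
    def failing_query():
        raise ValueError("bad plan")

    try:
        run_statement(failing_query)
    except HiveSQLError as err:
        print(repr(err.__cause__))
```

With the cause attached, whoever catches the outer exception can still see where the failure started instead of only the rethrow site.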
[jira] [Commented] (HIVE-4489) beeline always return the same error message twice
[ https://issues.apache.org/jira/browse/HIVE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674015#comment-13674015 ]

Hudson commented on HIVE-4489:

Integrated in Hive-trunk-h0.21 #2126 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4489 : beeline always return the same error message twice (Chaoyu Tang via Ashutosh Chauhan) (Revision 1488741)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488741
Files :
* /hive/trunk/beeline/src/java/org/apache/hive/beeline/Commands.java

beeline always return the same error message twice

Key: HIVE-4489
URL: https://issues.apache.org/jira/browse/HIVE-4489
Project: Hive
Issue Type: Bug
Components: Clients
Affects Versions: 0.10.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Minor
Labels: newbie
Fix For: 0.12.0
Attachments: HIVE-4489.patch
Original Estimate: 0h
Remaining Estimate: 0h

Beeline always returns the same error message twice. For example, if I try to create a table a2 that already exists, it prints the exact same message twice, which is not very user-friendly.

{code}
beeline !connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://localhost:1
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-cdh4.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:1 create table a2 (value int);
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
{code}