[jira] [Updated] (HIVE-4645) Stat information like numFiles and totalSize is not correct when sub-directory is exists

2013-06-03 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4645:
--

Attachment: HIVE-4645.D11037.1.patch

navis requested code review of HIVE-4645 [jira] Stat information like numFiles 
and totalSize is not correct when sub-directory is exists.

Reviewers: JIRA

HIVE-4645 Stat information like numFiles and totalSize is not correct when 
sub-directory is exists

The test infer_bucket_sort_list_bucket.q reports a totalSize of 4096, but that 
is the size of the parent directory, not the sum of the file sizes.
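
The difference between the two numbers can be sketched as follows. This is only an illustration over a hypothetical in-memory tree (FsEntry and DirStatsSketch are made-up names, not Hive or Hadoop classes), and it is not the code in the attached patch. It assumes the intended statistics are the count and summed length of regular files, including those in sub-directories.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a file system entry; not a Hive or Hadoop class. */
final class FsEntry {
    final String name;
    final boolean directory;
    final long length; // for a directory: the size of the directory entry itself, e.g. 4096
    final List<FsEntry> children = new ArrayList<>();

    FsEntry(String name, boolean directory, long length) {
        this.name = name;
        this.directory = directory;
        this.length = length;
    }

    /** Adds a child and returns this entry so that trees can be built fluently. */
    FsEntry add(FsEntry child) {
        children.add(child);
        return this;
    }
}

final class DirStatsSketch {
    /** Sums the lengths of all regular files under root, descending into sub-directories. */
    static long totalSize(FsEntry root) {
        if (!root.directory) {
            return root.length;
        }
        long sum = 0;
        for (FsEntry child : root.children) {
            sum += totalSize(child); // a directory's own length is never counted
        }
        return sum;
    }

    /** Counts regular files under root, descending into sub-directories. */
    static long numFiles(FsEntry root) {
        if (!root.directory) {
            return 1;
        }
        long count = 0;
        for (FsEntry child : root.children) {
            count += numFiles(child);
        }
        return count;
    }
}
```

For a partition directory that holds only a sub-directory with two files of 10 and 20 bytes, this yields totalSize 30 and numFiles 2, whereas reading the length of the directory entry itself would give 4096.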

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11037

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java
  ql/src/test/results/clientpositive/infer_bucket_sort_list_bucket.q.out

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/26361/

To: JIRA, navis


 Stat information like numFiles and totalSize is not correct when 
 sub-directory is exists
 

 Key: HIVE-4645
 URL: https://issues.apache.org/jira/browse/HIVE-4645
 Project: Hive
  Issue Type: Test
  Components: Statistics
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-4645.D11037.1.patch


 The test infer_bucket_sort_list_bucket.q reports a totalSize of 4096, but that 
 is the size of the parent directory, not the sum of the file sizes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: HIVE-4620: MR temp directory conflicts in case of parallel execution mode

2013-06-03 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11464/
---

(Updated June 3, 2013, 6:17 a.m.)


Review request for hive, Ashutosh Chauhan and Navis Ryu.


Changes
---

Updated patch per review comments
 - renamed taskID to taskRunnerID
 - removed the extra call that set the per-thread runner id. It's already 
handled in the thread local's overridden initialValue().


Description
---

MR temp directories conflict in Hive parallel execution mode.
The patch adds a per-thread task counter to the MR scratch directory path set 
by Hive.
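
A minimal sketch of the per-thread counter idea, assuming a ThreadLocal whose initialValue() draws from a shared atomic counter. The class name, the method names and the "/task-" path layout are hypothetical and are not taken from the patch.

```java
import java.util.concurrent.atomic.AtomicLong;

final class TaskRunnerIdSketch {
    // Shared counter; each thread takes the next value the first time it asks for its id.
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private static final ThreadLocal<Long> RUNNER_ID = new ThreadLocal<Long>() {
        @Override
        protected Long initialValue() {
            return NEXT_ID.incrementAndGet();
        }
    };

    /** Returns the id of the calling thread; stable for the lifetime of that thread. */
    static long currentRunnerId() {
        return RUNNER_ID.get();
    }

    /** Builds a scratch path that differs per thread (the path layout is an assumption). */
    static String scratchDir(String base) {
        return base + "/task-" + currentRunnerId();
    }

    /** Helper for demonstration: reads the runner id as seen from a fresh thread. */
    static long runnerIdOnNewThread() {
        final long[] result = new long[1];
        Thread worker = new Thread(() -> result[0] = currentRunnerId());
        worker.start();
        try {
            worker.join(); // join() makes the write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

Because initialValue() runs lazily on the first get() in each thread, no explicit call is needed to set the id, which matches the review comment above.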


This addresses bug HIVE-4620.
https://issues.apache.org/jira/browse/HIVE-4620


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Context.java 6466275 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java 56c2be6 

Diff: https://reviews.apache.org/r/11464/diff/


Testing
---

Manual testing, full unit test run.


Thanks,

Prasad Mujumdar



[jira] [Updated] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-06-03 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-4620:
--

Attachment: HIVE-4620-3.patch

[~navis] Thanks for the comments. 
The original review request on https://reviews.apache.org/r/11464/ is updated 
with the new patch.

 MR temp directory conflicts in case of parallel execution mode
 --

 Key: HIVE-4620
 URL: https://issues.apache.org/jira/browse/HIVE-4620
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch


 In parallel query execution mode, all tasks running in parallel end up 
 sharing the same temp/scratch directory. This can lead to file conflicts 
 and to temp files being deleted before the job completes.



[jira] [Created] (HIVE-4646) skewjoin.q is failing in hadoop2

2013-06-03 Thread Navis (JIRA)
Navis created HIVE-4646:
---

 Summary: skewjoin.q is failing in hadoop2
 Key: HIVE-4646
 URL: https://issues.apache.org/jira/browse/HIVE-4646
 Project: Hive
  Issue Type: Test
  Components: Query Processor
Reporter: Navis
Assignee: Navis


https://issues.apache.org/jira/browse/HDFS-538 changed the behavior for a 
nonexistent path: an exception is now thrown instead of null being returned. 
The skew join resolver depends on the old behavior.
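
One way a caller can tolerate both behaviors is sketched below. The Lister interface is a hypothetical stand-in for the file system listing call, the choice of FileNotFoundException as the thrown type is an assumption, and none of this is taken from the attached patch.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ListingCompatSketch {
    /** Hypothetical stand-in for a file system listing call. */
    interface Lister {
        String[] list(String path) throws IOException;
    }

    /** Treats "path does not exist" as an empty listing under both old and new behavior. */
    static String[] listOrEmpty(Lister fs, String path) {
        try {
            String[] entries = fs.list(path);
            // Old behavior: null is returned for a nonexistent path.
            return entries == null ? new String[0] : entries;
        } catch (FileNotFoundException e) {
            // New behavior (assumed exception type): the call throws instead.
            return new String[0];
        } catch (IOException e) {
            // Any other I/O failure is still surfaced to the caller.
            throw new UncheckedIOException(e);
        }
    }
}
```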



[jira] [Updated] (HIVE-4646) skewjoin.q is failing in hadoop2

2013-06-03 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4646:
--

Attachment: HIVE-4646.D11043.1.patch

navis requested code review of HIVE-4646 [jira] skewjoin.q is failing in 
hadoop2.

Reviewers: JIRA

HIVE-4646 skewjoin.q is failing in hadoop2

https://issues.apache.org/jira/browse/HDFS-538 changed the behavior for a 
nonexistent path: an exception is now thrown instead of null being returned. 
The skew join resolver depends on the old behavior.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11043

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/26367/

To: JIRA, navis


 skewjoin.q is failing in hadoop2
 

 Key: HIVE-4646
 URL: https://issues.apache.org/jira/browse/HIVE-4646
 Project: Hive
  Issue Type: Test
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-4646.D11043.1.patch


 https://issues.apache.org/jira/browse/HDFS-538 changed the behavior for a 
 nonexistent path: an exception is now thrown instead of null being returned. 
 The skew join resolver depends on the old behavior.



[jira] [Commented] (HIVE-3949) Some test failures in hadoop 23

2013-06-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672883#comment-13672883
 ] 

Navis commented on HIVE-3949:
-

I'm looking into tests in TestCliDriver. Currently, 
{noformat}
archive_excludeHadoop20.q
archive_multi.q
auto_join14.q                   : update result (changed default)
combine2.q
ctas_colname.q                  : non-deterministic
groupby_grouping_sets4.q        : non-deterministic
infer_bucket_sort_list_bucket.q : HIVE-4645
input12.q                       : update result (added input hook)
input39.q                       : update result (added input hook)
join32_lessSize.q               : non-deterministic
join_1to1.q
join_vc.q                       : HIVE-4626
list_bucket_query_oneskew_1.q   : non-deterministic
list_bucket_query_oneskew_2.q   : non-deterministic
list_bucket_query_oneskew_3.q   : non-deterministic
multi_insert_lateral_view.q     : non-deterministic
orc_diff_part_cols.q            : non-deterministic
ptf_npath.q
recursive_dir.q                 : update result (added input hook)
sample_islocalmode_hook.q       : update result (added input hook)
skewjoin.q                      : HIVE-4646
skewjoin_union_remove_1.q       : update result (HIVE-948 seems not to be applied)
skewjoin_union_remove_2.q       : update result (HIVE-948 seems not to be applied)
stats_partscan_1.q
truncate_column.q               : non-deterministic
truncate_column_merge.q         : non-deterministic
udaf_percentile_approx.q
{noformat}

 Some test failures in hadoop 23
 ---

 Key: HIVE-3949
 URL: https://issues.apache.org/jira/browse/HIVE-3949
 Project: Hive
  Issue Type: Bug
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

 This is a follow-up to HIVE-3873.
 We have fixed some test failures in HIVE-3873 and a few other JIRA issues.
 We will use this JIRA to track the remaining failures: 
 https://builds.apache.org/job/Hive-trunk-hadoop2/



[jira] [Updated] (HIVE-2615) CTAS with literal NULL creates VOID type

2013-06-03 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-2615:
---

Assignee: Zhuoluo (Clark) Yang

 CTAS with literal NULL creates VOID type
 

 Key: HIVE-2615
 URL: https://issues.apache.org/jira/browse/HIVE-2615
 Project: Hive
  Issue Type: Bug
Reporter: David Phillips
Assignee: Zhuoluo (Clark) Yang

 Create the table with a column that always contains NULL:
 {quote}
 hive> create table bad as select 1 x, null z from dual;
 {quote}
 Because there's no type, Hive gives it the VOID type:
 {quote}
 hive> describe bad;
 OK
 x int 
 z void
 {quote}
 This seems weird, because AFAIK, there is no normal way to create a column of 
 type VOID.  The problem is that the table can't be queried:
 {quote}
 hive> select * from bad;
 OK
 Failed with exception java.io.IOException:java.lang.RuntimeException: 
 Internal error: no LazyObject for VOID
 {quote}
 Worse, even if you don't select that field, the query fails at runtime:
 {quote}
 hive> select x from bad;
 ...
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 {quote}



Re: Review Request: HIVE-4546: Hive CLI leaves behind the per session resource directory on non-interactive invocation

2013-06-03 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11083/
---

(Updated June 3, 2013, 8:29 a.m.)


Review request for hive, Owen O'Malley and Gunther Hagleitner.


Changes
---

Thanks for the review comments.
Updated patch with better error handling for CliDriver.run()

Regarding the session id: the current format is not very readable; it is 
(userid + vmName + timestamp) plus the proposed counter. We can still run into 
edge cases with multiple Hive CLIs or multiple Hive servers (e.g. for HA) on 
different nodes. Using a UUID as the handle takes care of such cases.


Description
---

Hive CLI leaves behind the per-session resource directory on non-interactive 
invocation. The patch calls the session state's close() at the end of a 
non-interactive invocation.
It also changes the session id format to a UUID. This avoids a possible 
resource directory path conflict when the same user has multiple HiveServer2 
sessions at the same time.
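
The two changes described above can be sketched together: a UUID as the session id, and a close() that always runs at the end of a non-interactive invocation. SessionSketch and its methods are hypothetical names for illustration and do not mirror SessionState or CliDriver; the sketch also assumes the resource directory is empty when it is removed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

final class SessionSketch implements AutoCloseable {
    // A random UUID avoids collisions between sessions of the same user started at the same time.
    final String sessionId = UUID.randomUUID().toString();
    final Path resourceDir;

    SessionSketch(Path tmpRoot) {
        resourceDir = tmpRoot.resolve(sessionId + "_resources");
        try {
            Files.createDirectories(resourceDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Removes the per-session resource directory (assumed empty in this sketch). */
    @Override
    public void close() {
        try {
            Files.deleteIfExists(resourceDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs one non-interactive command; the session is closed even if the command throws. */
    static Path runNonInteractive(Path tmpRoot, Runnable command) {
        try (SessionSketch session = new SessionSketch(tmpRoot)) {
            command.run();
            return session.resourceDir;
        }
    }
}
```

Using try-with-resources here is a design choice of the sketch: it guarantees cleanup on both the normal and the error path, which is the property the non-interactive (-f / -e) code path was missing.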


This addresses bug HIVE-4546.
https://issues.apache.org/jira/browse/HIVE-4546


Diffs (updated)
-

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 4239392 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 8e6e24a 

Diff: https://reviews.apache.org/r/11083/diff/


Testing
---


Thanks,

Prasad Mujumdar



[jira] [Updated] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation

2013-06-03 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-4546:
--

Status: Patch Available  (was: Open)

Thanks Ashutosh!
Responded to the review comments and updated the patch.

 Hive CLI leaves behind the per session resource directory on non-interactive 
 invocation
 ---

 Key: HIVE-4546
 URL: https://issues.apache.org/jira/browse/HIVE-4546
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch


 As part of HIVE-4505, the resource directory is set to 
 /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The 
 CLI fails to remove it when invoked using -f or -e (non-interactive mode).



[jira] [Updated] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation

2013-06-03 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-4546:
--

Attachment: HIVE-4546-2.patch

 Hive CLI leaves behind the per session resource directory on non-interactive 
 invocation
 ---

 Key: HIVE-4546
 URL: https://issues.apache.org/jira/browse/HIVE-4546
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch


 As part of HIVE-4505, the resource directory is set to 
 /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The 
 CLI fails to remove it when invoked using -f or -e (non-interactive mode).



[jira] [Created] (HIVE-4647) RetryingHMSHandler logs too many error messages

2013-06-03 Thread Navis (JIRA)
Navis created HIVE-4647:
---

 Summary: RetryingHMSHandler logs too many error messages
 Key: HIVE-4647
 URL: https://issues.apache.org/jira/browse/HIVE-4647
 Project: Hive
  Issue Type: Improvement
Reporter: Navis
Assignee: Navis
Priority: Trivial


A NoSuchObjectException thrown by methods like getTable/getPartition need not 
be logged, because it can be a normal outcome.
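
A rough sketch of the kind of check this suggests: decide whether a failure deserves an error log entry by looking for the expected exception type in the cause chain. The nested NoSuchObjectException is a local stand-in for the metastore exception, and shouldLogAsError is a hypothetical helper, not a method of RetryingHMSHandler.

```java
final class RetryLoggingSketch {
    /** Local stand-in for the metastore's NoSuchObjectException (illustration only). */
    static class NoSuchObjectException extends Exception {
        NoSuchObjectException(String message) {
            super(message);
        }
    }

    /**
     * Returns false when the failure is, or wraps, a NoSuchObjectException,
     * since a missing table or partition can be a normal result of a lookup.
     */
    static boolean shouldLogAsError(Throwable failure) {
        for (Throwable cause = failure; cause != null; cause = cause.getCause()) {
            if (cause instanceof NoSuchObjectException) {
                return false;
            }
        }
        return true;
    }
}
```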



0.9.1 branch.

2013-06-03 Thread ur lops
Could someone point me to hive 0.9.1 branch? Thanks in advance.
Regards


error in running the hive test cases

2013-06-03 Thread ur lops
Hi,
 When I run the hive test case, I keep getting the following error:
     [echo] Project: serde
    [javac] Compiling 36 source files to /home/john/dev/hive-0.9.0-Intel/src/build/serde/test/classes
    [javac] TestAvroSerdeUtils.java:24: cannot find symbol
    [javac] symbol  : class MiniDFSCluster
    [javac] location: package org.apache.hadoop.hdfs
    [javac] import org.apache.hadoop.hdfs.MiniDFSCluster;
    [javac]  ^
    [javac] TestAvroSerdeUtils.java:184: cannot find symbol
    [javac] symbol  : class MiniDFSCluster
    [javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
    [javac] MiniDFSCluster miniDfs = null;
    [javac] ^
    [javac] TestAvroSerdeUtils.java:187: cannot find symbol
    [javac] symbol  : class MiniDFSCluster
    [javac] location: class org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
    [javac]   miniDfs = new MiniDFSCluster(new Configuration(), 1, true, null);
    [javac] ^
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

I am building Hive 0.9 and running the tests using
ant package test.

Could you help?
Thanks


Re: 0.9.1 branch.

2013-06-03 Thread Owen O'Malley
The 0.9.x branch is branch-0.9. There isn't a 0.9.1 release, and one is very 
unlikely.

http://svn.apache.org/repos/asf/hive/


-- Owen

On Jun 3, 2013, at 11:51, ur lops urlop...@gmail.com wrote:

> Could someone point me to hive 0.9.1 branch? Thanks in advance.
> Regards


Re: 0.9.1 branch.

2013-06-03 Thread ur lops
Thanks Owen for the quick response. I am looking for HIVE-895, which is
reported as merged in 0.9.1.
( https://issues.apache.org/jira/browse/HIVE-895 )
How can I get that particular commit?

Regards

On Mon, Jun 3, 2013 at 2:59 AM, Owen O'Malley owen.omal...@gmail.com wrote:
> The 0.9.x branch is branch-0.9. There isn't a 0.9.1 release and it is very 
> unlikely.
>
> http://svn.apache.org/repos/asf/hive/
>
>
> -- Owen
>
> On Jun 3, 2013, at 11:51, ur lops urlop...@gmail.com wrote:
>
>> Could someone point me to hive 0.9.1 branch? Thanks in advance.
>> Regards


Re: 0.9.1 branch.

2013-06-03 Thread Owen O'Malley
https://github.com/apache/hive/commit/e42ec89b31ae056e51d8db25d4ecc1a8a51212e0


On Mon, Jun 3, 2013 at 12:10 PM, ur lops urlop...@gmail.com wrote:

> Thanks Owen for the quick response. I am looking for hive-895, which
> claims that it is merged in 0.9.1.
> ( https://issues.apache.org/jira/browse/HIVE-895 )
>  How to get that particular commit?
>
> Regards
>
> On Mon, Jun 3, 2013 at 2:59 AM, Owen O'Malley owen.omal...@gmail.com
> wrote:
>> The 0.9.x branch is branch-0.9. There isn't a 0.9.1 release and it is
>> very unlikely.
>>
>> http://svn.apache.org/repos/asf/hive/
>>
>>
>> -- Owen
>>
>> On Jun 3, 2013, at 11:51, ur lops urlop...@gmail.com wrote:
>>
>>> Could someone point me to hive 0.9.1 branch? Thanks in advance.
>>> Regards



[jira] [Updated] (HIVE-4612) Vectorized aggregates do not emit proper rows in presence of GROUP BY

2013-06-03 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4612:
---

Attachment: HIVE-4612.1.patch.txt

Add support for all types

 Vectorized aggregates do not emit proper rows in presence of GROUP BY
 -

 Key: HIVE-4612
 URL: https://issues.apache.org/jira/browse/HIVE-4612
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
 Fix For: vectorization-branch

 Attachments: HIVE-4612.0.patch.txt, HIVE-4612.1.patch.txt


 I discovered this while testing the fix for HIVE-4451 and HIVE-4452. The VGBy 
 is emitting the appropriate number of rows, but the row-mode ReduceSinkOperator 
 only logs one row and the final result is incomplete. Investigating. Related 
 to HIVE-4599.



Re: Review Request: HIVE-4612 Fix vector aggregates int type key output

2013-06-03 Thread Remus Rusanu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11427/
---

(Updated June 3, 2013, 2:08 p.m.)


Review request for hive.


Changes
---

Added support for all currently supported types (tinyint, smallint, int, bigint, 
boolean, timestamp, string, float, double)


Description
---

The VectorHashKeyValue output for the int key type was broken. M/R expects the 
type emitted to match the type reduced; because a BinaryWriter was used with a 
LongWritable instead of an IntWritable, the value was effectively corrupted.
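
Why a width mismatch corrupts the value can be illustrated with plain java.io streams. This is only an analogy under the assumption that the writer emits 8 bytes while the reader consumes 4; it does not use Hadoop's Writable classes or the serializer touched by the patch, and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WidthMismatchSketch {
    /** Writes the value as an 8-byte big-endian long, then reads back only a 4-byte int. */
    static int writeLongReadInt(long value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeLong(value);
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return in.readInt(); // consumes only the high-order four bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes and reads with matching 4-byte widths. */
    static int writeIntReadInt(int value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new DataOutputStream(bytes).writeInt(value);
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With these helpers, writeLongReadInt(42) returns 0, because the four bytes read are the zero high-order bytes of the long, while writeIntReadInt(42) returns 42.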


This addresses bug HIVE-4612.
https://issues.apache.org/jira/browse/HIVE-4612


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/VectorHashKeyWrapperBatch.java cd57151
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/TimestampUtils.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 91366dd
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorReduceSinkOperator.java f61fcb6
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 6bb5618
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java ffd7ef2
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java aeff313
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFAvgDouble.java 54102a4
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFAvgLong.java 8c6844b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdPopDouble.java a4084b0
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdPopLong.java 28fdb36
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdSampDouble.java 4fa52ff
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdSampLong.java 551ae8a
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFSumDouble.java a2e8fb3
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFSumLong.java 71b2e3d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarPopDouble.java 2dfbfa3
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarPopLong.java de4811d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarSampDouble.java 5a21f44
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarSampLong.java 7b88c4f
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFAvg.txt d85346d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFVar.txt daae57b
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/FakeVectorRowBatchFromObjectIterables.java 6824ee7
  ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorGroupByOperator.java 6fc230f

Diff: https://reviews.apache.org/r/11427/diff/


Testing
---

manual test query


Thanks,

Remus Rusanu



[jira] [Updated] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4403:
---

Affects Version/s: 0.11.0

 Running Hive queries on Yarn (MR2) gives warnings related to overriding final 
 parameters
 

 Key: HIVE-4403
 URL: https://issues.apache.org/jira/browse/HIVE-4403
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0
Reporter: Mark Grover
Assignee: Chu Tong
 Fix For: 0.12.0

 Attachments: HIVE-4403.patch, HIVE-4403.patch


 While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings 
 related to overriding final parameters in job.conf. This was on a 
 pseudo-distributed cluster. FWIW, I didn't see this happen on a 
 fully-distributed cluster. Perhaps Hive's job.conf is overriding some final 
 parameters it shouldn't.
 Here is what the warnings looked like:
 {code}
 2013-04-19 14:20:32,304 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
 2013-04-19 14:20:32,367 WARN  [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
 {code}
 To reproduce, run a query like:
 {code}
 CREATE TABLE u_data (
   userid INT,
   movieid INT,
   rating INT,
   unixtime STRING)
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t'
 STORED AS TEXTFILE;
 {code}
 Load some data into u_data; here is some sample data:
 https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data
 Run a simple query on that data (on YARN/MR2):
 {code}
 INSERT OVERWRITE DIRECTORY '/tmp/count'
 SELECT COUNT(1) FROM u_data
 {code}
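
The mechanism behind such warnings can be modeled in a few lines. This is a simplified, hypothetical model of "final" parameters (FinalParamsSketch is a made-up class, and the values 5 and 10 are invented for the example); it is not Hadoop's Configuration implementation. It assumes that once a parameter has been loaded as final, any later attempt to set it is ignored and reported.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FinalParamsSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Set<String> finals = new HashSet<>();
    final List<String> warnings = new ArrayList<>();

    /** Loads one property from a named source; overrides of final parameters are ignored. */
    void load(String source, String name, String value, boolean isFinal) {
        if (finals.contains(name)) {
            warnings.add(source + ": an attempt to override final parameter: " + name + "; Ignoring.");
            return;
        }
        values.put(name, value);
        if (isFinal) {
            finals.add(name);
        }
    }

    String get(String name) {
        return values.get(name);
    }
}
```

In this model, a job configuration file that re-lists a parameter already marked final by an earlier resource produces one warning per such parameter and leaves the earlier value in place.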



[jira] [Resolved] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4403.


   Resolution: Fixed
Fix Version/s: 0.12.0

Committed to trunk. Thanks, Chu!

 Running Hive queries on Yarn (MR2) gives warnings related to overriding final 
 parameters
 

 Key: HIVE-4403
 URL: https://issues.apache.org/jira/browse/HIVE-4403
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Mark Grover
Assignee: Chu Tong
 Fix For: 0.12.0

 Attachments: HIVE-4403.patch, HIVE-4403.patch





[jira] [Commented] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters

2013-06-03 Thread Chu Tong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673166#comment-13673166
 ] 

Chu Tong commented on HIVE-4403:


no problem, thank you for reviewing it [~ashutoshgupt...@gmail.com]

 Running Hive queries on Yarn (MR2) gives warnings related to overriding final 
 parameters
 

 Key: HIVE-4403
 URL: https://issues.apache.org/jira/browse/HIVE-4403
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0
Reporter: Mark Grover
Assignee: Chu Tong
 Fix For: 0.12.0

 Attachments: HIVE-4403.patch, HIVE-4403.patch





[jira] [Updated] (HIVE-3846) alter view rename NPEs with authorization on.

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3846:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Teddy!

 alter view rename NPEs with authorization on.
 -

 Key: HIVE-3846
 URL: https://issues.apache.org/jira/browse/HIVE-3846
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.10.0, 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Teddy Choi
 Fix For: 0.12.0

 Attachments: HIVE-3846.1.patch.txt, HIVE-3846.2.patch.txt






[jira] [Updated] (HIVE-4615) Invalid column names allowed when created dynamically by a SerDe

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4615:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Gabriel!

 Invalid column names allowed when created dynamically by a SerDe
 

 Key: HIVE-4615
 URL: https://issues.apache.org/jira/browse/HIVE-4615
 Project: Hive
  Issue Type: Bug
Reporter: Gabriel Reid
Assignee: Gabriel Reid
 Fix For: 0.12.0

 Attachments: HIVE-4615.1.patch.txt


 When a SerDe creates columns dynamically during table creation, there is no 
 checking done on the validity of the created column names. This means that 
 it's possible to create a table that contains columns that can't be queried, 
 and will lead to issues when trying to query the created table.
 The same column name validation should be performed for dynamically-created 
 columns as for other column names.
 This behavior can be easily tested using the TestSerDe, and including a 
 column name that includes an invalid identifier character (e.g. a period) in 
 the list of columns to create.
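
 The kind of check described above could look roughly like the following. The rule used here (letters, digits and underscores only) is an assumption made for the example and is not Hive's actual identifier grammar; ColumnNameSketch is a hypothetical helper, not code from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class ColumnNameSketch {
    // Assumed rule for illustration only: letters, digits and underscores.
    private static final Pattern VALID = Pattern.compile("[A-Za-z0-9_]+");

    /** Returns the names that fail validation, in their original order. */
    static List<String> invalidNames(List<String> columnNames) {
        List<String> invalid = new ArrayList<>();
        for (String name : columnNames) {
            if (name == null || !VALID.matcher(name).matches()) {
                invalid.add(name);
            }
        }
        return invalid;
    }
}
```

 Applying the same check to SerDe-generated columns as to user-declared ones would reject a name such as "user.name" at table creation time rather than at query time.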



[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-06-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673184#comment-13673184
 ] 

Navis commented on HIVE-4620:
-

+1, running test.

 MR temp directory conflicts in case of parallel execution mode
 --

 Key: HIVE-4620
 URL: https://issues.apache.org/jira/browse/HIVE-4620
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch


 In parallel query execution mode, all the tasks running in parallel end up 
 sharing the same temp/scratch directory. This could lead to file conflicts 
 and temp files getting deleted before the job completes.



Hive-trunk-h0.21 - Build # 2125 - Still Failing

2013-06-03 Thread Apache Jenkins Server
Changes for Build #2095

Changes for Build #2096

Changes for Build #2097
[cws] HIVE-4530. Enforce minmum ant version required in build script (Arup 
Malakar via cws)

[omalley] Preparing RELEASE_NOTES for Hive 0.11.0rc2.


Changes for Build #2098
[omalley] Update release notes for 0.11.0rc2

[omalley] HIVE-4527 Fix eclipse project template (Carl Steinbach via omalley)

[omalley] HIVE-4505 Hive can't load transforms with remote scripts. (Prasad 
Majumdar and Gunther Hagleitner
via omalley)

[omalley] HIVE-4498 TestBeeLineWithArgs.testPositiveScriptFile fails (Thejas 
Nair via omalley)


Changes for Build #2099

Changes for Build #2100

Changes for Build #2101

Changes for Build #2102

Changes for Build #2103
[daijy] PIG-2955: Fix bunch of Pig e2e tests on Windows


Changes for Build #2104
[daijy] PIG-3069: Native Windows Compatibility for Pig E2E Tests and Harness


Changes for Build #2105
[omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions 
(Gunther 
Hagleitner via omalley)

[omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther 
Hagleitner via
omalley)


Changes for Build #2106

Changes for Build #2107
[omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there 
are many 
partitions (Gopal V via omalley)


Changes for Build #2108

Changes for Build #2109

Changes for Build #2110

Changes for Build #2111
[omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Guther 
Hagleitner
via omalley)

[omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther 
Hagleitner via
omalley)


Changes for Build #2112

Changes for Build #2113
[gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates)


Changes for Build #2114
[gates] HIVE-4581 HCat e2e tests broken by changes to Hive's describe table 
formatting (gates)


Changes for Build #2115

Changes for Build #2116
[navis] JDBC2: HiveDriver should not throw RuntimeException when passed an 
invalid URL (Richard Ding via Navis)


Changes for Build #2117

Changes for Build #2118

Changes for Build #2119

Changes for Build #2120

Changes for Build #2121
[navis] HIVE-4572 ColumnPruner cannot preserve RS key columns corresponding to 
un-selected join keys in columnExprMap (Yin Huai via Navis)

[navis] HIVE-4540 JOIN-GRP BY-DISTINCT fails with NPE when 
mapjoin.mapreduce=true (Gunther Hagleitner via Navis)


Changes for Build #2122

Changes for Build #2123

Changes for Build #2124
[gates] HIVE-4543 Broken link in HCat doc (Reader and Writer Interfaces) (Lefty 
Leverenz via gates)


Changes for Build #2125
[daijy] PIG-3337: Fix remaining Window e2e tests




All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2125)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2125/ to 
view the results.

[jira] [Created] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2

2013-06-03 Thread Hari Sekhon (JIRA)
Hari Sekhon created HIVE-4648:
-

 Summary: Add ability to set hadoop conf overrides in JDBC for 
HiveServer2
 Key: HIVE-4648
 URL: https://issues.apache.org/jira/browse/HIVE-4648
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.10.0
Reporter: Hari Sekhon


It's possible in BeeLine to specify set command overrides of hadoop config 
variables, but I haven't seen any example code of how to do this in JDBC with 
HiveServer2.

We need an ability to specify hadoop conf overrides on a per-session basis or 
even halfway through the session. See this Hive ticket for some background:

https://issues.apache.org/jira/browse/HIVE-4644



[jira] [Updated] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2

2013-06-03 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated HIVE-4648:
--

Component/s: HiveServer2

 Add ability to set hadoop conf overrides in JDBC for HiveServer2
 

 Key: HIVE-4648
 URL: https://issues.apache.org/jira/browse/HIVE-4648
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, JDBC
Affects Versions: 0.10.0
Reporter: Hari Sekhon

 It's possible in BeeLine to specify set command overrides of hadoop config 
 variables, but I haven't seen any example code of how to do this in JDBC with 
 HiveServer2.
 We need an ability to specify hadoop conf overrides on a per-session basis or 
 even halfway through the session. See this Hive ticket for some background:
 https://issues.apache.org/jira/browse/HIVE-4644



[jira] [Commented] (HIVE-4585) Remove unused MR Temp file localization from Tasks

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673324#comment-13673324
 ] 

Ashutosh Chauhan commented on HIVE-4585:


I think it makes sense to remove this piece of code. Executing queries locally 
(instead of on the cluster) isn't the common use case for Hive. So, unless anyone 
is really interested in optimizing that code path, it's better to get rid of it 
to lessen our technical debt.
+1

 Remove unused MR Temp file localization from Tasks
 --

 Key: HIVE-4585
 URL: https://issues.apache.org/jira/browse/HIVE-4585
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-4585.1.patch


 HIVE-1408 introduced code that is currently commented out (i.e., dead code), 
 with a comment saying it needs further development (HIVE-1484). It's been like 
 this for close to 3 years. 
 I suggest removing the code until such time that someone picks up that work. 
 At that time they can decide if they want to use this code or pursue another 
 route (FS shim?).



Fwd: error in running the hive test cases

2013-06-03 Thread ur lops
Hi,
 When I run the hive test case, I keep getting the following error:
 [echo] Project: serde
[javac] Compiling 36 source files to
/home/john/dev/hive-0.9.0-Intel/src/build/serde/test/classes
[javac] TestAvroSerdeUtils.java:24: cannot find symbol
[javac] symbol  : class MiniDFSCluster
[javac] location: package org.apache.hadoop.hdfs
[javac] import org.apache.hadoop.hdfs.MiniDFSCluster;
[javac]  ^
[javac] TestAvroSerdeUtils.java:184: cannot find symbol
[javac] symbol  : class MiniDFSCluster
[javac] location: class
org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
[javac] MiniDFSCluster miniDfs = null;
[javac] ^
[javac] TestAvroSerdeUtils.java:187: cannot find symbol
[javac] symbol  : class MiniDFSCluster
[javac] location: class
org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils
[javac]   miniDfs = new MiniDFSCluster(new Configuration(), 1,
true, null);
[javac] ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

I am building Hive 0.9 and running the tests using
ant package test.
Can someone give me a pointer as to which jar is missing from the classpath and
how to resolve it?

Thanks


[jira] [Commented] (HIVE-4418) TestNegativeCliDriver failure message if cmd succeeds is misleading

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673348#comment-13673348
 ] 

Ashutosh Chauhan commented on HIVE-4418:


+1

 TestNegativeCliDriver failure message if cmd succeeds is misleading
 ---

 Key: HIVE-4418
 URL: https://issues.apache.org/jira/browse/HIVE-4418
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.10.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4418.1.patch


 If the .q test ends up succeeding (exit code == 0), then the test failure 
 message is misleading.
 From the error it seems as if the command actually failed -
 {code}
 [junit] junit.framework.AssertionFailedError: Client Execution failed 
 with error code = 0
 [junit] See build/ql/tmp/hive.log, or try ant test ... 
 -Dtest.silent=false to get more logs.
 [junit] at junit.framework.Assert.fail(Assert.java:47)
 [junit] at 
 org.apache.hadoop.hive.cli.TestNegativeCliDriver.runTest(TestNegativeCliDriver.java:121)
 [junit] at 
 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_desc_tab(TestNegativeCliDriver.java:102)
 {code}



Re: Review Request: HIVE-4547: A complex create view statement fails with new Antlr 3.4

2013-06-03 Thread Shreepadma Venugopalan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11084/#review21332
---

Ship it!


LGTM. +1 (non-binding).

- Shreepadma Venugopalan


On May 13, 2013, 9:06 a.m., Prasad Mujumdar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11084/
 ---
 
 (Updated May 13, 2013, 9:06 a.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Description
 ---
 
 The parser has a translation map where it's possible to replace all the text 
 with the appropriate escaped version in case of a view creation. This holds 
 all individual translations and where they apply in the view definition.
 The newer antlr version seems to be more restrictive and throws an assertion if 
 there is an overlap in these escape positions. The original patch for the antlr 
 upgrade added a check to take care of some of the simpler overlap cases found 
 by unit tests. There are a few more scenarios, like the one in the customer case, 
 which are not covered.
 The patch traverses the list of translations in a loop and looks for 
 all the possible overlaps.
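The overlap search can be illustrated with a small sketch. This is not the code from the patch: the `Range` type, its fields and the sort-then-scan approach are assumptions chosen to show one way of finding overlapping replacement positions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OverlapCheck {
    /** Hypothetical replacement span covering positions start..stop, inclusive. */
    public static final class Range {
        public final int start;
        public final int stop;

        public Range(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    /** Returns true if any two ranges share at least one position. */
    public static boolean hasOverlap(List<Range> ranges) {
        // Sort a copy by start position so the caller's list is left untouched.
        List<Range> sorted = new ArrayList<Range>(ranges);
        sorted.sort(Comparator.comparingInt((Range r) -> r.start));

        // Track the furthest stop seen so far; a later range that starts at or
        // before it overlaps one of the earlier ranges.
        long maxStop = Long.MIN_VALUE;
        for (Range r : sorted) {
            if (r.start <= maxStop) {
                return true;
            }
            maxStop = Math.max(maxStop, r.stop);
        }
        return false;
    }
}
```

Tracking the maximum stop, rather than only the previous range's stop, also catches a short range nested inside a much longer earlier one.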
 
 
 This addresses bug HIVE-4547.
 https://issues.apache.org/jira/browse/HIVE-4547
 
 
 Diffs
 -
 
   data/files/v1.txt PRE-CREATION 
   data/files/v2.txt PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/UnparseTranslator.java ec2c088 
   ql/src/test/queries/clientpositive/view_cast.q PRE-CREATION 
   ql/src/test/results/clientpositive/view_cast.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/11084/diff/
 
 
 Testing
 ---
 
 Ran full test suite. 
 Added new test.
 
 
 Thanks,
 
 Prasad Mujumdar
 




[jira] [Commented] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2

2013-06-03 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673367#comment-13673367
 ] 

Shreepadma Venugopalan commented on HIVE-4648:
--

[~harisekhon]: It is possible to set and unset config variables through JDBC 
that can be set/unset through the command line. To do so, you'd need to do an 
execute statement with set config.var = value. To set the scratch dir, you 
can do the following in JDBC,

{noformat}
statement.execute("set hive.exec.scratchdir = /tmp/mydir");
{noformat}

Note that this property is set for the particular JDBC connection. 
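For readers who want a slightly fuller picture, here is a hedged sketch of the same idea using only the standard java.sql API. The class and method names are made up for illustration, the connection is assumed to be already open against HiveServer2, and the key and value are passed through without any validation or escaping.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class SessionOverrides {
    /** Builds the text of a SET command, e.g. "set hive.exec.scratchdir=/tmp/mydir". */
    public static String setCommand(String key, String value) {
        return "set " + key + "=" + value;
    }

    /**
     * Runs the SET command on an already-open connection. The override applies
     * to that connection's session only, as noted in the comment above.
     */
    public static void applyOverride(Connection conn, String key, String value)
            throws SQLException {
        // try-with-resources closes the statement even if execute() throws.
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(setCommand(key, value));
        }
    }
}
```

Because the override is an ordinary statement, it can be issued at any point in the session, including between two queries.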

 Add ability to set hadoop conf overrides in JDBC for HiveServer2
 

 Key: HIVE-4648
 URL: https://issues.apache.org/jira/browse/HIVE-4648
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, JDBC
Affects Versions: 0.10.0
Reporter: Hari Sekhon

 It's possible in BeeLine to specify set command overrides of hadoop config 
 variables, but I haven't seen any example code of how to do this in JDBC with 
 HiveServer2.
 We need an ability to specify hadoop conf overrides on a per-session basis or 
 even halfway through the session. See this Hive ticket for some background:
 https://issues.apache.org/jira/browse/HIVE-4644



[jira] [Commented] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2

2013-06-03 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673369#comment-13673369
 ] 

Shreepadma Venugopalan commented on HIVE-4648:
--

Please note that setting hive.exec.scratchdir is just an example of doing sets 
through JDBC.

 Add ability to set hadoop conf overrides in JDBC for HiveServer2
 

 Key: HIVE-4648
 URL: https://issues.apache.org/jira/browse/HIVE-4648
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, JDBC
Affects Versions: 0.10.0
Reporter: Hari Sekhon

 It's possible in BeeLine to specify set command overrides of hadoop config 
 variables, but I haven't seen any example code of how to do this in JDBC with 
 HiveServer2.
 We need an ability to specify hadoop conf overrides on a per-session basis or 
 even halfway through the session. See this Hive ticket for some background:
 https://issues.apache.org/jira/browse/HIVE-4644



[jira] [Updated] (HIVE-2206) add a new optimizer for query correlation discovery and optimization

2013-06-03 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-2206:
---

Affects Version/s: (was: 0.10.0)
   0.12.0
   Status: In Progress  (was: Patch Available)

 add a new optimizer for query correlation discovery and optimization
 

 Key: HIVE-2206
 URL: https://issues.apache.org/jira/browse/HIVE-2206
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: He Yongqiang
Assignee: Yin Huai
 Attachments: HIVE-2206.10-r1384442.patch.txt, 
 HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, 
 HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, 
 HIVE-2206.15-r1392491.patch.txt, HIVE-2206.16-r1399936.patch.txt, 
 HIVE-2206.17-r1404933.patch.txt, HIVE-2206.18-r1407720.patch.txt, 
 HIVE-2206.19-r1410581.patch.txt, HIVE-2206.1.patch.txt, 
 HIVE-2206.20-r1434012.patch.txt, HIVE-2206.2.patch.txt, 
 HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, 
 HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, 
 HIVE-2206.8.r1224646.patch.txt, HIVE-2206.8-r1237253.patch.txt, 
 testQueries.2.q, YSmartPatchForHive.patch


 This issue proposes a new logical optimizer called Correlation Optimizer, 
 which is used to merge correlated MapReduce jobs (MR jobs) into a single MR 
 job. The idea is based on YSmart (http://ysmart.cse.ohio-state.edu/). The 
 paper and slides of YSmart are linked at the bottom.
 Since Hive translates queries in a sentence by sentence fashion, for every 
 operation which may need to shuffle the data (e.g. join and aggregation 
 operations), Hive will generate a MapReduce job for that operation. However, 
 operations that need to shuffle the data may involve the correlations 
 explained below, in which case they can be executed in a single MR job.
 # Input Correlation: Multiple MR jobs have input correlation (IC) if their 
 input relation sets are not disjoint;
 # Transit Correlation: Multiple MR jobs have transit correlation (TC) if they 
 have not only input correlation, but also the same partition key;
 # Job Flow Correlation: An MR has job flow correlation (JFC) with one of its 
 child nodes if it has the same partition key as that child node.
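The three definitions above can be written down as simple predicates. The sketch below is an illustration only: the `Job` type, with a set of input table names and a single partition key string, is a hypothetical simplification and not the data structure used by the optimizer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class CorrelationSketch {
    /** Hypothetical, simplified description of an MR job. */
    public static final class Job {
        public final Set<String> inputTables;
        public final String partitionKey;

        public Job(String partitionKey, String... inputTables) {
            this.partitionKey = partitionKey;
            this.inputTables = new HashSet<String>(Arrays.asList(inputTables));
        }
    }

    /** Input correlation: the input relation sets are not disjoint. */
    public static boolean inputCorrelated(Job a, Job b) {
        return !Collections.disjoint(a.inputTables, b.inputTables);
    }

    /** Transit correlation: input correlation plus the same partition key. */
    public static boolean transitCorrelated(Job a, Job b) {
        return inputCorrelated(a, b) && a.partitionKey.equals(b.partitionKey);
    }

    /** Job flow correlation: a job and one of its child nodes share a partition key. */
    public static boolean jobFlowCorrelated(Job job, Job child) {
        return job.partitionKey.equals(child.partitionKey);
    }
}
```

For example, two jobs that both read table `t2` are input correlated; they are transit correlated only if they also partition on the same key.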
 The current implementation of the correlation optimizer only detects correlations 
 among MR jobs for reduce-side join operators and reduce-side aggregation 
 operators (not map-only aggregation). A query will be optimized if it 
 satisfies the following conditions.
 # There exists an MR job for a reduce-side join operator or reduce-side 
 aggregation operator which has JFC with all of its parent MR jobs (TCs will 
 also be exploited if JFC exists);
 # All input tables of those correlated MR jobs are original input tables (not 
 intermediate tables generated by sub-queries); and 
 # No self join is involved in those correlated MR jobs.
 The correlation optimizer is implemented as a logical optimizer. The main reasons 
 are that it only needs to manipulate the query plan tree and it can leverage 
 the existing component for generating MR jobs.
 The current implementation can serve as a framework for correlation-related 
 optimizations. I think that it is better than adding individual optimizers. 
 Several things can be done in the future to improve this optimizer. 
 Here are three examples.
 # Support queries that only involve TC;
 # Support queries in which the input tables of correlated MR jobs involve 
 intermediate tables; and 
 # Optimize queries involving self join. 
 References:
 Paper and presentation of YSmart.
 Paper: 
 http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf
 Slides: http://sdrv.ms/UpwJJc



[jira] [Updated] (HIVE-4451) Add support for string column type vector aggregates: COUNT, MIN and MAX

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4451:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch. Thanks, Remus!

 Add support for string column type vector aggregates: COUNT, MIN and MAX
 

 Key: HIVE-4451
 URL: https://issues.apache.org/jira/browse/HIVE-4451
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
 Fix For: vectorization-branch

 Attachments: HIVE-4451.0.patch.txt


 Extend the vector aggregates operations to support string types.



[jira] [Updated] (HIVE-4592) fix failure to set output isNull to true and other NULL propagation issues; update arithmetic tests

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4592:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch. Thanks, Eric!

 fix failure to set output isNull to true and other NULL propagation issues; 
 update arithmetic tests
 ---

 Key: HIVE-4592
 URL: https://issues.apache.org/jira/browse/HIVE-4592
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Eric Hanson
Assignee: Eric Hanson
 Fix For: vectorization-branch

 Attachments: HIVE-4592.1.patch, HIVE-4592.3.patch, HIVE-4592.4.patch


 ColumnArithmeticColumn.txt should set the output column's noNulls flag to 
 true if neither input column has nulls, but it does not do that. This can 
 lead to wrong results if noNulls was set to false in a previous use of 
 the batch.
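The problem can be illustrated with a stripped-down sketch. The `LongColumn` class below is a hypothetical stand-in for a column vector, not Hive's own class, and the `add` method is not the generated template code; it only shows why the flag has to be assigned on every call when the output column is reused.

```java
public class NullPropagationSketch {
    /** Hypothetical, stripped-down stand-in for a column vector of longs. */
    public static final class LongColumn {
        public final long[] vector;
        public final boolean[] isNull;
        public boolean noNulls = true;

        public LongColumn(int size) {
            vector = new long[size];
            isNull = new boolean[size];
        }
    }

    /** Computes out[i] = a[i] + b[i] for the first n rows, propagating NULLs. */
    public static void add(LongColumn a, LongColumn b, LongColumn out, int n) {
        // Assign the flag unconditionally: the output column may carry a stale
        // noNulls == false from an earlier use of the batch.
        out.noNulls = a.noNulls && b.noNulls;
        for (int i = 0; i < n; i++) {
            boolean aNull = !a.noNulls && a.isNull[i];
            boolean bNull = !b.noNulls && b.isNull[i];
            // Overwrite isNull as well, so stale entries cannot leak through.
            out.isNull[i] = aNull || bNull;
            out.vector[i] = a.vector[i] + b.vector[i];
        }
    }
}
```

If the flag were only ever set to false and never back to true, a reused output column would keep reporting possible nulls, and any stale isNull entries would then be trusted by downstream operators.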



[jira] [Commented] (HIVE-4612) Vectorized aggregates do not emit proper rows in presence of GROUP BY

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673403#comment-13673403
 ] 

Ashutosh Chauhan commented on HIVE-4612:


Not specific to this patch, but VectorHashKeyWrapperBatch.java should be in 
vector package (instead of exec). Can you file a follow-up jira to move that 
file?

 Vectorized aggregates do not emit proper rows in presence of GROUP BY
 -

 Key: HIVE-4612
 URL: https://issues.apache.org/jira/browse/HIVE-4612
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
 Fix For: vectorization-branch

 Attachments: HIVE-4612.0.patch.txt, HIVE-4612.1.patch.txt


 I discovered this while testing the fix for HIVE-4451 and HIVE-4452. The VGBy 
 is emitting the appropriate number of rows, but the row-mode ReduceSinkOperator 
 only logs one row and the final result is incomplete. Investigating. Related 
 to HIVE-4599.



[jira] [Updated] (HIVE-4612) Vectorized aggregates do not emit proper rows in presence of GROUP BY

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4612:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to branch. Thanks, Remus!

 Vectorized aggregates do not emit proper rows in presence of GROUP BY
 -

 Key: HIVE-4612
 URL: https://issues.apache.org/jira/browse/HIVE-4612
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
 Fix For: vectorization-branch

 Attachments: HIVE-4612.0.patch.txt, HIVE-4612.1.patch.txt


 I discovered this while testing the fix for HIVE-4451 and HIVE-4452. The VGBy 
 is emitting the appropriate number of rows, but the row-mode ReduceSinkOperator 
 only logs one row and the final result is incomplete. Investigating. Related 
 to HIVE-4599.



[jira] [Commented] (HIVE-4608) Vectorized UDFs for Timestamp in nanoseconds

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673413#comment-13673413
 ] 

Ashutosh Chauhan commented on HIVE-4608:


[~gopalv] Patch is not applying cleanly on top of svn branch with patch (tried 
both -p0 and -p1). Can you regenerate so that it applies cleanly on svn 
vectorization branch?

 Vectorized UDFs for Timestamp in nanoseconds
 

 Key: HIVE-4608
 URL: https://issues.apache.org/jira/browse/HIVE-4608
 Project: Hive
  Issue Type: Sub-task
Affects Versions: vectorization-branch
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
  Labels: vectorization
 Attachments: 
 0001-Vectorized-UDFs-for-timestamp-functions-which-accept.patch, 
 0002-Update-patch-to-the-review-comments-in-https-reviews.patch


 Vectorized UDFs for timestamp functions which accept long vectors
 VectorUDFYearLong   
 VectorUDFMonthLong
 VectorUDFWeekOfYearLong   
 VectorUDFDayOfMonthLong
 VectorUDFHourLong   
 VectorUDFMinuteLong
 VectorUDFSecondLong   
 VectorUDFUnixTimeStampLong 
 and tests for them against their non-vectorized implementation.



[jira] [Resolved] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2

2013-06-03 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-4648.
--

Resolution: Not A Problem

 Add ability to set hadoop conf overrides in JDBC for HiveServer2
 

 Key: HIVE-4648
 URL: https://issues.apache.org/jira/browse/HIVE-4648
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, JDBC
Affects Versions: 0.10.0
Reporter: Hari Sekhon

 It's possible in BeeLine to specify set command overrides of hadoop config 
 variables, but I haven't seen any example code of how to do this in JDBC with 
 HiveServer2.
 We need an ability to specify hadoop conf overrides on a per-session basis or 
 even halfway through the session. See this Hive ticket for some background:
 https://issues.apache.org/jira/browse/HIVE-4644



[jira] [Commented] (HIVE-4648) Add ability to set hadoop conf overrides in JDBC for HiveServer2

2013-06-03 Thread Hari Sekhon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673473#comment-13673473
 ] 

Hari Sekhon commented on HIVE-4648:
---

Thanks. Is there a particular doc that I missed?

 Add ability to set hadoop conf overrides in JDBC for HiveServer2
 

 Key: HIVE-4648
 URL: https://issues.apache.org/jira/browse/HIVE-4648
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, JDBC
Affects Versions: 0.10.0
Reporter: Hari Sekhon

 It's possible in BeeLine to specify set command overrides of hadoop config 
 variables, but I haven't seen any example code of how to do this in JDBC with 
 HiveServer2.
 We need an ability to specify hadoop conf overrides on a per-session basis or 
 even halfway through the session. See this Hive ticket for some background:
 https://issues.apache.org/jira/browse/HIVE-4644



[jira] [Commented] (HIVE-4629) HS2 should support an API to retrieve query logs

2013-06-03 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673508#comment-13673508
 ] 

Shreepadma Venugopalan commented on HIVE-4629:
--

[~cwsteinbach]: Can you look at this? Thanks!

 HS2 should support an API to retrieve query logs
 

 Key: HIVE-4629
 URL: https://issues.apache.org/jira/browse/HIVE-4629
 Project: Hive
  Issue Type: Sub-task
Reporter: Shreepadma Venugopalan
Assignee: Shreepadma Venugopalan

 HiveServer2 should support an API to retrieve query logs. This is 
 particularly relevant because HiveServer2 supports async execution but 
 doesn't provide a way to report progress. Providing an API to retrieve query 
 logs will help report progress to the client.



Re: Review Request: HIVE-4513 - disable hivehistory logs by default

2013-06-03 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11029/#review21352
---



data/conf/hive-site.xml
https://reviews.apache.org/r/11029/#comment44263

Is there a reason for this to be set to true for tests? Unless there is, we 
should set config in tests to the default values, since we should test default 
configs.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44264

doesn't read right. I guess you wanted 
... statistics into a file.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44266

This is an existing comment which doesn't read right. But since we are doing 
major surgery on HiveHistory, it would be good to update it to make it more 
sensible.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44268

I think the word "job" is not required in this comment.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44269

I think "query" is a better word than "job" here.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44270

Better worded as
Called at the end of query.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44271

Again, use of the word "job" is confusing; we should use "query" here as well.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44272

Incorrect comment.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
https://reviews.apache.org/r/11029/#comment44274

Function name is IdtoTable, but the comment says table to id. One of these needs 
to be corrected.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44275

Similar comment as in HiveHistory.java



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44277

Should this be hive.ql.exec.HiveHistoryImpl to avoid confusion?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44278

"and" instead of "an"?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44280

In case of incorrect config, should this throw an exception instead of 
returning silently? Otherwise there will be errors later when something tries 
to write to the history file.
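
A minimal sketch of the fail-fast behaviour suggested in the comment above. The 
class and method names (HistoryDirCheck, requireHistoryDir) are hypothetical and 
are not taken from the patch under review; the real HiveHistoryImpl may read and 
validate its configuration differently.

```java
import java.io.File;

class HistoryDirCheck {
    // Hypothetical helper: reject a missing or blank directory setting up front
    // instead of returning silently and failing later on the first write.
    static File requireHistoryDir(String configuredDir) {
        if (configuredDir == null || configuredDir.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "History log directory is not configured");
        }
        return new File(configuredDir);
    }
}
```

Whether an unchecked exception is the right type here is a design choice for the 
patch author; the point is only that the error surfaces where the bad value is read.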



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44281

Same comment as above.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44283

This should be a static class variable; otherwise nextInt() will return the 
same value for each invocation.
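
A small sketch of the difference the comment above is pointing at, under stated 
assumptions: the class name and the fixed seed are hypothetical and exist only to 
make the effect repeatable. A generator that is re-created with the same seed on 
every call keeps producing the same first value, while one shared static instance 
advances its state between calls.

```java
import java.util.Random;

class HistorySuffixDemo {
    // Hypothetical fixed seed, used only to make the demonstration repeatable.
    static final long DEMO_SEED = 42L;

    // One generator shared by all calls; its state advances on every use.
    private static final Random SHARED = new Random(DEMO_SEED);

    // Re-creates the generator on each call: same seed, same first value.
    static int fromFreshGenerator() {
        return new Random(DEMO_SEED).nextInt();
    }

    // Uses the shared generator: successive calls return successive values.
    static int fromSharedGenerator() {
        return SHARED.nextInt();
    }
}
```

How often an unseeded new Random() actually repeats depends on the JDK's default 
seeding, so this sketch only shows the seeded case.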



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44284

Instead of "/" we should use File.separator.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44287

Consider using File.createNewFile here.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44288

Use System.getProperty("line.separator") instead of "\n".
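
The portability suggestions in this review (File.separator rather than "/", and 
the platform line separator rather than "\n") can be sketched as follows. The 
class and method names are hypothetical; only File.separator and 
System.getProperty("line.separator") are standard JDK calls.

```java
import java.io.File;

class PortableHistoryPaths {
    // Joins a directory and a file name with the platform path separator.
    static String historyFilePath(String dir, String fileName) {
        return dir + File.separator + fileName;
    }

    // Terminates a log record with the platform line separator.
    static String logLine(String text) {
        return text + System.getProperty("line.separator");
    }
}
```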



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
https://reviews.apache.org/r/11029/#comment44289

start of query ?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java
https://reviews.apache.org/r/11029/#comment44291

Missing apache header



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java
https://reviews.apache.org/r/11029/#comment44292

HiveHistoryViewer.class


- Ashutosh Chauhan


On May 13, 2013, 10:12 p.m., Thejas Nair wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11029/
 ---
 
 (Updated May 13, 2013, 10:12 p.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 a Hive query, such as the query string, plan, counters and MR job progress 
 information.
 
 There is no mechanism to delete these files, and as a result they accumulate 
 over time, using up a lot of disk space. 
 I don't think this is used by most people, so I think it would be better to 
 turn this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as the history logs.
 

[jira] [Created] (HIVE-4649) Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation

2013-06-03 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HIVE-4649:
--

 Summary: Unit test failure in 
TestColumnScalarOperationVectorExpressionEvaluation 
 Key: HIVE-4649
 URL: https://issues.apache.org/jira/browse/HIVE-4649
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


The test fails due to a bug in ColumnCompareScalar.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4513) disable hivehistory logs by default

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673559#comment-13673559
 ] 

Ashutosh Chauhan commented on HIVE-4513:


[~thejas] Left some comments on RB.

 disable hivehistory logs by default
 ---

 Key: HIVE-4513
 URL: https://issues.apache.org/jira/browse/HIVE-4513
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Logging
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, 
 HIVE-4513.4.patch


 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 a Hive query, such as the query string, plan, counters and MR job progress 
 information.
 There is no mechanism to delete these files, and as a result they accumulate 
 over time, using up a lot of disk space. 
 I don't think this is used by most people, so I think it would be better to 
 turn this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as the history logs.



[jira] [Updated] (HIVE-4513) disable hivehistory logs by default

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4513:
---

Status: Open  (was: Patch Available)

Canceling patch for now.

 disable hivehistory logs by default
 ---

 Key: HIVE-4513
 URL: https://issues.apache.org/jira/browse/HIVE-4513
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Logging
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, 
 HIVE-4513.4.patch


 HiveHistory log files (hive_job_log_hive_*.txt files) store information about 
 a Hive query, such as the query string, plan, counters and MR job progress 
 information.
 There is no mechanism to delete these files, and as a result they accumulate 
 over time, using up a lot of disk space. 
 I don't think this is used by most people, so I think it would be better to 
 turn this off by default. Jobtracker logs already capture most of this 
 information, though it is not as structured as the history logs.



[jira] [Updated] (HIVE-4343) HS2 with kerberos- local task for map join fails

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4343:
---

Status: Open  (was: Patch Available)

This will be redundant if we get in HIVE-4470

 HS2 with kerberos- local task for map join fails
 

 Key: HIVE-4343
 URL: https://issues.apache.org/jira/browse/HIVE-4343
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-4343.1.patch


 With hive server2 configured with kerberos security, when a (map) join query 
 is run, it results in failure with GSSException: No valid credentials 
 provided 



[jira] [Updated] (HIVE-4171) Current database in metastore.Hive is not consistent with SessionState

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4171:
---

Status: Open  (was: Patch Available)

Canceling patch for now.

 Current database in metastore.Hive is not consistent with SessionState
 --

 Key: HIVE-4171
 URL: https://issues.apache.org/jira/browse/HIVE-4171
 Project: Hive
  Issue Type: Bug
  Components: CLI
Reporter: Navis
Assignee: Thejas M Nair
  Labels: HiveServer2
 Attachments: HIVE-4171.3.patch, HIVE-4171.4.patch, 
 HIVE-4171.D9399.1.patch, HIVE-4171.D9399.2.patch


 metastore.Hive is a thread-local instance, which can have a different state 
 from SessionState. Currently the only state in metastore.Hive is the name of 
 the database in use.



[jira] [Commented] (HIVE-4611) SMB joins fail based on bigtable selection policy.

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673573#comment-13673573
 ] 

Ashutosh Chauhan commented on HIVE-4611:


[~vikram.dixit] Is this ready for review? If so, can you create a Phabricator 
or RB link and mark it Patch Available?

 SMB joins fail based on bigtable selection policy.
 --

 Key: HIVE-4611
 URL: https://issues.apache.org/jira/browse/HIVE-4611
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.11.1

 Attachments: HIVE-4611.patch


 The default setting for 
 hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the 
 big table as the one with largest average partition size. However, this can 
 result in a query failing because this policy conflicts with the big table 
 candidates chosen for outer joins. This policy should just be a tie breaker 
 and not have the ultimate say in the choice of tables.



[jira] [Updated] (HIVE-4502) NPE - subquery smb joins fails

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4502:
---

Assignee: Vikram Dixit K  (was: Navis)

 NPE - subquery smb joins fails
 --

 Key: HIVE-4502
 URL: https://issues.apache.org/jira/browse/HIVE-4502
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, 
 smb_mapjoin_25.q


 Found this issue while running some SMB joins. Attaching test case that 
 causes this error.



[jira] [Commented] (HIVE-4502) NPE - subquery smb joins fails

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673578#comment-13673578
 ] 

Ashutosh Chauhan commented on HIVE-4502:


[~navis] Would you like to take a look at Vikram's patch? I think if we can 
retain SMBJoin instead of converting them to reduce-side joins, that's better.

 NPE - subquery smb joins fails
 --

 Key: HIVE-4502
 URL: https://issues.apache.org/jira/browse/HIVE-4502
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, 
 smb_mapjoin_25.q


 Found this issue while running some SMB joins. Attaching test case that 
 causes this error.



[jira] [Updated] (HIVE-4649) Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation

2013-06-03 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4649:
---

Attachment: HIVE-4649.1.patch

Attached patch fixes the issue.

 Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation 
 -

 Key: HIVE-4649
 URL: https://issues.apache.org/jira/browse/HIVE-4649
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-4649.1.patch


 The test fails due to a bug in ColumnCompareScalar.txt



[jira] [Commented] (HIVE-3876) call resetValid instead of ensureCapacity in the constructor of BytesRefArrayWritable

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673590#comment-13673590
 ] 

Ashutosh Chauhan commented on HIVE-3876:


[~yhuai] Are you still working on this? Or shall we close this as Not A 
Problem?

 call resetValid instead of ensureCapacity in the constructor of 
 BytesRefArrayWritable
 -

 Key: HIVE-3876
 URL: https://issues.apache.org/jira/browse/HIVE-3876
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.10.0
Reporter: Yin Huai
Assignee: Yin Huai
Priority: Minor
 Attachments: HIVE-3876.1.patch.txt


 In the constructor of BytesRefArrayWritable, ensureCapacity(capacity) is 
 called, but valid has not been adjusted accordingly. After a new 
 BytesRefArrayWritable has been created with an initial capacity of x, if 
 resetValid() has not been called explicitly, the size returned is still 0.



[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673595#comment-13673595
 ] 

Ashutosh Chauhan commented on HIVE-4435:


Sorry for the delay. +1 Will commit if tests pass.

 Column stats: Distinct value estimator should use hash functions that are 
 pairwise independent
 --

 Key: HIVE-4435
 URL: https://issues.apache.org/jira/browse/HIVE-4435
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Shreepadma Venugopalan
 Attachments: chart_1(1).png, HIVE-4435.1.patch


 The current implementation of Flajolet-Martin estimator to estimate the 
 number of distinct values doesn't use hash functions that are pairwise 
 independent. This is problematic because the input values don't distribute 
 uniformly. When run on large TPC-H data sets, this leads to a huge 
 discrepancy for primary key columns. Primary key columns are typically a 
 monotonically increasing sequence.
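
 For readers unfamiliar with the term, one textbook way to build a pairwise 
 independent hash family is h(x) = (a*x + b) mod p, with p prime and a, b drawn 
 at random. The sketch below illustrates that construction only; it is not the 
 attached patch, and the class name, the choice of p = 2^31 - 1 and the 
 reduction of inputs modulo p are all assumptions made for the example.

```java
import java.util.Random;

class PairwiseHash {
    // 2^31 - 1, a Mersenne prime; a*x + b then fits comfortably in a long.
    static final long P = 2147483647L;

    private final long a; // multiplier in [1, P-1]
    private final long b; // offset in [0, P-1]

    PairwiseHash(Random rnd) {
        this.a = 1 + (long) rnd.nextInt((int) (P - 1));
        this.b = rnd.nextInt((int) P);
    }

    // Maps x to a value in [0, P). Inputs are first reduced modulo P.
    long hash(long x) {
        long xm = Math.floorMod(x, P);
        return (a * xm + b) % P;
    }
}
```

 Because a is nonzero modulo p, distinct inputs below p map to distinct outputs, 
 so a monotonically increasing key sequence is scattered across the range before 
 the estimator inspects the bit pattern.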



[jira] [Resolved] (HIVE-4649) Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4649.


   Resolution: Fixed
Fix Version/s: vectorization-branch

Committed to branch. Thanks, Jitendra!

 Unit test failure in TestColumnScalarOperationVectorExpressionEvaluation 
 -

 Key: HIVE-4649
 URL: https://issues.apache.org/jira/browse/HIVE-4649
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: vectorization-branch

 Attachments: HIVE-4649.1.patch


 The test fails due to a bug in ColumnCompareScalar.txt



[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent

2013-06-03 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673599#comment-13673599
 ] 

Shreepadma Venugopalan commented on HIVE-4435:
--

Thanks Ashutosh!

 Column stats: Distinct value estimator should use hash functions that are 
 pairwise independent
 --

 Key: HIVE-4435
 URL: https://issues.apache.org/jira/browse/HIVE-4435
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Shreepadma Venugopalan
 Attachments: chart_1(1).png, HIVE-4435.1.patch


 The current implementation of Flajolet-Martin estimator to estimate the 
 number of distinct values doesn't use hash functions that are pairwise 
 independent. This is problematic because the input values don't distribute 
 uniformly. When run on large TPC-H data sets, this leads to a huge 
 discrepancy for primary key columns. Primary key columns are typically a 
 monotonically increasing sequence.



[jira] [Commented] (HIVE-4535) hive build fails with hadoop 0.20

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673610#comment-13673610
 ] 

Hudson commented on HIVE-4535:
--

Integrated in Hive-trunk-hadoop2 #223 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4535 : hive build fails with hadoop 0.20 (Thejas Nair via Ashutosh 
Chauhan) (Revision 1488739)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488739
Files : 
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java


 hive build fails with hadoop 0.20
 -

 Key: HIVE-4535
 URL: https://issues.apache.org/jira/browse/HIVE-4535
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.12.0

 Attachments: HIVE-4535.1.patch, HIVE-4535.2.patch


 ant  package -Dhadoop.mr.rev=20
 leads to - 
 {code}
 [javac] 
 /Users/thejas/hive_thejas_git/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java:382:
  cannot find symbol
 [javac] symbol  : method 
 join(java.lang.String,java.util.List<java.lang.String>)
 [javac] location: class org.apache.hadoop.util.StringUtils
 [javac]   StringUtils.join(",", incompatibleCols)
 {code}
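
 The error above says that the Hadoop 0.20 StringUtils has no 
 join(String, List<String>) overload. One way to avoid depending on it is a 
 small local helper, sketched below. This is an illustration under assumptions, 
 not the committed fix: the class name JoinCompat is hypothetical, and the 
 actual patch may have solved the problem differently.

```java
import java.util.List;

class JoinCompat {
    // Joins the parts with the separator using only JDK classes,
    // so it compiles regardless of the Hadoop version on the classpath.
    static String join(String separator, List<String> parts) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String part : parts) {
            if (!first) {
                sb.append(separator);
            }
            sb.append(part);
            first = false;
        }
        return sb.toString();
    }
}
```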



[jira] [Commented] (HIVE-4562) HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673609#comment-13673609
 ] 

Hudson commented on HIVE-4562:
--

Integrated in Hive-trunk-hadoop2 #223 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars should 
be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan) (Revision 
1488744)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488744
Files : 
* /hive/trunk/ql/build.xml


 HIVE-3393 brought in Jackson library,and these four jars should be packed 
 into hive-exec.jar
 

 Key: HIVE-4562
 URL: https://issues.apache.org/jira/browse/HIVE-4562
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0, 0.11.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-4562-1.patch, HIVE-4562-2.patch


 Some Hive jars are required not only by the client but also by the server 
 (every Hadoop slave).
 We could use the 'add jar' command to add all the jars to the distributed cache,
 but the common approach is to add these jars to $HADOOP_HOME/lib/ on every slave 
 of the Hadoop cluster,
 which requires restarting all the tasktrackers.
 For example:
 When using Hive stats with MySQL as the temporary stats DB, every slave of the 
 Hadoop cluster should contain 
 mysql-connector-java-.jar in $HADOOP_HOME/lib/.
 For column stats, 
 $HADOOP_HOME/lib/ on all slaves should contain:
 jackson-core-asl-1.8.8.jar
 jackson-jaxrs-1.8.8.jar
 jackson-mapper-asl-1.8.8.jar
 jackson-xc-1.8.8.jar
 These jars should be separated from the other common client-side jars.



[jira] [Commented] (HIVE-4510) HS2 doesn't nest exceptions properly (fun debug times)

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673608#comment-13673608
 ] 

Hudson commented on HIVE-4510:
--

Integrated in Hive-trunk-hadoop2 #223 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas 
Nair via Ashutosh Chauhan) (Revision 1488740)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488740
Files : 
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
* 
/hive/trunk/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
* 
/hive/trunk/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java


 HS2 doesn't nest exceptions properly (fun debug times)
 --

 Key: HIVE-4510
 URL: https://issues.apache.org/jira/browse/HIVE-4510
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Reporter: Gunther Hagleitner
Assignee: Thejas M Nair
 Fix For: 0.12.0

 Attachments: HIVE-4510.1.patch, HIVE-4510.2.patch


 In SQLOperation.java lines 97 + 113 for instance, we catch errors and throw a 
 new HiveSQLException, but we don't wrap the original exception.
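
 The general pattern being asked for is to pass the caught exception as the 
 cause of the new one, so the original stack trace survives. The sketch below 
 uses a hypothetical QueryException class as a stand-in; it does not reproduce 
 HiveSQLException or the code in SQLOperation.java.

```java
class WrappedExceptionDemo {
    // Hypothetical stand-in for an API-level exception type.
    static class QueryException extends Exception {
        QueryException(String message, Throwable cause) {
            super(message, cause); // keeps the original exception as the cause
        }
    }

    // Runs the work and rethrows failures with the original exception attached.
    static void run(Runnable work) throws QueryException {
        try {
            work.run();
        } catch (RuntimeException e) {
            throw new QueryException("Error running query: " + e.getMessage(), e);
        }
    }
}
```

 With the cause attached, a caller that prints the stack trace sees a 
 "Caused by:" section pointing at the code that actually failed.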



[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673612#comment-13673612
 ] 

Ashutosh Chauhan commented on HIVE-4561:


[~shreepadma] Since you wrote this originally, would you like to review this as 
well?

 Column stats :  LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the 
 column values larger than 0.0 (or if all column values smaller than 0.0)
 

 Key: HIVE-4561
 URL: https://issues.apache.org/jira/browse/HIVE-4561
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: Zhuoluo (Clark) Yang
 Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch


 If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; 
 if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
 hive (default)> create table src_test (price double);
 hive (default)> load data local inpath './test.txt' into table src_test;
 hive (default)> select * from src_test;
 OK
 1.0
 2.0
 3.0
 Time taken: 0.313 seconds, Fetched: 3 row(s)
 hive (default)> analyze table src_test compute statistics for columns price;
 mysql> select * from TAB_COL_STATS \G;
                  CS_ID: 16
                DB_NAME: default
             TABLE_NAME: src_test
            COLUMN_NAME: price
            COLUMN_TYPE: double
                 TBL_ID: 2586
         LONG_LOW_VALUE: 0
        LONG_HIGH_VALUE: 0
       DOUBLE_LOW_VALUE: 0.   # Wrong Result ! Expected is 1.
      DOUBLE_HIGH_VALUE: 3.
  BIG_DECIMAL_LOW_VALUE: NULL
 BIG_DECIMAL_HIGH_VALUE: NULL
              NUM_NULLS: 0
          NUM_DISTINCTS: 1
            AVG_COL_LEN: 0.
            MAX_COL_LEN: 0
              NUM_TRUES: 0
             NUM_FALSES: 0
          LAST_ANALYZED: 1368596151
 2 rows in set (0.00 sec)
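
 A symptom like this typically appears when a running minimum or maximum starts 
 from 0 instead of from a sentinel or the first observed value. Whether that is 
 the actual cause in Hive is an assumption; the sketch below uses hypothetical 
 classes and is not taken from the attached patches.

```java
class ZeroSeededRange {
    // Starting both bounds at 0.0 means 0.0 always takes part in the comparison.
    double min = 0.0;
    double max = 0.0;

    void add(double v) {
        min = Math.min(min, v);
        max = Math.max(max, v);
    }
}

class SentinelSeededRange {
    // Starting from infinities lets the first observed value set both bounds.
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;

    void add(double v) {
        min = Math.min(min, v);
        max = Math.max(max, v);
    }
}
```

 Feeding 1.0, 2.0 and 3.0 to the zero-seeded version leaves min at 0.0, which 
 matches the reported output; the sentinel-seeded version ends with min 1.0 and 
 max 3.0.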



[jira] [Commented] (HIVE-4610) HCatalog checkstyle violation after HIVE-4578

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673605#comment-13673605
 ] 

Hudson commented on HIVE-4610:
--

Integrated in Hive-trunk-hadoop2 #223 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock Noland via 
Ashutosh Chauhan) (Revision 1488825)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488825
Files : 
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/default.res
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/windows.res


 HCatalog checkstyle violation after HIVE-4578
 -

 Key: HIVE-4610
 URL: https://issues.apache.org/jira/browse/HIVE-4610
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 0.12.0

 Attachments: HIVE-4610-0.patch


 {noformat}
 checkstyle:
  [echo] hcatalog
 [checkstyle] Running Checkstyle 5.5 on 413 files
 [checkstyle] 
 /home/brock/workspaces/hive-apache/hive/hcatalog/src/test/e2e/hcatalog/resource/default.res:1:
  Missing a header - not enough lines in file.
 [checkstyle] 
 /home/brock/workspaces/hive-apache/hive/hcatalog/src/test/e2e/hcatalog/resource/windows.res:1:
  Missing a header - not enough lines in file.
   [for] hcatalog: The following error occurred while executing this line:
   [for] /home/brock/workspaces/hive-apache/hive/build.xml:310: The 
 following error occurred while executing this line:
   [for] /home/brock/workspaces/hive-apache/hive/hcatalog/build.xml:109: 
 The following error occurred while executing this line:
   [for] 
 /home/brock/workspaces/hive-apache/hive/hcatalog/build-support/ant/checkstyle.xml:32:
  Got 2 errors and 0 warnings.
 {noformat}



[jira] [Commented] (HIVE-4636) Failing on TestSemanticAnalysis.testAddReplaceCols in trunk

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673607#comment-13673607
 ] 

Hudson commented on HIVE-4636:
--

Integrated in Hive-trunk-hadoop2 #223 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/223/])
HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk 
(Navis via Ashutosh Chauhan) (Revision 1488824)

 Result = ABORTED
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488824
Files : 
* 
/hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestSemanticAnalysis.java


 Failing on TestSemanticAnalysis.testAddReplaceCols in trunk
 ---

 Key: HIVE-4636
 URL: https://issues.apache.org/jira/browse/HIVE-4636
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 0.12.0
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-4636.D11013.1.patch


 Seems to be a regression from HIVE-4475. 



[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-03 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673616#comment-13673616
 ] 

Shreepadma Venugopalan commented on HIVE-4561:
--

[~ashutoshc]: Sure, I'll take a look at this today.

 Column stats :  LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the 
 column values larger than 0.0 (or if all column values smaller than 0.0)
 

 Key: HIVE-4561
 URL: https://issues.apache.org/jira/browse/HIVE-4561
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: Zhuoluo (Clark) Yang
 Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch


 If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; 
 if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
 hive (default)> create table src_test (price double);
 hive (default)> load data local inpath './test.txt' into table src_test;
 hive (default)> select * from src_test;
 OK
 1.0
 2.0
 3.0
 Time taken: 0.313 seconds, Fetched: 3 row(s)
 hive (default)> analyze table src_test compute statistics for columns price;
 mysql> select * from TAB_COL_STATS \G;
                  CS_ID: 16
                DB_NAME: default
             TABLE_NAME: src_test
            COLUMN_NAME: price
            COLUMN_TYPE: double
                 TBL_ID: 2586
         LONG_LOW_VALUE: 0
        LONG_HIGH_VALUE: 0
       DOUBLE_LOW_VALUE: 0.   # Wrong Result ! Expected is 1.
      DOUBLE_HIGH_VALUE: 3.
  BIG_DECIMAL_LOW_VALUE: NULL
 BIG_DECIMAL_HIGH_VALUE: NULL
              NUM_NULLS: 0
          NUM_DISTINCTS: 1
            AVG_COL_LEN: 0.
            MAX_COL_LEN: 0
              NUM_TRUES: 0
             NUM_FALSES: 0
          LAST_ANALYZED: 1368596151
 2 rows in set (0.00 sec)



[jira] [Commented] (HIVE-4547) A complex create view statement fails with new Antlr 3.4

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673619#comment-13673619
 ] 

Ashutosh Chauhan commented on HIVE-4547:


[~thiruvel] Would you like to review this, since you wrote this piece 
originally?

 A complex create view statement fails with new Antlr 3.4
 

 Key: HIVE-4547
 URL: https://issues.apache.org/jira/browse/HIVE-4547
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4547-1.patch, HIVE-4547-repro.tar


 A complex create view statement with CAST in join condition fails with 
 IllegalArgumentException error. This is exposed by the Antlr 3.4 upgrade 
 (HIVE-2439). The same statement works fine with Hive 0.9



[jira] [Resolved] (HIVE-3876) call resetValid instead of ensureCapacity in the constructor of BytesRefArrayWritable

2013-06-03 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai resolved HIVE-3876.


Resolution: Not A Problem

Sorry for not looking at it for a long time. I just took a look at the code. 
BytesRefArrayWritable is used by first calling ensureCapacity and then setting 
valid through resetValid or set. If we called resetValid in the constructor, 
callers could get elements which are not valid, which should not be allowed. 
Let's close it as Not A Problem.
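
For readers following the archive, the capacity-versus-valid distinction can be sketched as below. This is only a toy model in Python, not Hive's Java code: the class name RefArray and its snake_case methods are made up for illustration, and the behaviour of set is an assumption based on the description in this thread.

```python
class RefArray:
    """Toy stand-in for an array that tracks capacity and valid size separately."""

    def __init__(self, capacity=0):
        self._slots = []
        self.valid = 0  # number of slots the caller has actually made readable
        self.ensure_capacity(capacity)

    def ensure_capacity(self, capacity):
        # Grow the backing storage only; this never changes `valid`.
        while len(self._slots) < capacity:
            self._slots.append(None)

    def reset_valid(self, new_valid):
        # Declare the first `new_valid` slots readable.
        self.ensure_capacity(new_valid)
        self.valid = new_valid

    def set(self, index, value):
        # Store a value and extend the valid range to cover it.
        self.ensure_capacity(index + 1)
        self._slots[index] = value
        self.valid = max(self.valid, index + 1)

    def size(self):
        return self.valid


arr = RefArray(capacity=4)
size_after_construction = arr.size()  # storage is reserved, nothing is valid yet
arr.set(1, b"x")
size_after_set = arr.size()           # slots 0..1 now count as valid
```

In this model a freshly constructed array reports size 0 even though four slots are reserved, which matches the behaviour the issue describes and that the comment above argues is intended.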

 call resetValid instead of ensureCapacity in the constructor of 
 BytesRefArrayWritable
 -

 Key: HIVE-3876
 URL: https://issues.apache.org/jira/browse/HIVE-3876
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.10.0
Reporter: Yin Huai
Assignee: Yin Huai
Priority: Minor
 Attachments: HIVE-3876.1.patch.txt


 In the constructor of BytesRefArrayWritable, ensureCapacity(capacity) is 
 called, but valid has not been adjusted accordingly. After a new 
 BytesRefArrayWritable has been created with an initial capacity of x, if 
 resetValid() has not been called explicitly, the size returned is still 0.



[jira] [Commented] (HIVE-4245) Implement numeric dictionaries in ORC

2013-06-03 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673643#comment-13673643
 ] 

Owen O'Malley commented on HIVE-4245:
-

Pam, have you had a chance to work on this?

 Implement numeric dictionaries in ORC
 -

 Key: HIVE-4245
 URL: https://issues.apache.org/jira/browse/HIVE-4245
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Pamela Vagata

 For many applications, especially in de-normalized data, there is a lot of 
 redundancy in the numeric columns. Therefore, it would make sense to 
 adaptively use dictionary encodings for numeric columns in addition to string 
 columns.



[jira] [Commented] (HIVE-4546) Hive CLI leaves behind the per session resource directory on non-interactive invocation

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673644#comment-13673644
 ] 

Ashutosh Chauhan commented on HIVE-4546:


+1

 Hive CLI leaves behind the per session resource directory on non-interactive 
 invocation
 ---

 Key: HIVE-4546
 URL: https://issues.apache.org/jira/browse/HIVE-4546
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Attachments: HIVE-4546-1.patch, HIVE-4546-2.patch


 As part of HIVE-4505, the resource directory is set to 
 /tmp/${hive.session.id}_resources and is supposed to be removed at the end. The 
 CLI fails to remove it when invoked using -f or -e (non-interactive mode).



Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces

2013-06-03 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11335/
---

(Updated June 3, 2013, 10:18 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

1. Added data input file to the new test case that was missing from previous 
patch.
2. Please note that review board doesn't show the added data file name 
correctly because of the space in it. However, applying the patch to the code 
base has no issue.


Description
---

Patch includes fix and new test case.


This addresses bug HIVE-4554.
https://issues.apache.org/jira/browse/HIVE-4554


Diffs (updated)
-

  data/files/person PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java bd8d252 
  ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/11335/diff/


Testing
---


Thanks,

Xuefu Zhang



[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces

2013-06-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-4554:
--

Attachment: HIVE-4554.patch.3

HIVE-4554.patch.3 is the same as HIVE-4554.patch.2 except that it includes the 
data input file for the new test case, which was missing.

All test cases passed.

RB request is here: https://reviews.apache.org/r/11335/

 Failed to create a table from existing file if file path has spaces
 ---

 Key: HIVE-4554
 URL: https://issues.apache.org/jira/browse/HIVE-4554
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, 
 HIVE-4554.patch.3


 To reproduce the problem,
 1. Create a table, say, person_age (name STRING, age INT).
 2. Create a file whose name has a space in it, say, data set.txt.
 3. Try to load the data in the file to the table.
 The following error can be seen in the console:
 hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
 Loading data to table default.person_age
 Failed with exception Wrong file format. Please check the file's format.
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MoveTask
 Note: the error message is confusing.
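
A note for archive readers: one common way a space in a file name goes wrong is when a path is converted to a URI and the percent-encoded form is later treated as a literal file name. Whether that is the actual cause in LoadSemanticAnalyzer is not stated in this thread, so the Python sketch below only illustrates the general pitfall, using the path from the report.

```python
from urllib.parse import quote, unquote

# Path taken from the reproduction steps above.
local_path = "/home/xzhang/temp/data set.txt"

# Turning the path into a URI path percent-encodes the space.
encoded = quote(local_path)

# Decoding restores the original name; skipping this step leaves a
# name that does not exist on disk.
decoded = unquote(encoded)

# True when the encoded form would be looked up as a different file.
looks_like_other_file = encoded != local_path
```

If code compares or opens the encoded string instead of the decoded one, the lookup fails even though the file exists, which would be consistent with a misleading "Wrong file format" style of error.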



[jira] [Updated] (HIVE-4554) Failed to create a table from existing file if file path has spaces

2013-06-03 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-4554:
--

Status: Patch Available  (was: Open)

 Failed to create a table from existing file if file path has spaces
 ---

 Key: HIVE-4554
 URL: https://issues.apache.org/jira/browse/HIVE-4554
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, 
 HIVE-4554.patch.3


 To reproduce the problem,
 1. Create a table, say, person_age (name STRING, age INT).
 2. Create a file whose name has a space in it, say, data set.txt.
 3. Try to load the data in the file to the table.
 The following error can be seen in the console:
 hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
 Loading data to table default.person_age
 Failed with exception Wrong file format. Please check the file's format.
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MoveTask
 Note: the error message is confusing.



Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #165

2013-06-03 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/165/

--
[...truncated 42329 lines...]
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2013-06-03 15:40:56,227 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] Execution completed successfully
[junit] Mapred Local Task Succeeded . Convert the Join into MapJoin
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-06-03_15-40-53_015_7476949543543173382/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201306031540_2137645541.txt
[junit] Copying file: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] Table default.testhivedrivertable stats: [num_partitions: 0, 
num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0]
[junit] POSTHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-06-03_15-40-57_529_6053755246777448667/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-06-03_15-40-57_529_6053755246777448667/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201306031540_79631070.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable

[jira] [Commented] (HIVE-4547) A complex create view statement fails with new Antlr 3.4

2013-06-03 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673739#comment-13673739
 ] 

Thiruvel Thirumoolan commented on HIVE-4547:


Sure, will take a look.

 A complex create view statement fails with new Antlr 3.4
 

 Key: HIVE-4547
 URL: https://issues.apache.org/jira/browse/HIVE-4547
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4547-1.patch, HIVE-4547-repro.tar


 A complex create view statement with CAST in the join condition fails with 
 an IllegalArgumentException. This is exposed by the Antlr 3.4 upgrade 
 (HIVE-2439). The same statement works fine with Hive 0.9.



Re: Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces

2013-06-03 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11335/#review21366
---


Patch looks good, apart from one comment.


ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
https://reviews.apache.org/r/11335/#comment44301

Apart from this change, all other changes are contained within the if(isLocal) 
block. Because of this, it seems possible that this change might be triggered for 
non-local paths as well. Can you test it with an hdfs:// path that has spaces? If 
it's easy, it would be good to add that to the test; otherwise a manual test is 
fine as well.


- Ashutosh Chauhan


On June 3, 2013, 10:18 p.m., Xuefu Zhang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11335/
 ---
 
 (Updated June 3, 2013, 10:18 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Description
 ---
 
 Patch includes fix and new test case.
 
 
 This addresses bug HIVE-4554.
 https://issues.apache.org/jira/browse/HIVE-4554
 
 
 Diffs
 -
 
   data/files/person PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
 bd8d252 
   ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q 
 PRE-CREATION 
   ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/11335/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Xuefu Zhang
 




[jira] [Commented] (HIVE-4554) Failed to create a table from existing file if file path has spaces

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673759#comment-13673759
 ] 

Ashutosh Chauhan commented on HIVE-4554:


Comment on RB.

 Failed to create a table from existing file if file path has spaces
 ---

 Key: HIVE-4554
 URL: https://issues.apache.org/jira/browse/HIVE-4554
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-4554.patch, HIVE-4554.patch.1, HIVE-4554.patch.2, 
 HIVE-4554.patch.3


 To reproduce the problem,
 1. Create a table, say, person_age (name STRING, age INT).
 2. Create a file whose name has a space in it, say, data set.txt.
 3. Try to load the data in the file to the table.
 The following error can be seen in the console:
 hive> LOAD DATA INPATH '/home/xzhang/temp/data set.txt' INTO TABLE person_age;
 Loading data to table default.person_age
 Failed with exception Wrong file format. Please check the file's format.
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MoveTask
 Note: the error message is confusing.



[jira] [Commented] (HIVE-4566) NullPointerException if typeinfo and nativesql commands are executed at beeline before a DB connection is established

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673785#comment-13673785
 ] 

Ashutosh Chauhan commented on HIVE-4566:


As in the original description, I like the idea of printing "No current 
connection" in such scenarios, but I don't think the current patch prints it. 
Can you modify your test to make sure that it indeed gets printed?

 NullPointerException if typeinfo and nativesql commands are executed at 
 beeline before a DB connection is established
 -

 Key: HIVE-4566
 URL: https://issues.apache.org/jira/browse/HIVE-4566
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-4566.patch


 Before a DB connection is established, executing a command such as typeinfo 
 or nativesql results in an NPE shown at the console:
 beeline> !typeinfo
 java.lang.NullPointerException
 beeline> !nativesql
 java.lang.NullPointerException
 Instead, a message such as "No current connection" should be given, as in 
 the case of some other commands, such as dropall.
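
The requested behaviour amounts to a guard before the connection is used. The Python sketch below is only an illustration of that guard; Shell, typeinfo and type_info are hypothetical names and do not correspond to Beeline's actual Java classes or methods.

```python
class Shell:
    """Hypothetical command shell; not Beeline's real implementation."""

    def __init__(self):
        self.connection = None  # no DB connection has been established yet

    def typeinfo(self):
        # Guard first, so a missing connection yields a message rather than
        # an attribute error (the Python analogue of the reported NPE).
        if self.connection is None:
            return "No current connection"
        return self.connection.type_info()


message = Shell().typeinfo()
```

A test along the lines requested in the comment above would then assert on the returned or printed message rather than only on the absence of an exception.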



[jira] [Updated] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-06-03 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4620:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk, thanks Prasad!

 MR temp directory conflicts in case of parallel execution mode
 --

 Key: HIVE-4620
 URL: https://issues.apache.org/jira/browse/HIVE-4620
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch


 In parallel query execution mode, all the tasks running in parallel end up 
 sharing the same temp/scratch directory. This could lead to file conflicts 
 and to temp files getting deleted before the job completes.
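
The thread does not describe how the committed patch resolves the conflict, so the following Python sketch shows just one general remedy under that caveat: give each parallel task its own subdirectory under the shared scratch root. The function run_task and the file name part-00000 are illustrative assumptions.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def run_task(scratch_root, task_id):
    # mkdtemp creates a fresh, uniquely named directory, so two tasks
    # never write to (or clean up) the same location.
    task_dir = tempfile.mkdtemp(prefix=f"task-{task_id}-", dir=scratch_root)
    with open(os.path.join(task_dir, "part-00000"), "w") as out:
        out.write(str(task_id))
    return task_dir


with tempfile.TemporaryDirectory() as scratch_root:
    with ThreadPoolExecutor(max_workers=4) as pool:
        task_dirs = list(pool.map(lambda i: run_task(scratch_root, i), range(4)))
    distinct_dirs = len(set(task_dirs))
    files_written = sum(
        os.path.exists(os.path.join(d, "part-00000")) for d in task_dirs
    )
```

Every task writes a file with the same name, yet nothing collides because the parent directories differ.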



[jira] [Commented] (HIVE-4620) MR temp directory conflicts in case of parallel execution mode

2013-06-03 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673853#comment-13673853
 ] 

Prasad Mujumdar commented on HIVE-4620:
---

Thanks Navis!

 MR temp directory conflicts in case of parallel execution mode
 --

 Key: HIVE-4620
 URL: https://issues.apache.org/jira/browse/HIVE-4620
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.12.0

 Attachments: HIVE-4620-1.patch, HIVE-4620-2.patch, HIVE-4620-3.patch


 In parallel query execution mode, all the tasks running in parallel end up 
 sharing the same temp/scratch directory. This could lead to file conflicts 
 and to temp files getting deleted before the job completes.



[jira] [Commented] (HIVE-4502) NPE - subquery smb joins fails

2013-06-03 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673856#comment-13673856
 ] 

Navis commented on HIVE-4502:
-

I've not converted any SMB joins to RS joins; I just changed the order in which 
they are created. The difference is that my patch adds a root task only when all 
of the join aliases are handled, whereas trunk adds a root whenever possible and 
removes it afterwards if it turns out not to be one. The patch I've attached 
seemed easier, but that is just my call.

 NPE - subquery smb joins fails
 --

 Key: HIVE-4502
 URL: https://issues.apache.org/jira/browse/HIVE-4502
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, 
 smb_mapjoin_25.q


 Found this issue while running some SMB joins. Attaching test case that 
 causes this error.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 , if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-03 Thread Shreepadma Venugopalan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11172/#review21391
---

Ship it!


Ship It!

- Shreepadma Venugopalan


On June 3, 2013, 4:46 a.m., Zhuoluo Yang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/11172/
 ---
 
 (Updated June 3, 2013, 4:46 a.m.)
 
 
 Review request for hive, Carl Steinbach, Carl Steinbach, Ashutosh Chauhan, 
 and fangkun cao.
 
 
 Description
 ---
 
 An initialization error.
 Make double and long initialize correctly.
 Would you review that and assign the issue to me?
 
 
 This addresses bug HIVE-4561.
 https://issues.apache.org/jira/browse/HIVE-4561
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java
  1488823 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/compute_stats_long.q.out
  1488823 
 
 Diff: https://reviews.apache.org/r/11172/diff/
 
 
 Testing
 ---
 
 ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_long.q
 ant test -Dtestcase=TestCliDriver -Dqfile=compute_stats_double.q
 
 done.
 
 
 Thanks,
 
 Zhuoluo Yang
 




[jira] [Commented] (HIVE-4561) Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000 ,if all the column values larger than 0.0 (or if all column values smaller than 0.0)

2013-06-03 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673887#comment-13673887
 ] 

Shreepadma Venugopalan commented on HIVE-4561:
--

LGTM! +1 (non-binding).

 Column stats : LOW_VALUE (or HIGH_VALUE) will always be 0.0000, if all the 
 column values larger than 0.0 (or if all column values smaller than 0.0)
 

 Key: HIVE-4561
 URL: https://issues.apache.org/jira/browse/HIVE-4561
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.12.0
Reporter: caofangkun
Assignee: Zhuoluo (Clark) Yang
 Attachments: HIVE-4561.1.patch, HIVE-4561.2.patch


 If all column values are larger than 0.0, DOUBLE_LOW_VALUE will always be 0.0; 
 if all column values are less than 0.0, DOUBLE_HIGH_VALUE will always be 0.0.
 hive (default)> create table src_test (price double);
 hive (default)> load data local inpath './test.txt' into table src_test;
 hive (default)> select * from src_test;
 OK
 1.0
 2.0
 3.0
 Time taken: 0.313 seconds, Fetched: 3 row(s)
 hive (default)> analyze table src_test compute statistics for columns price;
 mysql> select * from TAB_COL_STATS \G;
                  CS_ID: 16
                DB_NAME: default
             TABLE_NAME: src_test
            COLUMN_NAME: price
            COLUMN_TYPE: double
                 TBL_ID: 2586
         LONG_LOW_VALUE: 0
        LONG_HIGH_VALUE: 0
       DOUBLE_LOW_VALUE: 0.   # Wrong Result ! Expected is 1.
      DOUBLE_HIGH_VALUE: 3.
  BIG_DECIMAL_LOW_VALUE: NULL
 BIG_DECIMAL_HIGH_VALUE: NULL
              NUM_NULLS: 0
          NUM_DISTINCTS: 1
            AVG_COL_LEN: 0.
            MAX_COL_LEN: 0
              NUM_TRUES: 0
             NUM_FALSES: 0
          LAST_ANALYZED: 1368596151
 2 rows in set (0.00 sec)
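
The symptom above is what a running minimum and maximum produce when both start at zero. The Python sketch below reproduces that effect and shows one conventional remedy (starting from +infinity and -infinity). It is an illustration only: the function names are made up, and the thread does not say that the attached patch uses this exact approach.

```python
import math


def low_high_zero_init(values):
    # Starting both trackers at 0.0 clamps the result: the low value can
    # never rise above 0.0 and the high value can never fall below 0.0.
    low, high = 0.0, 0.0
    for v in values:
        low = min(low, v)
        high = max(high, v)
    return low, high


def low_high_inf_init(values):
    # Starting from the extremes lets the first value replace both trackers.
    # Note: an empty input returns (inf, -inf) and needs separate handling.
    low, high = math.inf, -math.inf
    for v in values:
        low = min(low, v)
        high = max(high, v)
    return low, high


prices = [1.0, 2.0, 3.0]           # the rows loaded into src_test above
zero_init_result = low_high_zero_init(prices)
inf_init_result = low_high_inf_init(prices)
```

With the three rows from the report, the zero-initialised version yields a low value of 0.0, matching the wrong DOUBLE_LOW_VALUE shown, while the other version yields 1.0.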



Re: ask for cube

2013-06-03 Thread 杨卓荦
Hi, Guowp

Maybe the following wiki helps you. Hive supports cube after Hive-0.10.0
https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup
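
To make the difference between the two syntaxes in the question quoted below concrete, here is a small Python sketch that enumerates the grouping sets each form stands for. It only models the standard SQL meaning of CUBE; the helper name cube is made up for the illustration, and the column names are copied from the question.

```python
from itertools import combinations


def cube(columns):
    """Return every subset of `columns`, largest first, as CUBE groups by."""
    return [
        tuple(subset)
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]


# Oracle style: GROUP BY CUBE(colum1, colum2), colum3
# colum3 is present in every grouping set.
partial_cube = [subset + ("colum3",) for subset in cube(["colum1", "colum2"])]

# Hive style: GROUP BY colum1, colum2, colum3 WITH CUBE
# every subset of the three columns is a grouping set.
full_cube = cube(["colum1", "colum2", "colum3"])
```

The partial form yields 4 grouping sets and the full form 8, so the two queries are not equivalent. The GROUPING SETS clause described on the wiki page above may be a way to list the four sets explicitly.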


Cheers,
Zhuoluo (Clark) Yang


2013/5/30 guowp gu...@asiainfo-linkage.com

 Sorry, "cobe" should be “cube”; the mistake was due to my carelessness.



 From: guowp [mailto:gu...@asiainfo-linkage.com]
 Sent: May 30, 2013, 11:43
 To: 'dev@hive.apache.org'
 Subject: ask for cobe



 Hi sir, I am a developer at Asiainfo Linkage. Oracle supports the SQL
 “select … … group by cube (colum1,colum2),colum3”,

 but in Hive we can only use the Hive SQL “select … … group by
 colum1,colum2,colum3 with cube”.

 I want to know:

 1. Whether Hive will support cube syntax like “group by cube (colum1,
 colum2),colum3”.

 2. If Hive will support it, what is the plan, and in which version will it work?


 Thanks  , With best  wishes .

   By Guowp , Asiainfo Linkage












[jira] [Updated] (HIVE-4032) Inserting data into Hive table from a query, when the query is a partitioned table and select * ,will generate a SemanticException

2013-06-03 Thread caofangkun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

caofangkun updated HIVE-4032:
-

Assignee: caofangkun

 Inserting data into Hive table from a query, when the query is a partitioned 
 table and select * ,will generate a SemanticException
 --

 Key: HIVE-4032
 URL: https://issues.apache.org/jira/browse/HIVE-4032
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
 Environment: Apache Hadoop 0.19.1  + Apache Hive 0.10.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
 Attachments: HIVE-4032-1.patch


 Inserting data into a Hive table from a query, when the query is: select * 
 from a_partitioned_table, will throw a SemanticException.
 It seems that * contains the virtual partition columns.
 drop table if exists zr_test;
 create table if not exists zr_test (key string, value string) partitioned by 
 (dt string);
 drop table if exists zr_test_1;
 create table if not exists zr_test_1 (key string, value string) partitioned 
 by (dt string);
 --Query One 
 explain
 insert into table zr_test partition (dt='20130217') select key, value from 
 zr_test_1 where dt='20130217';
 --Query Two
 explain
 insert into table zr_test partition (dt='20130217') select * from zr_test_1 
 where dt='20130217';
 Query One works well, but Query Two fails with the following information:
 FAILED: SemanticException [Error 10044]: Line 2:18 Cannot insert into target 
 table because column number/types are different ''20130217'': Table 
 insclause-0 has 2 columns, but query has 3 columns.
 p.s:
 Query Two works well on Apache Hadoop 0.20.1 + Hive 0.10.0
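
The column counts in the error message fit the reporter's guess that * expands to the partition column as well. The Python sketch below models only that counting, under the assumption that the guess is right; the helper names are made up and this is not Hive's semantic analyzer.

```python
def expand_star(data_columns, partition_columns):
    # Assumed behaviour: SELECT * on a partitioned table returns the data
    # columns followed by the partition columns.
    return list(data_columns) + list(partition_columns)


def insert_target_columns(data_columns):
    # With a static partition spec such as partition (dt='20130217'), the
    # partition value comes from the spec, so only data columns are expected.
    return list(data_columns)


data_columns = ["key", "value"]
partition_columns = ["dt"]

query_one = ["key", "value"]                              # select key, value
query_two = expand_star(data_columns, partition_columns)  # select *
target = insert_target_columns(data_columns)

query_one_matches = len(query_one) == len(target)
query_two_matches = len(query_two) == len(target)
```

Under this model Query One supplies 2 columns for a 2-column target, while Query Two supplies 3, which is the mismatch the SemanticException reports.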



[jira] [Updated] (HIVE-4502) NPE - subquery smb joins fails

2013-06-03 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4502:
-

Assignee: Navis  (was: Vikram Dixit K)

 NPE - subquery smb joins fails
 --

 Key: HIVE-4502
 URL: https://issues.apache.org/jira/browse/HIVE-4502
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Navis
 Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, 
 smb_mapjoin_25.q


 Found this issue while running some SMB joins. Attaching test case that 
 causes this error.



[jira] [Updated] (HIVE-4502) NPE - subquery smb joins fails

2013-06-03 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-4502:
-

Attachment: smb_mapjoin_25.q

 NPE - subquery smb joins fails
 --

 Key: HIVE-4502
 URL: https://issues.apache.org/jira/browse/HIVE-4502
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Navis
 Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, 
 smb_mapjoin_25.q, smb_mapjoin_25.q


 Found this issue while running some SMB joins. Attaching test case that 
 causes this error.



[jira] [Commented] (HIVE-4502) NPE - subquery smb joins fails

2013-06-03 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673946#comment-13673946
 ] 

Vikram Dixit K commented on HIVE-4502:
--

[~navis] I misread the results of the test case from your patch. I was going 
through your patch more meticulously and found that a few of the tests have 
different results. Particularly those in auto_sortmerge_join_6.q. The count 
results seem to have changed.

HIVE-3891 converts SMB joins to map-joins when possible. Although that seems 
orthogonal to this change, any idea as to why the join is still SMB?

Also attached a few more tests for this. The plans seem valid after applying 
your patch. I will continue to review the patch.

 NPE - subquery smb joins fails
 --

 Key: HIVE-4502
 URL: https://issues.apache.org/jira/browse/HIVE-4502
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Navis
 Attachments: HIVE-4502-1.patch, HIVE-4502.D10695.1.patch, 
 smb_mapjoin_25.q, smb_mapjoin_25.q


 Found this issue while running some SMB joins. Attaching test case that 
 causes this error.



[jira] [Created] (HIVE-4650) Getting Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after upgrade to Hive-0.11.0.x from hive-0.10.0.x

2013-06-03 Thread Bruce Nelson (JIRA)
Bruce Nelson created HIVE-4650:
--

 Summary: Getting Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after 
upgrade to Hive-0.11.0.x from hive-0.10.0.x
 Key: HIVE-4650
 URL: https://issues.apache.org/jira/browse/HIVE-4650
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: HortonWorks 1.3 distro on x86_64 Centos 6 
Reporter: Bruce Nelson


Working from a simple table in Hive:

hive> desc cmnt;
OK
x1  int None
x2  int None
x3  int None
x4  int None
y   double  None 

hive> select * from cmnt;
OK
7   26  6   60  78.5
1   29  15  52  74.3
11  56  8   20  104.3
11  31  8   47  87.6
7   52  6   33  95.9
11  55  9   22  109.2
3   71  17  6   102.7
1   31  22  44  72.5
2   54  18  22  93.1
21  47  4   26  115.9
1   40  23  34  83.8
11  66  9   12  113.3
10  68  8   12  109.4

A query that joins and transforms against this table : 

select * from (select VAL001 x1,VAL002 x2,VAL003 x3,VAL004 x4,VAL005 y from ( 
select /*+ mapjoin(v2) */ (VAL001- mu1) * 1/(sd1) VAL001,(VAL002- mu2) * 
1/(sd2) VAL002,(VAL003- mu3) * 1/(sd3) VAL003,(VAL004- mu4) * 1/(sd4) 
VAL004,(VAL005- mu5) * 1/(sd5) VAL005 from ( select * from ( select x1 
VAL001,x2 VAL002,x3 VAL003,x4 VAL004,y VAL005 from cmnt ) obj1_3 ) v3 join 
(select count(*) c, avg(VAL001) mu1,avg(VAL002) mu2,avg(VAL003) mu3,avg(VAL004) 
mu4,avg(VAL005) mu5, stddev_pop(VAL001) sd1,stddev_pop(VAL002) 
sd2,stddev_pop(VAL003) sd3,stddev_pop(VAL004) sd4,stddev_pop(VAL005) sd5 from ( 
select * from ( select x1 VAL001,x2 VAL002,x3 VAL003,x4 VAL004,y VAL005 from 
cmnt ) obj1_3 ) v1) v2 ) obj1_7) obj1_6 ;

Generates during Stage-3 : 
setting HADOOP_USER_NAME test
Execution log at: /tmp/test/.log
2013-06-03 12:40:55 Starting to launch local task to process map join;  
maximum memory = 1065484288
2013-06-03 12:40:56 Processing rows:1   Hashtable size: 1   
Memory usage:   7175528   rate:   0.007
2013-06-03 12:40:56 Dump the hashtable into file: 
file:/tmp/test/hive_2013-06-03_00-40-21_708_6820064283161196136/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable
2013-06-03 12:40:56 Upload 1 File to: 
file:/tmp/test/hive_2013-06-03_00-40-21_708_6820064283161196136/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable
 File size: 334
2013-06-03 12:40:56 End of local task; Time Taken: 0.726 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 2 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201306022123_0045, Tracking URL = 
http://sun1vm3:50030/jobdetails.jsp?jobid=job_201306022123_0045
Kill Command = /usr/lib/hadoop/libexec/../bin/hadoop job  -kill 
job_201306022123_0045
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2013-06-03 00:41:05,895 Stage-3 map = 0%,  reduce = 0%
2013-06-03 00:41:40,687 Stage-3 map = 100%,  reduce = 100%
Ended Job = job_201306022123_0045 with errors
Error during job, obtaining debugging information...
Job Tracking URL: 
http://sun1vm3:50030/jobdetails.jsp?jobid=job_201306022123_0045
Examining task ID: task_201306022123_0045_m_02 (and more) from job 
job_201306022123_0045

Task with the most failures(4): 
-
Task ID:
  task_201306022123_0045_m_00

URL:
  
http://sun1vm3:50030/taskdetails.jsp?jobid=job_201306022123_0045&tipid=task_201306022123_0045_m_00
-
Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: 

[jira] [Commented] (HIVE-4650) Getting Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after upgrade to Hive-0.11.0.x from hive-0.10.0.x

2013-06-03 Thread Bruce Nelson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673947#comment-13673947
 ] 

Bruce Nelson commented on HIVE-4650:


If hive.auto.convert.join=false is set, then all the query stages work OK. 
The same scenario worked OK in Hive-0.10.0.x and Hive-0.9.x with MapJoin 
working.

 Getting Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask on auto convert to MapJoin after 
 upgrade to Hive-0.11.0.x from hive-0.10.0.x
 --

 Key: HIVE-4650
 URL: https://issues.apache.org/jira/browse/HIVE-4650
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
 Environment: HortonWorks 1.3 distro on x86_64 Centos 6 
Reporter: Bruce Nelson


[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent

2013-06-03 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673986#comment-13673986
 ] 

Ashutosh Chauhan commented on HIVE-4435:


Following tests failed: 
* compute_stats_double.q
* compute_stats_long.q
* compute_stats_string.q

I am assuming since we have better estimates now, we just need to update .q.out 
files for these. Can you verify and if so can you update the patch with it?

 Column stats: Distinct value estimator should use hash functions that are 
 pairwise independent
 --

 Key: HIVE-4435
 URL: https://issues.apache.org/jira/browse/HIVE-4435
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Shreepadma Venugopalan
 Attachments: chart_1(1).png, HIVE-4435.1.patch


 The current implementation of Flajolet-Martin estimator to estimate the 
 number of distinct values doesn't use hash functions that are pairwise 
 independent. This is problematic because the input values don't distribute 
 uniformly. When run on large TPC-H data sets, this leads to a huge 
 discrepancy for primary key columns. Primary key columns are typically a 
 monotonically increasing sequence.
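
 For readers unfamiliar with the term, here is a minimal Java sketch of a 
 pairwise-independent hash family, h(x) = ((a*x + b) mod p) mod m. It is not 
 the code from the attached patch. The class name PairwiseHashSketch, the 
 choice of prime p = 2^31 - 1 and the bucket count m are assumptions made for 
 illustration. The map x -> (a*x + b) mod p is pairwise independent for keys 
 in [0, p); the final "mod m" makes the result only approximately uniform.

```java
import java.util.Random;

/** Illustrative sketch only: h(x) = ((a*x + b) mod P) mod m. */
final class PairwiseHashSketch {
    // Assumed prime modulus (the Mersenne prime 2^31 - 1).
    static final long P = 2147483647L;

    private final long a; // multiplier, 1 <= a < P
    private final long b; // offset, 0 <= b < P
    private final int m;  // number of output buckets

    PairwiseHashSketch(long a, long b, int m) {
        if (a < 1 || a >= P || b < 0 || b >= P || m < 1) {
            throw new IllegalArgumentException("need 1 <= a < P, 0 <= b < P, m >= 1");
        }
        this.a = a;
        this.b = b;
        this.m = m;
    }

    /** Draws a and b at random, which selects one function from the family. */
    static PairwiseHashSketch random(Random rng, int m) {
        long a = 1L + rng.nextInt((int) (P - 1)); // uniform in [1, P-1]
        long b = rng.nextInt((int) P);            // uniform in [0, P-1]
        return new PairwiseHashSketch(a, b, m);
    }

    /** Hashes a key into [0, m). Keys outside [0, P) are first reduced mod P. */
    int hash(long x) {
        long reduced = Math.floorMod(x, P);
        // a * reduced < 2^62 and b < 2^31, so the sum cannot overflow a long.
        return (int) (((a * reduced + b) % P) % m);
    }

    public static void main(String[] args) {
        PairwiseHashSketch h = random(new Random(), 64);
        // A monotonically increasing key sequence, as with primary keys.
        for (long key = 1; key <= 5; key++) {
            System.out.println(key + " -> " + h.hash(key));
        }
    }
}
```

 Drawing a fresh (a, b) pair per bit vector gives each vector its own 
 function from the family, so sequential keys are spread differently in each.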



[jira] [Updated] (HIVE-4418) TestNegativeCliDriver failure message if cmd succeeds is misleading

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4418:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Thejas!

 TestNegativeCliDriver failure message if cmd succeeds is misleading
 ---

 Key: HIVE-4418
 URL: https://issues.apache.org/jira/browse/HIVE-4418
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.10.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.12.0

 Attachments: HIVE-4418.1.patch


 If the .q test ends up succeeding (exit code == 0), then the test failure 
 message is misleading.
 From the error it seems as if the command actually failed -
 {code}
 [junit] junit.framework.AssertionFailedError: Client Execution failed 
 with error code = 0
 [junit] See build/ql/tmp/hive.log, or try ant test ... 
 -Dtest.silent=false to get more logs.
 [junit] at junit.framework.Assert.fail(Assert.java:47)
 [junit] at 
 org.apache.hadoop.hive.cli.TestNegativeCliDriver.runTest(TestNegativeCliDriver.java:121)
 [junit] at 
 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_desc_tab(TestNegativeCliDriver.java:102)
 {code}
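
 One possible shape for a clearer message, as a hedged Java sketch. This is 
 not the committed patch; the class NegativeTestCheckSketch, the method name 
 and the message wording are hypothetical. The idea is that a negative test 
 should report "expected failure, got success" rather than "execution failed 
 with error code = 0".

```java
/** Illustrative sketch only: message selection for a negative (.q) test. */
final class NegativeTestCheckSketch {

    /**
     * Returns null when the negative test behaved as expected (the command
     * failed), otherwise a message that says what actually went wrong.
     */
    static String failureMessage(String testName, int exitCode) {
        if (exitCode != 0) {
            return null; // the command failed, which is what a negative test wants
        }
        return "Negative test " + testName
                + " was expected to fail, but the command succeeded (exit code 0)";
    }

    public static void main(String[] args) {
        System.out.println(failureMessage("desc_tab.q", 0));
        System.out.println(failureMessage("desc_tab.q", 1));
    }
}
```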



[jira] [Updated] (HIVE-4585) Remove unused MR Temp file localization from Tasks

2013-06-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4585:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Gunther!

 Remove unused MR Temp file localization from Tasks
 --

 Key: HIVE-4585
 URL: https://issues.apache.org/jira/browse/HIVE-4585
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Fix For: 0.12.0

 Attachments: HIVE-4585.1.patch


 HIVE-1408 introduced code that is currently commented out (i.e.: dead code), 
 with a comment saying needs further development (HIVE-1484). It's been like 
 this for close to 3 years. 
 I suggest removing the code until such time that someone picks up that work. 
 At that time they can decide if they want to use this code or pursue another 
 route (FS shim?).



[jira] [Commented] (HIVE-4646) skewjoin.q is failing in hadoop2

2013-06-03 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673998#comment-13673998
 ] 

Phabricator commented on HIVE-4646:
---

ashutoshc has accepted the revision HIVE-4646 [jira] skewjoin.q is failing in 
hadoop2.

  Stuffing this into a shim is probably cleaner, but feels like overkill. A 
utility method is fine too.
  +1

REVISION DETAIL
  https://reviews.facebook.net/D11043

BRANCH
  HIVE-4646

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis


 skewjoin.q is failing in hadoop2
 

 Key: HIVE-4646
 URL: https://issues.apache.org/jira/browse/HIVE-4646
 Project: Hive
  Issue Type: Test
  Components: Query Processor
Reporter: Navis
Assignee: Navis
 Attachments: HIVE-4646.D11043.1.patch


 https://issues.apache.org/jira/browse/HDFS-538 changed the behavior to throw 
 an exception instead of returning null for a non-existent path, but the skew 
 resolver depends on the old behavior.
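
 A utility method that tolerates both behaviors could look roughly like the 
 Java sketch below. It is not the code from the attached patch and it does not 
 use the Hadoop API: the Lister interface is a hypothetical stand-in for a 
 file-system listing call, and it is an assumption here that the affected 
 call is a directory listing.

```java
import java.io.FileNotFoundException;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only: normalize "null" and "exception" for a missing path. */
final class ListStatusCompatSketch {

    /** Hypothetical stand-in for a file-system listing call. */
    interface Lister {
        List<String> list(String path) throws FileNotFoundException;
    }

    /** Returns the entries under path, or an empty list if the path is missing. */
    static List<String> listOrEmpty(Lister fs, String path) {
        try {
            List<String> entries = fs.list(path);
            // Old behavior: a missing path is reported as null.
            return entries == null ? Collections.<String>emptyList() : entries;
        } catch (FileNotFoundException e) {
            // New behavior: a missing path is reported as an exception.
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        Lister oldStyle = path -> null;
        Lister newStyle = path -> { throw new FileNotFoundException(path); };
        System.out.println(listOrEmpty(oldStyle, "/tmp/missing").size());
        System.out.println(listOrEmpty(newStyle, "/tmp/missing").size());
    }
}
```

 Callers then only have to handle an empty result, whichever behavior the 
 underlying file system has.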



[jira] [Commented] (HIVE-4377) Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)

2013-06-03 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674000#comment-13674000
 ] 

Phabricator commented on HIVE-4377:
---

ashutoshc has accepted the revision HIVE-4377 [jira] Add more comment to 
https://reviews.facebook.net/D1209 (HIVE-2340).

  +1

REVISION DETAIL
  https://reviews.facebook.net/D10377

BRANCH
  HIVE-4377

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, navis
Cc: njain


 Add more comment to https://reviews.facebook.net/D1209 (HIVE-2340)
 --

 Key: HIVE-4377
 URL: https://issues.apache.org/jira/browse/HIVE-4377
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Gang Tim Liu
Assignee: Navis
 Attachments: HIVE-4377.D10377.1.patch, HIVE-4377.D10377.2.patch, 
 HIVE-4377.D10377.3.patch


 thanks a lot for addressing optimization in HIVE-2340. Awesome!
 Since we are developing at a very fast pace, it would be really useful to
 think about maintainability and testing of the large codebase. Highlights 
 which are applicable for D1209:
   1.  Javadoc for all public/private functions, except for
 setters/getters. For any complex function, clear examples (input/output)
 would really help.
   2.  Especially for query optimizations, it might be a good idea to have
 a simple working query at the top, and the expected changes, e.g.
 the operator tree for that query at each step, or a detailed explanation
 at the top.
   3.  If possible, the test name (.q file) where the function is being
 invoked, or the query which would potentially test that scenario, if it
 is a query processor change.
   4.  Comments in each test (.q file) should include the jira
 number, what it is trying to test, and the assumptions about each query.
   5.  Reduce the output for each test: whenever a query outputs more
 than 10 results, it should have a reason. Otherwise, each query result
 should be bounded by 10 rows.
 thanks a lot



[jira] [Commented] (HIVE-4615) Invalid column names allowed when created dynamically by a SerDe

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674016#comment-13674016
 ] 

Hudson commented on HIVE-4615:
--

Integrated in Hive-trunk-h0.21 #2126 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4615 : Invalid column names allowed when created dynamically by a 
SerDe (Gabriel Reid via Ashutosh Chauhan) (Revision 1489013)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489013
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
* /hive/trunk/ql/src/test/queries/clientnegative/invalid_columns.q
* /hive/trunk/ql/src/test/results/clientnegative/invalid_columns.q.out


 Invalid column names allowed when created dynamically by a SerDe
 

 Key: HIVE-4615
 URL: https://issues.apache.org/jira/browse/HIVE-4615
 Project: Hive
  Issue Type: Bug
Reporter: Gabriel Reid
Assignee: Gabriel Reid
 Fix For: 0.12.0

 Attachments: HIVE-4615.1.patch.txt


 When a SerDe creates columns dynamically during table creation, there is no 
 checking done on the validity of the created column names. This means that 
 it's possible to create a table that contains columns that can't be queried, 
 and will lead to issues when trying to query the created table.
 The same column name validation should be performed for dynamically-created 
 columns as for other column names.
 This behavior can be easily tested using the TestSerDe, and including a 
 column name that includes an invalid identifier character (e.g. a period) in 
 the list of columns to create.
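
 The kind of check being asked for can be sketched in Java as follows. This 
 is an illustration, not the change made to Table.java. The class 
 ColumnNameCheckSketch and the rule it applies (letters, digits and 
 underscores only) are assumptions; Hive's actual validation rule may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative sketch only: reject column names with invalid characters. */
final class ColumnNameCheckSketch {
    // Assumed rule for illustration: letters, digits and underscores only.
    private static final Pattern VALID = Pattern.compile("[A-Za-z0-9_]+");

    /** Returns the names that fail the assumed rule, in input order. */
    static List<String> invalidNames(List<String> columnNames) {
        List<String> bad = new ArrayList<>();
        for (String name : columnNames) {
            if (name == null || !VALID.matcher(name).matches()) {
                bad.add(name);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // "b.c" contains a period, the example of an invalid character given above.
        System.out.println(invalidNames(Arrays.asList("a", "b.c", "d_1")));
    }
}
```

 Running the same check on SerDe-supplied names as on user-supplied ones 
 would surface the problem at table creation rather than at query time.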



[jira] [Commented] (HIVE-4636) Failing on TestSemanticAnalysis.testAddReplaceCols in trunk

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674010#comment-13674010
 ] 

Hudson commented on HIVE-4636:
--

Integrated in Hive-trunk-h0.21 #2126 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4636 : Failing on TestSemanticAnalysis.testAddReplaceCols in trunk 
(Navis via Ashutosh Chauhan) (Revision 1488824)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488824
Files : 
* 
/hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/cli/TestSemanticAnalysis.java


 Failing on TestSemanticAnalysis.testAddReplaceCols in trunk
 ---

 Key: HIVE-4636
 URL: https://issues.apache.org/jira/browse/HIVE-4636
 Project: Hive
  Issue Type: Test
  Components: Tests
Affects Versions: 0.12.0
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.12.0

 Attachments: HIVE-4636.D11013.1.patch


 Seems to be a regression from HIVE-4475. 



[jira] [Commented] (HIVE-4610) HCatalog checkstyle violation after HIVE-4578

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674008#comment-13674008
 ] 

Hudson commented on HIVE-4610:
--

Integrated in Hive-trunk-h0.21 #2126 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4610 : HCatalog checkstyle violation after HIVE4578 (Brock Noland via 
Ashutosh Chauhan) (Revision 1488825)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488825
Files : 
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/default.res
* /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/windows.res


 HCatalog checkstyle violation after HIVE-4578
 -

 Key: HIVE-4610
 URL: https://issues.apache.org/jira/browse/HIVE-4610
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 0.12.0

 Attachments: HIVE-4610-0.patch


 {noformat}
 checkstyle:
  [echo] hcatalog
 [checkstyle] Running Checkstyle 5.5 on 413 files
 [checkstyle] 
 /home/brock/workspaces/hive-apache/hive/hcatalog/src/test/e2e/hcatalog/resource/default.res:1:
  Missing a header - not enough lines in file.
 [checkstyle] 
 /home/brock/workspaces/hive-apache/hive/hcatalog/src/test/e2e/hcatalog/resource/windows.res:1:
  Missing a header - not enough lines in file.
   [for] hcatalog: The following error occurred while executing this line:
   [for] /home/brock/workspaces/hive-apache/hive/build.xml:310: The 
 following error occurred while executing this line:
   [for] /home/brock/workspaces/hive-apache/hive/hcatalog/build.xml:109: 
 The following error occurred while executing this line:
   [for] 
 /home/brock/workspaces/hive-apache/hive/hcatalog/build-support/ant/checkstyle.xml:32:
  Got 2 errors and 0 warnings.
 {noformat}



[jira] [Commented] (HIVE-3846) alter view rename NPEs with authorization on.

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674017#comment-13674017
 ] 

Hudson commented on HIVE-3846:
--

Integrated in Hive-trunk-h0.21 #2126 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-3846 : alter view rename NPEs with authorization on. (Teddy Choi via 
Ashutosh Chauhan) (Revision 1489009)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489009
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java
* /hive/trunk/ql/src/test/queries/clientpositive/authorization_8.q
* /hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter_view_rename.q.out
* /hive/trunk/ql/src/test/results/clientpositive/authorization_8.q.out


 alter view rename NPEs with authorization on.
 -

 Key: HIVE-3846
 URL: https://issues.apache.org/jira/browse/HIVE-3846
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.10.0, 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Teddy Choi
 Fix For: 0.12.0

 Attachments: HIVE-3846.1.patch.txt, HIVE-3846.2.patch.txt






[jira] [Commented] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674011#comment-13674011
 ] 

Hudson commented on HIVE-4403:
--

Integrated in Hive-trunk-h0.21 #2126 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4403 : Running Hive queries on Yarn (MR2) gives warnings related to 
overriding final parameters (Chu Tong via Ashutosh Chauhan) (Revision 1489008)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489008
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java


 Running Hive queries on Yarn (MR2) gives warnings related to overriding final 
 parameters
 

 Key: HIVE-4403
 URL: https://issues.apache.org/jira/browse/HIVE-4403
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0, 0.11.0
Reporter: Mark Grover
Assignee: Chu Tong
 Fix For: 0.12.0

 Attachments: HIVE-4403.patch, HIVE-4403.patch


 While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings 
 related to overriding final parameters in job.conf. This was on a pseudo 
 distributed cluster. FWIW, I didn't see this happen on a fully-distributed 
 cluster. Perhaps, Hive's job.conf is overriding some final parameters it 
 shouldn't.
 Here is what the warnings looked like:
 {code}
 2013-04-19 14:20:32,304 WARN  [main] conf.Configuration 
 (Configuration.java:loadProperty(2032)) - 
 file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an
  attempt to override final parameter: 
 mapreduce.job.end-notification.max.retry.interval;  Ignoring.
 2013-04-19 14:20:32,367 WARN  [main] conf.Configuration 
 (Configuration.java:loadProperty(2032)) - 
 file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an
  attempt to override final parameter: 
 mapreduce.job.end-notification.max.attempts;  Ignoring.
 {code}
 To reproduce, run a query like:
 {code}
 CREATE TABLE u_data (
   userid INT,
   movieid INT,
   rating INT,
   unixtime STRING)
 ROW FORMAT DELIMITED
 FIELDS TERMINATED BY '\t'
 STORED AS TEXTFILE;
 {code}
 Load some data into u_data, here is some sample data:
 https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data
 Run a simple query on that data (on YARN/MR2)
 {code}
 INSERT OVERWRITE DIRECTORY '/tmp/count'
 SELECT COUNT(1) FROM u_data
 {code}



[jira] [Commented] (HIVE-4562) HIVE-3393 brought in Jackson library,and these four jars should be packed into hive-exec.jar

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674013#comment-13674013
 ] 

Hudson commented on HIVE-4562:
--

Integrated in Hive-trunk-h0.21 #2126 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4562 : HIVE3393 brought in Jackson library,and these four jars should 
be packed into hive-exec.jar (caofangkun via Ashutosh Chauhan) (Revision 
1488744)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488744
Files : 
* /hive/trunk/ql/build.xml


 HIVE-3393 brought in Jackson library,and these four jars should be packed 
 into hive-exec.jar
 

 Key: HIVE-4562
 URL: https://issues.apache.org/jira/browse/HIVE-4562
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.10.0, 0.11.0
Reporter: caofangkun
Assignee: caofangkun
Priority: Minor
 Fix For: 0.12.0

 Attachments: HIVE-4562-1.patch, HIVE-4562-2.patch


 Some Hive jars are required not only by the client but also by the server 
 (every Hadoop slave).
 We could use the 'add jar' command to add all the jars to the distributed 
 cache, but the common way is to add these jars to $HADOOP_HOME/lib/ on every 
 slave of the Hadoop cluster,
 which requires restarting all the tasktrackers.
 For example:
 When using Hive stats with MySQL as the temporary stats DB, every slave of 
 the Hadoop cluster must have 
 mysql-connector-java-.jar in $HADOOP_HOME/lib/.
 For column stats, 
 $HADOOP_HOME/lib/ on all slaves must contain:
 jackson-core-asl-1.8.8.jar
 jackson-jaxrs-1.8.8.jar
 jackson-mapper-asl-1.8.8.jar
 jackson-xc-1.8.8.jar
 These jars should be separated from the other common client-side jars.



[jira] [Commented] (HIVE-4510) HS2 doesn't nest exceptions properly (fun debug times)

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674012#comment-13674012
 ] 

Hudson commented on HIVE-4510:
--

Integrated in Hive-trunk-h0.21 #2126 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4510 : HS2 doesn't nest exceptions properly (fun debug times) (Thejas 
Nair via Ashutosh Chauhan) (Revision 1488740)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488740
Files : 
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java
* /hive/trunk/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
* 
/hive/trunk/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java
* 
/hive/trunk/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java


 HS2 doesn't nest exceptions properly (fun debug times)
 --

 Key: HIVE-4510
 URL: https://issues.apache.org/jira/browse/HIVE-4510
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Reporter: Gunther Hagleitner
Assignee: Thejas M Nair
 Fix For: 0.12.0

 Attachments: HIVE-4510.1.patch, HIVE-4510.2.patch


 In SQLOperation.java lines 97 + 113 for instance, we catch errors and throw a 
 new HiveSQLException, but we don't wrap the original exception.
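
 The difference between the two styles can be shown with a small Java sketch. 
 It is not the SQLOperation.java code: java.sql.SQLException stands in for 
 HiveSQLException, and runQuery and the class name ExceptionChainingSketch 
 are hypothetical.

```java
import java.sql.SQLException;

/** Illustrative sketch only: dropping versus keeping the original exception. */
final class ExceptionChainingSketch {

    /** Hypothetical stand-in for work that fails deep inside the server. */
    static void runQuery(String sql) throws Exception {
        throw new IllegalStateException("metastore unreachable");
    }

    /** Loses the root cause: only the message text of the original survives. */
    static void executeLossy(String sql) throws SQLException {
        try {
            runQuery(sql);
        } catch (Exception e) {
            throw new SQLException("Error running query: " + e);
        }
    }

    /** Keeps the root cause by passing the original exception as the cause. */
    static void executeChained(String sql) throws SQLException {
        try {
            runQuery(sql);
        } catch (Exception e) {
            throw new SQLException("Error running query: " + e, e);
        }
    }

    public static void main(String[] args) {
        try {
            executeChained("select 1");
        } catch (SQLException e) {
            // The stack trace now ends with a "Caused by:" section for the original.
            e.printStackTrace();
        }
    }
}
```

 With the chained form, the log shows where the failure originated instead 
 of only where it was rethrown.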



[jira] [Commented] (HIVE-4489) beeline always return the same error message twice

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674015#comment-13674015
 ] 

Hudson commented on HIVE-4489:
--

Integrated in Hive-trunk-h0.21 #2126 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2126/])
HIVE-4489 : beeline always return the same error message twice (Chaoyu Tang 
via Ashutosh Chauhan) (Revision 1488741)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1488741
Files : 
* /hive/trunk/beeline/src/java/org/apache/hive/beeline/Commands.java


 beeline always return the same error message twice
 --

 Key: HIVE-4489
 URL: https://issues.apache.org/jira/browse/HIVE-4489
 Project: Hive
  Issue Type: Bug
  Components: Clients
Affects Versions: 0.10.0
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
Priority: Minor
  Labels: newbie
 Fix For: 0.12.0

 Attachments: HIVE-4489.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 Beeline always returns the same error message twice. For example, if I try to 
 create a table a2 which already exists, it prints out two identical messages, 
 which is not user friendly.
 {code}
 beeline> !connect jdbc:hive2://localhost:1 scott tiger 
 org.apache.hive.jdbc.HiveDriver
 Connecting to jdbc:hive2://localhost:1
 Connected to: Hive (version 0.10.0)
 Driver: Hive (version 0.10.0-cdh4.2.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 0: jdbc:hive2://localhost:1> create table a2 (value int);
 Error: Error while processing statement: FAILED: Execution Error, return code 
 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
 Error: Error while processing statement: FAILED: Execution Error, return code 
 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
 {code}


