[jira] [Commented] (HIVE-6660) HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request
[ https://issues.apache.org/jira/browse/HIVE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937481#comment-13937481 ]

Prasad Mujumdar commented on HIVE-6660:
---

The failures look unrelated.

HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request

Key: HIVE-6660
URL: https://issues.apache.org/jira/browse/HIVE-6660
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
Priority: Blocker
Fix For: 0.13.0
Attachments: HIVE-6660.1.patch, HIVE-6660.1.patch, hive-site.xml

*Beeline connection string:*
{code}
!connect jdbc:hive2://host:1/;ssl=true;sslTrustStore=/usr/share/doc/hive-0.13.0.2.1.1.0/examples/files/truststore.jks;trustStorePassword=HiveJdbc vgumashta vgumashta org.apache.hive.jdbc.HiveDriver
{code}

*Error:*
{code}
pool-7-thread-1, handling exception: java.net.SocketTimeoutException: Read timed out
pool-7-thread-1, called close()
pool-7-thread-1, called closeInternal(true)
pool-7-thread-1, SEND TLSv1 ALERT: warning, description = close_notify
Padded plaintext before ENCRYPTION: len = 32
0000: 01 00 BE 72 AC 10 3B FA 4E 01 A5 DE 9B 14 16 AF ...r..;.N...
0010: 4E DD 7A 29 AD B4 09 09 09 09 09 09 09 09 09 09 N.z)
pool-7-thread-1, WRITE: TLSv1 Alert, length = 32
[Raw write]: length = 37
0000: 15 03 01 00 20 6C 37 82 A8 52 40 DA FB 83 2D CD l7..R@...-.
0010: 96 9F F0 B7 22 17 E1 04 C1 D1 93 1B C4 39 5A B0 9Z.
0020: A2 3F 5D 7D 2D .?].-
pool-7-thread-1, called closeSocket(selfInitiated)
pool-7-thread-1, called close()
pool-7-thread-1, called closeInternal(true)
pool-7-thread-1, called close()
pool-7-thread-1, called closeInternal(true)
{code}

*Subsequent queries fail:*
{code}
main, WRITE: TLSv1 Application Data, length = 144
main, handling exception: java.net.SocketException: Broken pipe
%% Invalidated: [Session-1, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
main, SEND TLSv1 ALERT: fatal, description = unexpected_message
Padded plaintext before ENCRYPTION: len = 32
0000: 02 0A 52 C3 18 B1 C1 38 DB 3F B6 D1 C5 CA 14 9C ..R8.?..
0010: A5 38 4C 01 31 69 09 09 09 09 09 09 09 09 09 09 .8L.1i..
main, WRITE: TLSv1 Alert, length = 32
main, Exception sending alert: java.net.SocketException: Broken pipe
main, called closeSocket()
Error: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe (state=08S01,code=0)
java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:226)
    at org.apache.hive.beeline.Commands.execute(Commands.java:736)
    at org.apache.hive.beeline.Commands.sql(Commands.java:657)
    at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:796)
    at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
    at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
    at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
    at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
    at org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:471)
    at org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:219)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:211)
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:220)
    ... 11 more
Caused by: java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:377)
    at
{code}
[jira] [Commented] (HIVE-4764) Support Kerberos HTTP authentication for HiveServer2 running in http mode
[ https://issues.apache.org/jira/browse/HIVE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937485#comment-13937485 ]

Vaibhav Gumashta commented on HIVE-4764:
---

[~thejas] The failure looks unrelated to this jira change.

Support Kerberos HTTP authentication for HiveServer2 running in http mode

Key: HIVE-4764
URL: https://issues.apache.org/jira/browse/HIVE-4764
Project: Hive
Issue Type: Sub-task
Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
Fix For: 0.13.0
Attachments: HIVE-4764.1.patch, HIVE-4764.2.patch, HIVE-4764.3.patch, HIVE-4764.4.patch, HIVE-4764.5.patch, HIVE-4764.6.patch

Support Kerberos authentication for HiveServer2 running in http mode.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-6681:
---
Status: Open (was: Patch Available)

Describe table sometimes shows from deserializer for column comments

Key: HIVE-6681
URL: https://issues.apache.org/jira/browse/HIVE-6681
Project: Hive
Issue Type: Bug
Components: Metastore, Serializers/Deserializers
Affects Versions: 0.12.0, 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Attachments: HIVE-6681.2.patch, HIVE-6681.patch
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-6681:
---
Attachment: HIVE-6681.3.patch

Describe table sometimes shows from deserializer for column comments

Key: HIVE-6681
URL: https://issues.apache.org/jira/browse/HIVE-6681
Project: Hive
Issue Type: Bug
Components: Metastore, Serializers/Deserializers
Affects Versions: 0.11.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.patch
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-6681:
---
Status: Patch Available (was: Open)

Describe table sometimes shows from deserializer for column comments

Key: HIVE-6681
URL: https://issues.apache.org/jira/browse/HIVE-6681
Project: Hive
Issue Type: Bug
Components: Metastore, Serializers/Deserializers
Affects Versions: 0.12.0, 0.11.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.patch
[jira] [Updated] (HIVE-6658) Modify Alter_numbuckets* test to reflect hadoop2 changes
[ https://issues.apache.org/jira/browse/HIVE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-6658:
---
Status: Open (was: Patch Available)

[~szehon] has a valid point. [~jpullokkaran] Would you like to redo the in/excluded version numbers?

Modify Alter_numbuckets* test to reflect hadoop2 changes

Key: HIVE-6658
URL: https://issues.apache.org/jira/browse/HIVE-6658
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
Attachments: HIVE-6658.patch

Hadoop2 now honors the number-of-reducers config while running in local mode. This affects the bucketing tests, as the data gets properly bucketed in Hadoop2. (In Hadoop1, all data ended up in the same bucket while in local mode.)
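For context on why the reducer count changes the test output: at insert time Hive writes one bucket file per reducer, so a local-mode run forced down to one reducer puts all rows into a single bucket file. A hedged HiveQL illustration of the kind of table involved (the table and column names here are invented, not taken from the Alter_numbuckets tests):

```sql
-- hive.enforce.bucketing makes Hive set the reducer count to the bucket
-- count during INSERT, so the data is physically split into 4 bucket files.
-- Under Hadoop1 local mode the reducer count collapsed to 1, defeating this.
SET hive.enforce.bucketing=true;
CREATE TABLE bucketed_t (key INT, value STRING)
CLUSTERED BY (key) INTO 4 BUCKETS;
INSERT OVERWRITE TABLE bucketed_t SELECT key, value FROM src;
```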
[jira] [Commented] (HIVE-1180) Support Common Table Expressions (CTEs) in Hive
[ https://issues.apache.org/jira/browse/HIVE-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937523#comment-13937523 ]

Lefty Leverenz commented on HIVE-1180:
---

Apologies for the delay, [~rhbutani]. I made these changes (please revert anything you disagree with):

* [changes to Common Table Expression|https://cwiki.apache.org/confluence/pages/diffpages.action?pageId=38572242&originalId=40504878]
* [current CTE doc|https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression]

I also recommend changing COMMA to "," in the syntax -- just because I recently got confused by DOT for "." in the DDL wiki -- but since the examples show actual commas, there shouldn't be any confusion here.

Support Common Table Expressions (CTEs) in Hive

Key: HIVE-1180
URL: https://issues.apache.org/jira/browse/HIVE-1180
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Jeff Hammerbacher
Assignee: Harish Butani
Fix For: 0.13.0
Attachments: HIVE-1180.1.patch, HIVE-1180.3.patch, HIVE-1180.6.patch

I've seen some presentations from the PostgreSQL community recently expounding the utility of CTEs (http://en.wikipedia.org/wiki/Common_table_expressions). Should we try to support these in Hive? I've never used them in practice, so I'm curious to hear whether the community would find them useful.
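Since the thread is about documenting the feature, a minimal example of the syntax being discussed may help; this uses the conventional `src` sample table from Hive's test data:

```sql
-- A simple CTE, supported in Hive as of 0.13.0 (HIVE-1180):
WITH q1 AS (SELECT key, value FROM src WHERE key = '5')
SELECT * FROM q1;
```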
[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead
[ https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-6430:
---
Attachment: HIVE-6430.04.patch

Added a unit test and fixes.

MapJoin hash table has large memory overhead

Key: HIVE-6430
URL: https://issues.apache.org/jira/browse/HIVE-6430
Project: Hive
Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.patch

Right now, in some queries, I see that storing e.g. 4 ints (2 for the key and 2 for the row) can take several hundred bytes, which is ridiculous. I am reducing the size of MJKey and MJRowContainer in other jiras, but in general we don't need to have a Java hash table there. We can either use a primitive-friendly hashtable like the one from HPPC (Apache-licensed), or some variation, to map primitive keys to a single row-storage structure without an object per row (similar to vectorization).
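The technique the description mentions -- mapping primitive keys to a flat row store with no object per row -- can be sketched with a small open-addressing table. This is only an illustration of the idea, not Hive's actual BytesBytesMultiHashMap or the HPPC API; the class and method names here are made up:

```java
// Sketch: an open-addressing hash map from primitive long keys to int offsets
// into a flat row store. Unlike java.util.HashMap<Long, Integer>, it allocates
// no Entry object and no boxed key/value per row -- just three parallel arrays.
class LongToIntOpenMap {
    private static final int MISSING = -1;
    private final long[] keys;
    private final int[] values;     // e.g. offsets into a shared byte[] row buffer
    private final boolean[] used;
    private int size;

    LongToIntOpenMap(int expectedEntries) {
        // Power-of-two capacity, kept well under full for this sketch.
        int cap = Integer.highestOneBit(Math.max(expectedEntries, 16)) * 2;
        keys = new long[cap];
        values = new int[cap];
        used = new boolean[cap];
    }

    private int slot(long key) {
        int h = (int) (key ^ (key >>> 32)) & (keys.length - 1);
        while (used[h] && keys[h] != key) {
            h = (h + 1) & (keys.length - 1);  // linear probing
        }
        return h;
    }

    void put(long key, int value) {
        int h = slot(key);
        if (!used[h]) { used[h] = true; keys[h] = key; size++; }
        values[h] = value;
        // (a real implementation would grow and rehash near a load-factor limit)
    }

    int get(long key) {
        int h = slot(key);
        return used[h] ? values[h] : MISSING;
    }

    int size() { return size; }
}
```

The memory win is that each entry costs 13 bytes of array slots here, versus the object headers, references, and boxing that a `HashMap<Long, Integer>` entry carries.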
Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18936/
---

(Updated March 17, 2014, 8:24 a.m.)

Review request for hive, Gopal V and Gunther Hagleitner.

Repository: hive-git

Description
---

See JIRA

Diffs (updated)
---

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 56d68f5
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 2cd65cb
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 704fcb9
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 7dbb8be
  ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 170e8c0
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 3ea9c96
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java 8854b19
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 9df425b
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java a00aab3
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 008a8db
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 988959f
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 55b7415
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 79af08d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java eef7656
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java 0fd4983
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java PRE-CREATION
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java 65e3779
  ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 755d783
  ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6
  ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out d79b984
  ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 284cc03
  serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29
  serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 5870884
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java bab505e
  serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 6f344bb
  serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd
  serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java a99c7b4
  serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java 435d6c6
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 82c1263
  serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java b188c3f
  serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryUtils.java 6c14081
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 06d5c5e
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java 868dd4c
  serde/src/test/org/apache/hadoop/hive/serde2/thrift_test/CreateSequenceFile.java 1fb49e5

Diff: https://reviews.apache.org/r/18936/diff/

Testing
---

Thanks,
Sergey Shelukhin
[jira] [Updated] (HIVE-6331) HIVE-5279 deprecated UDAF class without explanation/documentation/alternative
[ https://issues.apache.org/jira/browse/HIVE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Francke updated HIVE-6331:
---
Attachment: HIVE-6331.2.patch

This new patch addresses Lefty's comments.

HIVE-5279 deprecated UDAF class without explanation/documentation/alternative

Key: HIVE-6331
URL: https://issues.apache.org/jira/browse/HIVE-6331
Project: Hive
Issue Type: Bug
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
Attachments: HIVE-5279.1.patch, HIVE-6331.2.patch

HIVE-5279 added a @Deprecated annotation to the {{UDAF}} class. The comment in that class says:
{quote}UDAF classes are REQUIRED to inherit from this class.{quote}
One of these two needs to be updated: either remove the annotation or document why it was deprecated and what to use instead. Unfortunately [~navis] did not leave any documentation about his intentions. I'm happy to provide a patch once I know the intentions.
[jira] [Commented] (HIVE-6657) Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc
[ https://issues.apache.org/jira/browse/HIVE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937682#comment-13937682 ]

Hive QA commented on HIVE-6657:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634988/HIVE-6657.4.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5402 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1857/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1857/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634988

Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

Key: HIVE-6657
URL: https://issues.apache.org/jira/browse/HIVE-6657
Project: Hive
Issue Type: Improvement
Components: Authentication, Testing Infrastructure, Tests
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Attachments: HIVE-6657.2.patch, HIVE-6657.3.patch, HIVE-6657.4.patch, HIVE-6657.4.patch

Hadoop 2.3 includes the miniKdc module. This provides a KDC that can be used by downstream projects to implement unit tests for Kerberos authentication code. Hive has a lot of code related to Kerberos and delegation tokens for authentication, as well as for accessing secure Hadoop resources. This has pretty much no coverage in the unit tests. We need to add unit tests using the miniKdc module.

Note that Hadoop 2.3 doesn't include a secure mini-cluster. Until that is available, we can at least test authentication for components like HiveServer2, the Metastore and WebHCat.
[jira] [Updated] (HIVE-5998) Add vectorized reader for Parquet files
[ https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-5998:
---
Status: Open (was: Patch Available)

Add vectorized reader for Parquet files

Key: HIVE-5998
URL: https://issues.apache.org/jira/browse/HIVE-5998
Project: Hive
Issue Type: Sub-task
Components: Serializers/Deserializers, Vectorization
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
Labels: Parquet, vectorization
Attachments: HIVE-5998.1.patch, HIVE-5998.10.patch, HIVE-5998.2.patch, HIVE-5998.3.patch, HIVE-5998.4.patch, HIVE-5998.5.patch, HIVE-5998.6.patch, HIVE-5998.7.patch, HIVE-5998.8.patch, HIVE-5998.9.patch

HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar format, it makes sense to provide a vectorized reader, similar to what the RC and ORC formats have, to benefit from the vectorized execution engine.
[jira] [Updated] (HIVE-5998) Add vectorized reader for Parquet files
[ https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-5998:
---
Attachment: HIVE-5998.10.patch

Rebased and updated the expected results with the latest changes in trunk.

Add vectorized reader for Parquet files

Key: HIVE-5998
URL: https://issues.apache.org/jira/browse/HIVE-5998
Project: Hive
Issue Type: Sub-task
Components: Serializers/Deserializers, Vectorization
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
Labels: Parquet, vectorization
Attachments: HIVE-5998.1.patch, HIVE-5998.10.patch, HIVE-5998.2.patch, HIVE-5998.3.patch, HIVE-5998.4.patch, HIVE-5998.5.patch, HIVE-5998.6.patch, HIVE-5998.7.patch, HIVE-5998.8.patch, HIVE-5998.9.patch

HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar format, it makes sense to provide a vectorized reader, similar to what the RC and ORC formats have, to benefit from the vectorized execution engine.
[jira] [Updated] (HIVE-5998) Add vectorized reader for Parquet files
[ https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-5998:
---
Status: Patch Available (was: Open)

Add vectorized reader for Parquet files

Key: HIVE-5998
URL: https://issues.apache.org/jira/browse/HIVE-5998
Project: Hive
Issue Type: Sub-task
Components: Serializers/Deserializers, Vectorization
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
Labels: Parquet, vectorization
Attachments: HIVE-5998.1.patch, HIVE-5998.10.patch, HIVE-5998.2.patch, HIVE-5998.3.patch, HIVE-5998.4.patch, HIVE-5998.5.patch, HIVE-5998.6.patch, HIVE-5998.7.patch, HIVE-5998.8.patch, HIVE-5998.9.patch

HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar format, it makes sense to provide a vectorized reader, similar to what the RC and ORC formats have, to benefit from the vectorized execution engine.
[jira] [Commented] (HIVE-6657) Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc
[ https://issues.apache.org/jira/browse/HIVE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937748#comment-13937748 ]

Brock Noland commented on HIVE-6657:
---

I think those failures are unrelated. +1

Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc

Key: HIVE-6657
URL: https://issues.apache.org/jira/browse/HIVE-6657
Project: Hive
Issue Type: Improvement
Components: Authentication, Testing Infrastructure, Tests
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Attachments: HIVE-6657.2.patch, HIVE-6657.3.patch, HIVE-6657.4.patch, HIVE-6657.4.patch

Hadoop 2.3 includes the miniKdc module. This provides a KDC that can be used by downstream projects to implement unit tests for Kerberos authentication code. Hive has a lot of code related to Kerberos and delegation tokens for authentication, as well as for accessing secure Hadoop resources. This has pretty much no coverage in the unit tests. We need to add unit tests using the miniKdc module. Note that Hadoop 2.3 doesn't include a secure mini-cluster. Until that is available, we can at least test authentication for components like HiveServer2, the Metastore and WebHCat.
[jira] [Commented] (HIVE-6677) HBaseSerDe needs to be refactored
[ https://issues.apache.org/jira/browse/HIVE-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937769#comment-13937769 ]

Hive QA commented on HIVE-6677:
---

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634995/HIVE-6677.2.patch

{color:green}SUCCESS:{color} +1 5406 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1858/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1858/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634995

HBaseSerDe needs to be refactored

Key: HIVE-6677
URL: https://issues.apache.org/jira/browse/HIVE-6677
Project: Hive
Issue Type: Improvement
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Attachments: HIVE-6677.1.patch, HIVE-6677.2.patch, HIVE-6677.patch

The code in HBaseSerDe seems very complex and hard to extend to support new features such as a generic compound key (HIVE-6411) and a compound key filter (HIVE-6290), especially when handling key/field serialization. Hopefully this task will clean up the code a bit and make it ready for new extensions.
[jira] [Updated] (HIVE-4764) Support Kerberos HTTP authentication for HiveServer2 running in http mode
[ https://issues.apache.org/jira/browse/HIVE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated HIVE-4764:
---
Resolution: Fixed
Status: Resolved (was: Patch Available)

Patch (finally!) committed to trunk and the 0.13 branch (in the cwiki list for 0.13). Thanks for the contribution and patience, Vaibhav!

Support Kerberos HTTP authentication for HiveServer2 running in http mode

Key: HIVE-4764
URL: https://issues.apache.org/jira/browse/HIVE-4764
Project: Hive
Issue Type: Sub-task
Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
Fix For: 0.13.0
Attachments: HIVE-4764.1.patch, HIVE-4764.2.patch, HIVE-4764.3.patch, HIVE-4764.4.patch, HIVE-4764.5.patch, HIVE-4764.6.patch

Support Kerberos authentication for HiveServer2 running in http mode.
[jira] [Commented] (HIVE-6660) HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request
[ https://issues.apache.org/jira/browse/HIVE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937775#comment-13937775 ]

Thejas M Nair commented on HIVE-6660:
---

+1

HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request

Key: HIVE-6660
URL: https://issues.apache.org/jira/browse/HIVE-6660
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
Priority: Blocker
Fix For: 0.13.0
Attachments: HIVE-6660.1.patch, HIVE-6660.1.patch, hive-site.xml
[jira] [Commented] (HIVE-1180) Support Common Table Expressions (CTEs) in Hive
[ https://issues.apache.org/jira/browse/HIVE-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937867#comment-13937867 ]

Harish Butani commented on HIVE-1180:
---

Looks good, thanks for editing. Changed COMMA to ",".

Support Common Table Expressions (CTEs) in Hive

Key: HIVE-1180
URL: https://issues.apache.org/jira/browse/HIVE-1180
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Jeff Hammerbacher
Assignee: Harish Butani
Fix For: 0.13.0
Attachments: HIVE-1180.1.patch, HIVE-1180.3.patch, HIVE-1180.6.patch

I've seen some presentations from the PostgreSQL community recently expounding the utility of CTEs (http://en.wikipedia.org/wiki/Common_table_expressions). Should we try to support these in Hive? I've never used them in practice, so I'm curious to hear whether the community would find them useful.
[jira] [Commented] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937994#comment-13937994 ]

Hive QA commented on HIVE-6681:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635040/HIVE-6681.3.patch

{color:red}ERROR:{color} -1 due to 180 failed/errored test(s), 5397 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2_hadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_like_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
[jira] [Commented] (HIVE-6657) Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc
[ https://issues.apache.org/jira/browse/HIVE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938010#comment-13938010 ] Prasad Mujumdar commented on HIVE-6657: --- yes, the test failures are not related to the patch. Thanks Brock for the review! Add test coverage for Kerberos authentication implementation using Hadoop's miniKdc --- Key: HIVE-6657 URL: https://issues.apache.org/jira/browse/HIVE-6657 Project: Hive Issue Type: Improvement Components: Authentication, Testing Infrastructure, Tests Affects Versions: 0.13.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Attachments: HIVE-6657.2.patch, HIVE-6657.3.patch, HIVE-6657.4.patch, HIVE-6657.4.patch Hadoop 2.3 includes a miniKdc module. This provides a KDC that can be used by downstream projects to implement unit tests for Kerberos authentication code. Hive has a lot of code related to Kerberos and delegation tokens for authentication, as well as for accessing secure hadoop resources. This pretty much has no coverage in the unit tests. We need to add unit tests using the miniKdc module. Note that Hadoop 2.3 doesn't include a secure mini-cluster. Until that is available, we can at least test authentication for components like HiveServer2, Metastore and WebHCat. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered
[ https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6518: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) I have committed this to trunk. Thanks to Gopal! [~rhbutani] This is an important fix to vector group by because the aggregates must flush more aggressively in case of GC. Therefore, I intend to commit it to branch-0.13 as well. Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered Key: HIVE-6518 URL: https://issues.apache.org/jira/browse/HIVE-6518 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, HIVE-6518.2.patch, HIVE-6518.3.patch The current VectorGroupByOperator implementation flushes the in-memory hashes when the maximum entries or fraction of memory is hit. This works for most cases, but there are some corner cases where we hit GC overhead limits or heap size limits before either of those conditions are reached, due to the rest of the pipeline. This patch adds a SoftReference as a GC canary. If the soft reference is dead, then a full GC pass happened sometime in the near past; the aggregation hashtables should be flushed immediately, before another full GC is triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
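[Editor's note] The canary mechanism described above can be sketched in a few lines. This is an illustrative sketch, not the actual HIVE-6518 patch; the class and method names are invented for the example. The idea relies on the JVM clearing soft references under memory pressure (guaranteed before an OutOfMemoryError, and in practice during a full GC):

```java
import java.lang.ref.SoftReference;

// Illustrative sketch (not Hive's actual code): a SoftReference as a GC canary.
// If the canary has been cleared, a memory-pressure GC happened recently and
// the caller should flush its aggregation hashtables before the next one.
public class GcCanarySketch {
    private SoftReference<byte[]> canary = new SoftReference<>(new byte[64]);

    // True if the canary was collected since the last check; re-arms itself.
    public boolean fullGcLikelyHappened() {
        if (canary.get() == null) {
            canary = new SoftReference<>(new byte[64]); // re-arm for next check
            return true; // caller should flush now
        }
        return false;
    }

    public static void main(String[] args) {
        GcCanarySketch sketch = new GcCanarySketch();
        // With no memory pressure, the canary is still alive:
        System.out.println(sketch.fullGcLikelyHappened()); // prints false
    }
}
```

A flush triggered this way is adaptive: it fires exactly when the heap is under pressure, rather than at a fixed entry count.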
[jira] [Updated] (HIVE-6612) Misspelling schemaTool completeted
[ https://issues.apache.org/jira/browse/HIVE-6612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6612: Attachment: HIVE-6612.2.patch Attaching to requeue for precommit test. Misspelling schemaTool completeted - Key: HIVE-6612 URL: https://issues.apache.org/jira/browse/HIVE-6612 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.0 Reporter: Szehon Ho Assignee: Szehon Ho Priority: Minor Attachments: HIVE-6612.2.patch, HIVE-6612.patch There is a misspelling of completed as completeted in the last message from schematool: {noformat} Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true Metastore Connection Driver : org.apache.derby.jdbc.EmbeddedDriver Metastore connection User: hiveuser Starting metastore schema initialization to 0.14.0 Initialization script hive-schema-0.14.0.derby.sql Initialization script completed schemaTool completeted {noformat} It is this way even in the wiki: [https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool|https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered
[ https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938056#comment-13938056 ] Harish Butani commented on HIVE-6518: - +1 for port to 0.13 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4764) Support Kerberos HTTP authentication for HiveServer2 running in http mode
[ https://issues.apache.org/jira/browse/HIVE-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938084#comment-13938084 ] Vaibhav Gumashta commented on HIVE-4764: Thanks for the reviews [~thejas]. Support Kerberos HTTP authentication for HiveServer2 running in http mode - Key: HIVE-4764 URL: https://issues.apache.org/jira/browse/HIVE-4764 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-4764.1.patch, HIVE-4764.2.patch, HIVE-4764.3.patch, HIVE-4764.4.patch, HIVE-4764.5.patch, HIVE-4764.6.patch Support Kerberos authentication for HiveServer2 running in http mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6485) Downgrade to httpclient-4.2.5 in JDBC from httpclient-4.3.2
[ https://issues.apache.org/jira/browse/HIVE-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta resolved HIVE-6485. Resolution: Fixed Fixed in HIVE-4764. Downgrade to httpclient-4.2.5 in JDBC from httpclient-4.3.2 --- Key: HIVE-6485 URL: https://issues.apache.org/jira/browse/HIVE-6485 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-6485.1.patch I had upgraded to the new version while adding SSL support for HiveServer2's HTTP mode. But that conflicts with httpclient-4.2.5, which is in the hadoop classpath. I don't have a good reason to use httpclient-4.3.2, so it's better to match hadoop. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6645) to_date()/to_unix_timestamp() fail with NPE if input is null
[ https://issues.apache.org/jira/browse/HIVE-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6645: - Attachment: HIVE-6645.2.patch re-uploading patch for precommit tests to_date()/to_unix_timestamp() fail with NPE if input is null Key: HIVE-6645 URL: https://issues.apache.org/jira/browse/HIVE-6645 Project: Hive Issue Type: Bug Components: UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6645.1.patch, HIVE-6645.2.patch, HIVE-6645.2.patch
{noformat}
hive> describe tab2;
Query ID = jdere_20140312185454_e3ed213e-8b3a-4963-b815-19965edad587
OK
c1    timestamp    None
Time taken: 0.155 seconds, Fetched: 1 row(s)
hive> select * from tab2;
Query ID = jdere_20140312185454_8a009070-df79-45de-8642-e85668a378d7
OK
NULL
NULL
NULL
NULL
NULL
Time taken: 0.067 seconds, Fetched: 5 row(s)
hive> select to_unix_timestamp(c1) from tab2;
hive> select to_date(c1) from tab2;
{noformat}
Fails with errors like:
{noformat}
java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {c1:null}
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:401)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {c1:null}
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:233)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
	at java.lang.Thread.run(Thread.java:680)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {c1:null}
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
	... 10 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating to_date(c1)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
	... 11 more
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDFDate.evaluate(GenericUDFDate.java:106)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
	... 15 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
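[Editor's note] The root cause above is an NPE in GenericUDFDate.evaluate when the argument is NULL. The shape of the missing guard can be sketched as follows; this is illustrative only (the real UDF works on ObjectInspectors and DeferredObjects, and this helper name is invented), not the actual HIVE-6645 patch:

```java
// Illustrative only: a null-safe date extraction in the spirit of the fix.
// The guard that was missing: a NULL argument must yield NULL, not an NPE.
public class NullSafeToDateSketch {
    static String toDate(String timestamp) {
        if (timestamp == null) {
            return null; // SQL semantics: NULL in, NULL out
        }
        // "yyyy-MM-dd HH:mm:ss" -> keep only the date part
        return timestamp.substring(0, 10);
    }

    public static void main(String[] args) {
        System.out.println(toDate(null));                  // prints null
        System.out.println(toDate("2014-03-12 18:54:00")); // prints 2014-03-12
    }
}
```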
[jira] [Updated] (HIVE-6488) Investigate TestBeeLineWithArgs
[ https://issues.apache.org/jira/browse/HIVE-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6488: - Attachment: HIVE-6488.1.patch re-upload patch for precommit tests Investigate TestBeeLineWithArgs --- Key: HIVE-6488 URL: https://issues.apache.org/jira/browse/HIVE-6488 Project: Hive Issue Type: Bug Components: Tests Reporter: Brock Noland Assignee: Jason Dere Priority: Blocker Attachments: HIVE-6488.1.patch, HIVE-6488.1.patch TestBeeLineWithArgs started taking many, many hours and eventually timing out, which is one cause of precommit runs taking a long time. For now I have skipped it in the precommit tests, so we should figure out what is going on so we can re-enable the test. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938120#comment-13938120 ] Vikram Dixit K commented on HIVE-6639: -- LGTM +1. Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938123#comment-13938123 ] Prasanth J commented on HIVE-6578: -- Test failure is not related. Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)
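[Editor's note] The proposed reader-side interface can be sketched by analogy with the writer-side StatsProvidingRecordWriter mentioned in the description. The interface name follows the description, but the stub stats holder, method body, and numbers below are invented for the example:

```java
// Sketch only: a record reader that can report file-level stats without a scan.
public class StatsReaderSketch {
    // Stand-in for Hive's stats holder type; fields follow the description
    // (number of rows, raw data size) but the class itself is invented here.
    static class BasicStats {
        long rowCount;
        long rawDataSize;
    }

    // Analogous to the writer-side StatsProvidingRecordWriter: a reader
    // implementing this exposes stats gathered by the file format itself
    // (e.g. from an ORC file footer), so ANALYZE in noscan/partialscan mode
    // never has to read the actual rows.
    interface StatsProvidingRecordReader {
        BasicStats getStats();
    }

    // Fake reader standing in for an ORC reader; the numbers are invented.
    static StatsProvidingRecordReader fakeOrcReader() {
        return () -> {
            BasicStats s = new BasicStats();
            s.rowCount = 500;      // would come from the file footer
            s.rawDataSize = 16384; // likewise
            return s;
        };
    }

    public static void main(String[] args) {
        System.out.println(fakeOrcReader().getStats().rowCount); // prints 500
    }
}
```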
[jira] [Created] (HIVE-6683) Beeline does not accept comments at end of line
Jeremy Beard created HIVE-6683: -- Summary: Beeline does not accept comments at end of line Key: HIVE-6683 URL: https://issues.apache.org/jira/browse/HIVE-6683 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Jeremy Beard Beeline fails to read queries where lines have comments at the end. This works in the embedded Hive CLI. Example:
SELECT 1 -- this is a comment about this value
FROM table;
Error: Error while processing statement: FAILED: ParseException line 1:36 mismatched input 'EOF' expecting FROM near '1' in from clause (state=42000,code=4) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6682) nonstaged mapjoin table memory check may be broken
Sergey Shelukhin created HIVE-6682: -- Summary: nonstaged mapjoin table memory check may be broken Key: HIVE-6682 URL: https://issues.apache.org/jira/browse/HIVE-6682 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin We are getting the below error from the task, while the staged load works correctly. We don't set the memory threshold that low, so it seems the settings are just not handled correctly. This seems to always trigger on the first check. Given that the map task might have a bunch more stuff, not just the hashmap, we may also need to adjust the memory check (e.g. have separate configs).
{noformat}
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197
	at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:104)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:150)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:165)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
	... 8 more
Caused by: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197
	at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
	at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:248)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
	at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:375)
	at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:346)
	at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:147)
	at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:82)
	... 15 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6684) Beeline does not accept comments that are preceded by spaces
Jeremy Beard created HIVE-6684: -- Summary: Beeline does not accept comments that are preceded by spaces Key: HIVE-6684 URL: https://issues.apache.org/jira/browse/HIVE-6684 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Jeremy Beard Beeline throws an error if single-line comments are indented with spaces. This works in the embedded Hive CLI. For example:
SELECT
   -- this is the field we want
   field
FROM
   table;
Error: Error while processing statement: FAILED: ParseException line 1:71 cannot recognize input near 'EOF' 'EOF' 'EOF' in select clause (state=42000,code=4) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6681: --- Status: Open (was: Patch Available) Describe table sometimes shows from deserializer for column comments -- Key: HIVE-6681 URL: https://issues.apache.org/jira/browse/HIVE-6681 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Affects Versions: 0.12.0, 0.11.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6682) nonstaged mapjoin table memory check may be broken
[ https://issues.apache.org/jira/browse/HIVE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938155#comment-13938155 ] Sergey Shelukhin commented on HIVE-6682: I suspect that hashtableMemoryUsage is desc is 0 -- This message was sent by Atlassian JIRA (v6.2#6252)
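[Editor's note] Sergey's suspicion (hashtableMemoryUsage in the descriptor being 0) would explain the symptom directly: a used-fraction-of-heap check against a zero threshold trips on the very first check, even at only ~0.197 heap usage as in the trace. A toy version of that kind of check, with invented names (not the actual MapJoinMemoryExhaustionHandler code):

```java
// Toy model of a percentage-of-heap memory check like the one in the trace.
// A threshold that ends up as 0 (e.g. an unread config) fails immediately.
public class MemCheckSketch {
    static boolean exceeds(double usedFraction, double maxFraction) {
        return usedFraction > maxFraction;
    }

    public static void main(String[] args) {
        System.out.println(exceeds(0.197, 0.90)); // sane threshold: false
        System.out.println(exceeds(0.197, 0.0));  // zero threshold: true, trips at once
    }
}
```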
[jira] [Commented] (HIVE-6684) Beeline does not accept comments that are preceded by spaces
[ https://issues.apache.org/jira/browse/HIVE-6684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938156#comment-13938156 ] Jeremy Beard commented on HIVE-6684: The description doesn't render properly on my browser - the 2nd, 3rd and 5th lines of the example start with three spaces. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6682) nonstaged mapjoin table memory check may be broken
[ https://issues.apache.org/jira/browse/HIVE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938157#comment-13938157 ] Sergey Shelukhin commented on HIVE-6682: *in -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6681: --- Status: Patch Available (was: Open) Describe table sometimes shows from deserializer for column comments -- Key: HIVE-6681 URL: https://issues.apache.org/jira/browse/HIVE-6681 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Affects Versions: 0.12.0, 0.11.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.4.patch, HIVE-6681.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6681: --- Attachment: HIVE-6681.4.patch Describe table sometimes shows from deserializer for column comments -- Key: HIVE-6681 URL: https://issues.apache.org/jira/browse/HIVE-6681 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.4.patch, HIVE-6681.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6489) Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership
[ https://issues.apache.org/jira/browse/HIVE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938184#comment-13938184 ] Joe Rao commented on HIVE-6489: --- A succinct way of wording the problem is: - Hadoop daemons create the /tmp/hive-username directory with their group ownership - User data loaded via LOAD DATA LOCAL INPATH is staged in /tmp/hive-username, inheriting its group ownership - User data is moved to the table directory, but keeps the group ownership of /tmp/hive-username The desired behavior is: - Data loaded with LOAD DATA LOCAL INPATH inherits the group ownership of the table directory This could be solved by: - Removing the need to stage the data in /tmp, OR - Adding a step to LOAD DATA LOCAL INPATH to change the group ownership, after the load completes (this one is probably an easier solution) Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership - Key: HIVE-6489 URL: https://issues.apache.org/jira/browse/HIVE-6489 Project: Hive Issue Type: Bug Components: Import/Export Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0 Environment: OS and hardware are irrelevant. Tested and reproduced on multiple configurations, including SLES, RHEL, VM, Teradata Hadoop Appliance, HDP 1.1, HDP 1.3.2, HDP 2.0. Reporter: Joe Rao Priority: Minor Original Estimate: 24h Remaining Estimate: 24h Data uploaded by user via the Hive client with the LOAD DATA LOCAL INPATH method will have group ownership of the hdfs://tmp/hive-user instead of the primary group that user belongs to. The group ownership of the hdfs://tmp/hive-user is, by default, the group that the user running the hadoop daemons run under. 
This means that, on a Hadoop system with default file permissions of 770, any data loaded to hive via the LOAD DATA LOCAL INPATH method by one user cannot be seen by another user in the same group until the group ownership is manually changed in Hive's internal directory, or the group ownership is manually changed on hdfs://tmp/hive-user. This problem is not present with the LOAD DATA INPATH method, or by using regular HDFS loads. Steps to reproduce the problem on a pseudodistributed Hadoop cluster: - In hdfs-site.xml, modify the umask to 007 (meaning that default permissions on files are 770). The property changes names in Hadoop 2.0 but used to be called dfs.umaskmode. - Restart hdfs - Create a group called testgroup. - Create two users that have testgroup as their primary group. Call them testuser1 and testuser2 - Create a test file containing Hello World and call it test.txt. It should be stored on the local filesystem. - Create a table called testtable in Hive using testuser1. Give it a single string column, textfile format, comma delimited fields. - Have testuser1 use the LOAD DATA LOCAL INPATH command to load test.txt into testtable. - Attempt to read testtable using testuser2. The read will fail on a permissions error, when it should not. - Examine the contents of the hdfs://apps/hive/warehouse/testtable directory. The file will belong to the hadoop or users or analogous group, instead of the correct group testgroup. It will have correct permissions of 770. - Change the group ownership of the folder hdfs://tmp/hive-testuser1 to testgroup. - Repeat the data load. testuser2 will now be able to correctly read the data, and the file will have the correct group ownership. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered
[ https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938186#comment-13938186 ] Jitendra Nath Pandey commented on HIVE-6518: Committed to branch-0.13 as well. Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered Key: HIVE-6518 URL: https://issues.apache.org/jira/browse/HIVE-6518 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, HIVE-6518.2.patch, HIVE-6518.3.patch The current VectorGroupByOperator implementation flushes the in-memory hashes when the maximum entries or fraction of memory is hit. This works for most cases, but there are some corner cases where we hit GC overhead limits or heap size limits before either of those conditions are reached, due to the rest of the pipeline. This patch adds a SoftReference as a GC canary. If the soft reference is dead, then a full GC pass happened sometime in the near past, so the aggregation hashtables should be flushed immediately, before another full GC is triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
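The canary pattern described above can be sketched as follows (illustrative class, not the actual HIVE-6518 patch code): the JVM clears soft references only when the collector comes under memory pressure, so finding the reference null signals that a full GC ran recently and it is time to flush.

```java
import java.lang.ref.SoftReference;

public class GcCanary {
    private SoftReference<Object> canary = new SoftReference<>(new Object());
    private int flushes = 0;

    // Called periodically, e.g. once per processed batch of rows.
    public void checkCanary() {
        if (canary.get() == null) {   // canary collected => a full GC pass happened
            flushes++;                // the real operator flushes its hash tables here
            canary = new SoftReference<>(new Object()); // plant a fresh canary
        }
    }

    public int getFlushCount() { return flushes; }
}
```

Absent memory pressure the soft reference survives indefinitely, so in the common case the check is just a cheap null test.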
[jira] [Commented] (HIVE-5687) Streaming support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938187#comment-13938187 ] Alan Gates commented on HIVE-5687: -- A few comments: * Right now you're building one lock request and re-using it. That won't work. A new lock needs to be constructed with each transaction and then associated with that transaction so that the transaction manager knows to release the lock when the transaction is committed or aborted. This should be done in beginNextTxn(). * The partition name is currently being built incorrectly. It is just using the values. It should be constructed using Warehouse.makePartName. * The lock components are being constructed incorrectly. You are building a component for every key/value pair in the partition. You should only build one component for each partition you want to lock. So in your case, each lock request will have exactly one lock component. * The file is being written to the table location instead of the partition location. When I run this with table foo and partition bar I get files in /hive/warehouse/foo instead of /hive/warehouse/foo/bar * The metaStoreClient is being prematurely closed. It certainly shouldn't be closed in createPartition. I'm not sure it should be closed at all. Streaming support in Hive - Key: HIVE-5687 URL: https://issues.apache.org/jira/browse/HIVE-5687 Project: Hive Issue Type: Sub-task Reporter: Roshan Naik Assignee: Roshan Naik Attachments: 5687-api-spec4.pdf, 5687-draft-api-spec.pdf, 5687-draft-api-spec2.pdf, 5687-draft-api-spec3.pdf, HIVE-5687.patch, HIVE-5687.v2.patch Implement support for Streaming data into HIVE. 
- Provide a client streaming API - Transaction support: Clients should be able to periodically commit a batch of records atomically - Immediate visibility: Records should be immediately visible to queries on commit - Should not overload HDFS with too many small files Use Cases: - Streaming logs into HIVE via Flume - Streaming results of computations from Storm -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6489) Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership
[ https://issues.apache.org/jira/browse/HIVE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Rao updated HIVE-6489: -- Description: Data uploaded by user via the Hive client with the LOAD DATA LOCAL INPATH method will have group ownership of the hdfs://tmp/hive-user instead of the group ownership of the table directory. The group ownership of the hdfs://tmp/hive-user is, by default, the group that the user running the hadoop daemons run under. This means that, on a Hadoop system with default file permissions of 770, any data loaded to hive via the LOAD DATA LOCAL INPATH method by one user cannot be seen by another user in the same group until the group ownership is manually changed in Hive's internal directory, or the group ownership is manually changed on hdfs://tmp/hive-user. This problem is not present with the LOAD DATA INPATH method, or by using regular HDFS loads. Steps to reproduce the problem on a pseudodistributed Hadoop cluster: - In hdfs-site.xml, modify the umask to 007 (meaning that default permissions on files are 770). The property changes names in Hadoop 2.0 but used to be called dfs.umaskmode. - Restart hdfs - Create a group called testgroup. - Create two users that have testgroup as their primary group. Call them testuser1 and testuser2 - Create a test file containing Hello World and call it test.txt. It should be stored on the local filesystem. - Create a table called testtable in Hive using testuser1. Give it a single string column, textfile format, comma delimited fields. - Have testuser1 use the LOAD DATA LOCAL INPATH command to load test.txt into testtable. - Attempt to read testtable using testuser2. The read will fail on a permissions error, when it should not. - Examine the contents of the hdfs://apps/hive/warehouse/testtable directory. The file will belong to the hadoop or users or analogous group, instead of the correct group testgroup. It will have correct permissions of 770. 
- Change the group ownership of the folder hdfs://tmp/hive-testuser1 to testgroup. - Repeat the data load. testuser2 will now be able to correctly read the data, and the file will have the correct group ownership. was: Data uploaded by user via the Hive client with the LOAD DATA LOCAL INPATH method will have group ownership of the hdfs://tmp/hive-user instead of the primary group that user belongs to. The group ownership of the hdfs://tmp/hive-user is, by default, the group that the user running the hadoop daemons run under. This means that, on a Hadoop system with default file permissions of 770, any data loaded to hive via the LOAD DATA LOCAL INPATH method by one user cannot be seen by another user in the same group until the group ownership is manually changed in Hive's internal directory, or the group ownership is manually changed on hdfs://tmp/hive-user. This problem is not present with the LOAD DATA INPATH method, or by using regular HDFS loads. Steps to reproduce the problem on a pseudodistributed Hadoop cluster: - In hdfs-site.xml, modify the umask to 007 (meaning that default permissions on files are 770). The property changes names in Hadoop 2.0 but used to be called dfs.umaskmode. - Restart hdfs - Create a group called testgroup. - Create two users that have testgroup as their primary group. Call them testuser1 and testuser2 - Create a test file containing Hello World and call it test.txt. It should be stored on the local filesystem. - Create a table called testtable in Hive using testuser1. Give it a single string column, textfile format, comma delimited fields. - Have testuser1 use the LOAD DATA LOCAL INPATH command to load test.txt into testtable. - Attempt to read testtable using testuser2. The read will fail on a permissions error, when it should not. - Examine the contents of the hdfs://apps/hive/warehouse/testtable directory. The file will belong to the hadoop or users or analogous group, instead of the correct group testgroup. 
It will have correct permissions of 770. - Change the group ownership of the folder hdfs://tmp/hive-testuser1 to testgroup. - Repeat the data load. testuser2 will now be able to correctly read the data, and the file will have the correct group ownership. Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership - Key: HIVE-6489 URL: https://issues.apache.org/jira/browse/HIVE-6489 Project: Hive Issue Type: Bug Components: Import/Export Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0 Environment: OS and hardware are irrelevant. Tested and reproduced on multiple configurations, including SLES, RHEL, VM, Teradata Hadoop Appliance, HDP 1.1, HDP 1.3.2, HDP 2.0. Reporter: Joe Rao Priority: Minor Original Estimate: 24h Remaining Estimate: 24h
[jira] [Updated] (HIVE-6518) Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered
[ https://issues.apache.org/jira/browse/HIVE-6518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6518: --- Fix Version/s: (was: 0.14.0) 0.13.0 Add a GC canary to the VectorGroupByOperator to flush whenever a GC is triggered Key: HIVE-6518 URL: https://issues.apache.org/jira/browse/HIVE-6518 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6518.1-tez.patch, HIVE-6518.2-tez.patch, HIVE-6518.2.patch, HIVE-6518.3.patch The current VectorGroupByOperator implementation flushes the in-memory hashes when the maximum entries or fraction of memory is hit. This works for most cases, but there are some corner cases where we hit GC overhead limits or heap size limits before either of those conditions are reached, due to the rest of the pipeline. This patch adds a SoftReference as a GC canary. If the soft reference is dead, then a full GC pass happened sometime in the near past, so the aggregation hashtables should be flushed immediately, before another full GC is triggered. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938194#comment-13938194 ] Jitendra Nath Pandey commented on HIVE-6680: Ran tests locally. Only failures were show_create_table_serde.q and metadata_only_queries_with_filters.q which are unrelated to this patch. Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938196#comment-13938196 ] Jitendra Nath Pandey commented on HIVE-6664: Ran tests locally. Only failures were show_create_table_serde.q and metadata_only_queries_with_filters.q, which are unrelated to this patch. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch The following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double up front to calculate the sum of values when computing variance. Vector mode, however, performs the local aggregate sum as decimal and converts to double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
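The divergence is easy to reproduce outside Hive (made-up values, not the actual aggregation code): converting each value to double before summing accumulates rounding error that summing in decimal and converting once at the end avoids.

```java
import java.math.BigDecimal;

public class SumOrderDemo {
    public static void main(String[] args) {
        double rowModeSum = 0.0;                    // row mode: convert to double up front, then sum
        BigDecimal vectorModeSum = BigDecimal.ZERO; // vector mode: sum as decimal
        for (int i = 0; i < 10; i++) {
            rowModeSum += 0.1;
            vectorModeSum = vectorModeSum.add(new BigDecimal("0.1"));
        }
        System.out.println(rowModeSum);                  // 0.9999999999999999
        System.out.println(vectorModeSum.doubleValue()); // 1.0 (converted only at flush)
    }
}
```

Variance is computed from such sums, so the two accumulation orders can yield slightly different results for the same input.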
[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938195#comment-13938195 ] Jitendra Nath Pandey commented on HIVE-6639: Ran tests locally. Only failures were show_create_table_serde.q and metadata_only_queries_with_filters.q which are unrelated to this patch. Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938193#comment-13938193 ] Jitendra Nath Pandey commented on HIVE-6649: Ran tests locally. Only failures were show_create_table_serde.q and metadata_only_queries_with_filters.q which are unrelated to this patch. Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.init(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938218#comment-13938218 ] Jitendra Nath Pandey commented on HIVE-6649: I have committed this to trunk. [~rhbutani] This bug affects hive-0.13 and fails several vectorized queries on DATE. This should be fixed in branch-0.13 as well. Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.init(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization
[ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6455: - Attachment: HIVE-6455.18.patch Fix for test failures. Scalable dynamic partitioning and bucketing optimization Key: HIVE-6455 URL: https://issues.apache.org/jira/browse/HIVE-6455 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: optimization Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.10.patch, HIVE-6455.10.patch, HIVE-6455.11.patch, HIVE-6455.12.patch, HIVE-6455.13.patch, HIVE-6455.13.patch, HIVE-6455.14.patch, HIVE-6455.15.patch, HIVE-6455.16.patch, HIVE-6455.17.patch, HIVE-6455.17.patch.txt, HIVE-6455.18.patch, HIVE-6455.2.patch, HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch, HIVE-6455.5.patch, HIVE-6455.6.patch, HIVE-6455.7.patch, HIVE-6455.8.patch, HIVE-6455.9.patch, HIVE-6455.9.patch The current implementation of dynamic partitioning works by keeping at least one record writer open per dynamic partition directory. In case of bucketing there can be multi-spray file writers, which further add to the number of open record writers. The record writers of column-oriented file formats (like ORC, RCFile, etc.) keep in-memory buffers (value buffers or compression buffers) open all the time to buffer up rows and compress them before flushing to disk. Since these buffers are maintained on a per-column basis, the amount of constant memory required at runtime increases as the number of partitions and the number of columns per partition increase. This often leads to OutOfMemory (OOM) exceptions in mappers or reducers, depending on the number of open record writers. Users often tune the JVM heap size (runtime memory) to get past such OOM issues. With this optimization, the dynamic partition columns and bucketing columns (in case of bucketed tables) are sorted before being fed to the reducers. 
Since the partitioning and bucketing columns are sorted, each reducer can keep only one record writer open at any time, thereby reducing memory pressure on the reducers. This optimization scales well as the number of partitions and the number of columns per partition increase, at the cost of sorting the columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
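The effect of the sort can be sketched as follows (hypothetical class, not the Hive operator): because rows for the same partition arrive contiguously, the previous writer can be closed as soon as the partition key changes, so at most one writer is ever open.

```java
import java.util.Arrays;

public class SortedPartitionWriterSketch {
    private String currentKey = null;
    private int open = 0, maxOpen = 0;

    // Rows arrive sorted by partition key, so a key change means the
    // previous partition is complete and its writer can be closed.
    public void write(String partitionKey, String row) {
        if (!partitionKey.equals(currentKey)) {
            if (currentKey != null) open--;   // close the finished partition's writer
            open++;                           // open a writer for the new partition
            maxOpen = Math.max(maxOpen, open);
            currentKey = partitionKey;
        }
        // append row via the single open record writer
    }

    public int getMaxOpenWriters() { return maxOpen; }

    public static void main(String[] args) {
        SortedPartitionWriterSketch w = new SortedPartitionWriterSketch();
        for (String k : Arrays.asList("2014-03-14", "2014-03-14", "2014-03-15", "2014-03-16")) {
            w.write(k, "row");
        }
        System.out.println(w.getMaxOpenWriters()); // 1 — sorted input needs one writer at a time
    }
}
```

With unsorted input the same keys would interleave and a writer per partition would have to stay open, which is exactly the memory problem the sort removes.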
Re: Review Request 19247: hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
On March 15, 2014, 12:19 a.m., Xuefu Zhang wrote: hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java, line 275 https://reviews.apache.org/r/19247/diff/1/?file=520087#file520087line275 Since the problem comes from upper layer, I'm wondering if this should be fixed there instead. Brock Noland wrote: Yeah I think ideally we should support this case. However, it might make sense to commit a temporary fix and resolve the larger issue in a follow-on jira. Nick, how impactful is this to the hbase use case? I agree. Proper fix is in higher layer. However, to get this in for 0.13 we can have this quick fix and track the proper fix in another jira. w.r.t impact I don't see this resulting in reduced functionality or performance in any way. Nick, can you re-upload the patch so that Hive QA picks it up. - Ashutosh --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19247/#review37298 --- On March 14, 2014, 11:53 p.m., nick dimiduk wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19247/ --- (Updated March 14, 2014, 11:53 p.m.) Review request for hive. Bugs: HIVE-6650 https://issues.apache.org/jira/browse/HIVE-6650 Repository: hive-git Description --- With the above enabled, where clauses including non-rowkey columns cannot be used with the HBaseStorageHandler. See ticket for full details. Diffs - hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 704fcb9 hbase-handler/src/test/queries/positive/hbase_pushdown.q 69a4536 hbase-handler/src/test/results/positive/hbase_pushdown.q.out ee9e20c Diff: https://reviews.apache.org/r/19247/diff/ Testing --- Included patch allows said query to run. Waiting on build bot for test suite. Thanks, nick dimiduk
Review Request 19319: Fix from deserialization in comments.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19319/ --- Review request for hive and Gunther Hagleitner. Bugs: HIVE-6681 https://issues.apache.org/jira/browse/HIVE-6681 Repository: hive-git Description --- Fix from deserialization comments. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 56d68f5 contrib/src/test/results/clientpositive/fileformat_base64.q.out 3489270 hbase-handler/src/test/results/positive/hbase_stats.q.out f05b8ff hbase-handler/src/test/results/positive/hbase_stats2.q.out f05b8ff hbase-handler/src/test/results/positive/hbase_stats3.q.out 0467f8a hbase-handler/src/test/results/positive/hbase_stats_empty_partition.q.out 6987f64 itests/test-serde/src/main/java/org/apache/hadoop/hive/serde2/TestSerDe.java 23e67e3 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 06c216f metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java cee1637 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 42df435 ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 9a6e336 ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 45ad315 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java de04cca ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 73603ab ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java b569ed0 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 98c511e ql/src/test/results/clientnegative/alter_partition_coltype_2columns.q.out ed36d5d ql/src/test/results/clientnegative/alter_partition_coltype_invalidcolname.q.out c6913b9 ql/src/test/results/clientnegative/alter_partition_coltype_invalidtype.q.out 740982c ql/src/test/results/clientnegative/alter_view_as_select_with_partition.q.out e4c2071 ql/src/test/results/clientnegative/desc_failure2.q.out a2e3b36 ql/src/test/results/clientnegative/protectmode_part_no_drop.q.out 2f0fabc 
ql/src/test/results/clientnegative/protectmode_tbl2.q.out e44ae1a ql/src/test/results/clientnegative/protectmode_tbl3.q.out b05a371 ql/src/test/results/clientnegative/protectmode_tbl4.q.out 77c2d0e ql/src/test/results/clientnegative/protectmode_tbl5.q.out f078c91 ql/src/test/results/clientnegative/protectmode_tbl_no_drop.q.out d451fc7 ql/src/test/results/clientnegative/stats_partialscan_autogether.q.out cee6b53 ql/src/test/results/clientpositive/alter1.q.out 2b5ddba ql/src/test/results/clientpositive/alter2.q.out 54fa503 ql/src/test/results/clientpositive/alter3.q.out f13e0e4 ql/src/test/results/clientpositive/alter4.q.out 001b922 ql/src/test/results/clientpositive/alter5.q.out d36d40b ql/src/test/results/clientpositive/alter_merge_2.q.out 12c3b09 ql/src/test/results/clientpositive/alter_merge_stats.q.out 0dc693d ql/src/test/results/clientpositive/alter_numbuckets_partitioned_table.q.out 583fc54 ql/src/test/results/clientpositive/alter_numbuckets_partitioned_table2.q.out b79a4d4 ql/src/test/results/clientpositive/alter_partition_clusterby_sortby.q.out 18dd11c ql/src/test/results/clientpositive/alter_partition_coltype.q.out 30d53cc ql/src/test/results/clientpositive/alter_partition_format_loc.q.out eab0cd8 ql/src/test/results/clientpositive/alter_skewed_table.q.out 70e00f4 ql/src/test/results/clientpositive/alter_table_not_sorted.q.out 07fcb32 ql/src/test/results/clientpositive/alter_table_serde.q.out 2c25f67 ql/src/test/results/clientpositive/alter_table_serde2.q.out 347512c ql/src/test/results/clientpositive/alter_view_as_select.q.out 686ceb3 ql/src/test/results/clientpositive/alter_view_rename.q.out 0aadc34 ql/src/test/results/clientpositive/authorization_7.q.out c9468be ql/src/test/results/clientpositive/authorization_8.q.out 484f179 ql/src/test/results/clientpositive/autogen_colalias.q.out 5fe1543 ql/src/test/results/clientpositive/ba_table1.q.out 0877dd9 ql/src/test/results/clientpositive/ba_table2.q.out 4174287 
ql/src/test/results/clientpositive/ba_table_union.q.out 8846a01 ql/src/test/results/clientpositive/binary_table_bincolserde.q.out bb3c296 ql/src/test/results/clientpositive/binary_table_colserde.q.out a8b195f ql/src/test/results/clientpositive/bucket4.q.out 3f21a5a ql/src/test/results/clientpositive/bucket5.q.out c2fc368 ql/src/test/results/clientpositive/bucket_groupby.q.out 10dcb6e ql/src/test/results/clientpositive/bucketmapjoin7.q.out dfa1fb2 ql/src/test/results/clientpositive/char_nested_types.q.out 8a01d34 ql/src/test/results/clientpositive/columnarserde_create_shortcut.q.out 0a45736 ql/src/test/results/clientpositive/combine3.q.out 4fb7930 ql/src/test/results/clientpositive/convert_enum_to_string.q.out 3d504c1
[jira] [Commented] (HIVE-6641) optimized HashMap keys won't work correctly with decimals
[ https://issues.apache.org/jira/browse/HIVE-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938290#comment-13938290 ] Gunther Hagleitner commented on HIVE-6641: -- +1 LGTM. However, can you open a follow-up JIRA to properly fix this? Also, are you sure this is the only datatype suffering from this? optimized HashMap keys won't work correctly with decimals - Key: HIVE-6641 URL: https://issues.apache.org/jira/browse/HIVE-6641 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6641.patch Decimal values can be equal while having different byte representations (different precision/scale), so comparing bytes is not enough. For a quick fix, we can disable this for decimals. -- This message was sent by Atlassian JIRA (v6.2#6252)
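The failure mode described in HIVE-6641 can be reproduced outside Hive with `java.math.BigDecimal` (Decimal128 is Hive's own class; this is only an analogy): two numerically equal decimals can carry different scales, so their serialized bytes differ and a byte-comparing hash key treats them as distinct.

```java
import java.math.BigDecimal;
import java.util.Arrays;

// Standalone sketch (not Hive code) of the bug: numerically equal
// decimals with different scales serialize to different bytes, so a
// HashMap keyed on raw bytes would treat them as distinct keys.
public class DecimalKeyDemo {
    // True when both decimals serialize to identical unscaled bytes,
    // i.e. when a byte-comparing key would consider them the same.
    static boolean sameKeyBytes(BigDecimal a, BigDecimal b) {
        return Arrays.equals(a.unscaledValue().toByteArray(),
                             b.unscaledValue().toByteArray());
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.5");   // unscaled 15, scale 1
        BigDecimal b = new BigDecimal("1.50");  // unscaled 150, scale 2
        System.out.println(a.compareTo(b) == 0); // true: numerically equal
        System.out.println(sameKeyBytes(a, b));  // false: bytes differ
    }
}
```

This is why the quick fix disables the optimized byte-wise keys for decimals: equality of the value does not imply equality of the bytes.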
[jira] [Commented] (HIVE-6620) UDF printf doesn't take either CHAR or VARCHAR as the first argument
[ https://issues.apache.org/jira/browse/HIVE-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938295#comment-13938295 ] Hive QA commented on HIVE-6620: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12635011/HIVE-6620.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5401 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1860/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1860/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12635011 UDF printf doesn't take either CHAR or VARCHAR as the first argument Key: HIVE-6620 URL: https://issues.apache.org/jira/browse/HIVE-6620 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6620.1.patch, HIVE-6620.patch, HIVE-6620.patch, HIVE-6620.patch
{code}
hive> desc vc;
OK
c       char(5)     None
vc      varchar(7)  None
s       string      None
hive> select printf(c) from vc;
FAILED: SemanticException [Error 10016]: Line 1:14 Argument type mismatch 'c': Argument 1 of function PRINTF must be string, but char(5) was found.
{code}
However, if the argument is of string type, the query runs successfully. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6620) UDF printf doesn't take either CHAR or VARCHAR as the first argument
[ https://issues.apache.org/jira/browse/HIVE-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938351#comment-13938351 ] Xuefu Zhang commented on HIVE-6620: --- The above test failure is transient and unrelated. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6620) UDF printf doesn't take either CHAR or VARCHAR as the first argument
[ https://issues.apache.org/jira/browse/HIVE-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6620: -- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks to Jason for the review. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6682) nonstaged mapjoin table memory check may be broken
[ https://issues.apache.org/jira/browse/HIVE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6682: --- Status: Patch Available (was: Open) nonstaged mapjoin table memory check may be broken -- Key: HIVE-6682 URL: https://issues.apache.org/jira/browse/HIVE-6682 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6682.patch We are getting the below error from the task while the staged load works correctly. We don't set the memory threshold that low, so it seems the settings are just not handled correctly. This seems to always trigger on the first check. Given that the map task might have a bunch more stuff, not just the hashmap, we may also need to adjust the memory check (e.g. have separate configs).
{noformat}
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197
	at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:104)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:150)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:165)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
	... 8 more
Caused by: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197
	at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
	at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:248)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
	at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:375)
	at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:346)
	at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:147)
	at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:82)
	... 15 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
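The report above fails at a usage fraction of 0.197, which a sanely configured threshold should allow, suggesting the configured limit is not the one actually applied in the non-staged path. As a hypothetical sketch (method and parameter names here are illustrative, not Hive's actual ones), the kind of check `MapJoinMemoryExhaustionHandler` performs looks roughly like this:

```java
// Illustrative sketch of a heap-usage threshold check of the kind the
// mapjoin hash-table load performs: fail once used heap exceeds a
// configured fraction of the maximum heap.
public class MemoryCheckDemo {
    // Fraction of the maximum heap currently in use.
    static double usedHeapFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    // Throws when the observed usage fraction exceeds the threshold.
    static void checkMemoryStatus(double pct, double maxFraction, long rows) {
        if (pct > maxFraction) {
            throw new RuntimeException(String.format(
                "Processing rows: %d Memory usage percentage: %.3f (limit %.3f)",
                rows, pct, maxFraction));
        }
    }

    public static void main(String[] args) {
        // With the intended limit (e.g. 0.9), 0.197 usage passes cleanly;
        // the bug is in which limit the non-staged load path picks up.
        checkMemoryStatus(0.197, 0.9, 20);
        System.out.println("ok");
    }
}
```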
[jira] [Updated] (HIVE-6682) nonstaged mapjoin table memory check may be broken
[ https://issues.apache.org/jira/browse/HIVE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6682: --- Attachment: HIVE-6682.patch Fix. Attached q file fails w/o fix for me but passes with fix -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6682) nonstaged mapjoin table memory check may be broken
[ https://issues.apache.org/jira/browse/HIVE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938358#comment-13938358 ] Sergey Shelukhin commented on HIVE-6682: [~navis] [~hagleitn] do you guys mind taking a look? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
[ https://issues.apache.org/jira/browse/HIVE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6685: Summary: Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments (was: Beeline throws ArrayIndexOutOfBoundsException) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments -- Key: HIVE-6685 URL: https://issues.apache.org/jira/browse/HIVE-6685 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.12.0 Reporter: Szehon Ho Assignee: Szehon Ho Noticed that there is an ugly ArrayIndexOutOfBoundsException for mismatched arguments in the beeline prompt. It would be nice to clean this up. Example:
{noformat}
beeline -u szehon -p
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3
	at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:560)
	at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:628)
	at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:366)
	at org.apache.hive.beeline.BeeLine.main(BeeLine.java:349)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
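The crash happens when an option that expects a value is the last argument. A minimal sketch of the fix direction (not BeeLine's actual `initArgs` code; option names and the usage text are illustrative):

```java
// Sketch: validate that an option expecting a value is actually followed
// by one before reading args[i + 1], and show usage instead of letting
// ArrayIndexOutOfBoundsException escape to the user.
public class ArgsCheckDemo {
    static String parse(String[] args) {
        for (int i = 0; i < args.length; i++) {
            String opt = args[i];
            if (opt.equals("-u") || opt.equals("-n") || opt.equals("-p")) {
                if (i + 1 >= args.length) { // mismatched arguments
                    return "usage";          // print usage, don't crash
                }
                i++;                         // consume the option's value
            }
        }
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(parse(new String[] {"-u", "szehon", "-p"}));
        System.out.println(parse(new String[] {"-u", "jdbc:hive2://host:10000"}));
    }
}
```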
[jira] [Created] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException
Szehon Ho created HIVE-6685: --- Summary: Beeline throws ArrayIndexOutOfBoundsException Key: HIVE-6685 URL: https://issues.apache.org/jira/browse/HIVE-6685 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.12.0 Reporter: Szehon Ho Assignee: Szehon Ho -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
[ https://issues.apache.org/jira/browse/HIVE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6685: Attachment: HIVE-6685.patch Attaching a simple fix. Review board is not responding for me, will try again later. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
[ https://issues.apache.org/jira/browse/HIVE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6685: Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6176) Beeline gives bogus error message if an unaccepted command line option is given
[ https://issues.apache.org/jira/browse/HIVE-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang resolved HIVE-6176. --- Resolution: Duplicate Beeline gives bogus error message if an unaccepted command line option is given --- Key: HIVE-6176 URL: https://issues.apache.org/jira/browse/HIVE-6176 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang
{code}
$ beeline -o
-o (No such file or directory)
Beeline version 0.13.0-SNAPSHOT by Apache Hive
beeline>
{code}
The message suggests that beeline accepts a file (without the -f option) while it enters interactive mode anyway. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6176) Beeline gives bogus error message if an unaccepted command line option is given
[ https://issues.apache.org/jira/browse/HIVE-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938384#comment-13938384 ] Xuefu Zhang commented on HIVE-6176: --- HIVE-6652 is a dup of this. Since a patch is available for that, and there isn't much here, this JIRA is closed and HIVE-6652 is used to track the issue. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6686) webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem.
Eugene Koifman created HIVE-6686: Summary: webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem. Key: HIVE-6686 URL: https://issues.apache.org/jira/browse/HIVE-6686 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6578: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) committed to trunk and 13 Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Fix For: 0.13.0 Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Hsu updated HIVE-6570: -- Release Note: This patch adds Hive variable substitution support to the source command. For example, you will now be able to use a statement such as: source ${hivevar:test-dir}/test.q; Added a Release Note explaining the changes in this patch. Hive variable substitution does not work with the source command -- Key: HIVE-6570 URL: https://issues.apache.org/jira/browse/HIVE-6570 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6570.1.patch The following does not work: {code} source ${hivevar:test-dir}/test.q; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
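The substitution the patch enables can be sketched with a small regex expander (this is illustrative only, not Hive's actual `VariableSubstitution` code): expand `${hivevar:name}` references in the argument of `source` before resolving the file path.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of expanding ${hivevar:name} references in a
// command string such as the argument of the `source` command.
public class HivevarSubstDemo {
    private static final Pattern VAR = Pattern.compile("\\$\\{hivevar:([^}]+)\\}");

    static String substitute(String cmd, Map<String, String> hivevars) {
        Matcher m = VAR.matcher(cmd);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Leave unknown variables untouched rather than failing.
            String val = hivevars.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(val));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String cmd = "source ${hivevar:test-dir}/test.q;";
        System.out.println(substitute(cmd, Map.of("test-dir", "/tmp/queries")));
    }
}
```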
[jira] [Updated] (HIVE-6686) webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem.
[ https://issues.apache.org/jira/browse/HIVE-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6686: - Status: Patch Available (was: Open) webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem. -- Key: HIVE-6686 URL: https://issues.apache.org/jira/browse/HIVE-6686 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6686.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938407#comment-13938407 ] Jitendra Nath Pandey commented on HIVE-6664: I have committed this to trunk. [~rhbutani] This bug affects hive-0.13 and causes different results than row-mode execution. This should be fixed in branch-0.13 as well. Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch Following query can show the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that row mode converts the decimal value to double upfront to calculate sum of values, when computing variance. But the vector mode performs local aggregate sum as decimal and converts into double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
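The root cause described in the comment above, converting each decimal row to double before summing versus summing exactly and converting once at flush, can be illustrated standalone with `BigDecimal` (this is an analogy, not Hive's vectorized aggregation code):

```java
import java.math.BigDecimal;

// Summing per-row doubles accumulates floating-point error (row mode),
// while summing exact decimals and converting once at the end does not
// (vector mode) -- hence the differing variance results.
public class DecimalSumDemo {
    static double rowModeSum(String value, int rows) {
        double sum = 0.0;
        for (int i = 0; i < rows; i++) {
            sum += Double.parseDouble(value);     // convert first, then sum
        }
        return sum;
    }

    static double vectorModeSum(String value, int rows) {
        BigDecimal sum = BigDecimal.ZERO;
        for (int i = 0; i < rows; i++) {
            sum = sum.add(new BigDecimal(value)); // sum exactly...
        }
        return sum.doubleValue();                 // ...convert once at flush
    }

    public static void main(String[] args) {
        System.out.println(rowModeSum("0.1", 10));    // 0.9999999999999999
        System.out.println(vectorModeSum("0.1", 10)); // 1.0
    }
}
```

Feeding two slightly different sums into the variance formula then yields the row-mode vs vector-mode discrepancy the query above exposes.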
[jira] [Created] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name
Laljo John Pullokkaran created HIVE-6687: --- Summary: JDBC ResultSet fails to get value by qualified projection name Key: HIVE-6687 URL: https://issues.apache.org/jira/browse/HIVE-6687 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.12.1 Getting a value from the result set using a fully qualified name throws an exception. The only workaround today is to use the position of the column as opposed to the column label.
{code}
String sql = "select r1.x, r2.x from r1 join r2 on r1.y = r2.y";
ResultSet res = stmt.executeQuery(sql);
res.getInt("r1.x");
{code}
The fix is to correct the result-set schema in the semantic analyzer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6570) Hive variable substitution does not work with the source command
[ https://issues.apache.org/jira/browse/HIVE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938415#comment-13938415 ] Ashutosh Chauhan commented on HIVE-6570: [~appodictic] Let us know if you have any more feedback. Hive variable substitution does not work with the source command -- Key: HIVE-6570 URL: https://issues.apache.org/jira/browse/HIVE-6570 Project: Hive Issue Type: Bug Reporter: Anthony Hsu Assignee: Anthony Hsu Attachments: HIVE-6570.1.patch The following does not work: {code} source ${hivevar:test-dir}/test.q; {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938411#comment-13938411 ] Jitendra Nath Pandey commented on HIVE-6680: Committed to trunk. It is a correctness bug affecting hive-0.13, therefore I will port it to hive-13 branch as well. Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
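The adjustment the patch adds can be illustrated with `java.math.BigDecimal` (Decimal128 is Hive's own fast decimal class, so this is only an analogy): when an update changes a value's scale, the unscaled value must be rescaled to match, or the number's magnitude silently changes.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Analogy: forcing a new scale must also rescale the unscaled value so
// the numeric value is preserved -- the adjustment the patch performs.
public class RescaleDemo {
    static BigDecimal rescale(BigDecimal d, int targetScale) {
        return d.setScale(targetScale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("1.5");    // unscaled 15, scale 1
        BigDecimal r = rescale(d, 3);            // unscaled 1500, scale 3
        System.out.println(r.unscaledValue());   // 1500: adjusted with the scale
        System.out.println(d.compareTo(r) == 0); // true: same numeric value
    }
}
```

Keeping the unscaled value at 15 while declaring scale 3 would make the value 0.015 instead of 1.500, which is the class of bug being fixed.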
[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name
[ https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-6687: - Description: Getting a value from the result set using a fully qualified name throws an exception. The only workaround today is to use the position of the column as opposed to the column label.
{code}
String sql = "select r1.x, r2.x from r1 join r2 on r1.y = r2.y";
ResultSet res = stmt.executeQuery(sql);
res.getInt("r1.x");
{code}
res.getInt("r1.x") throws an unknown column exception even though the SQL specifies the column. The fix is to correct the result-set schema in the semantic analyzer. JDBC ResultSet fails to get value by qualified projection name -- Key: HIVE-6687 URL: https://issues.apache.org/jira/browse/HIVE-6687 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.12.1 -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 19322: HIVE-6685 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/ --- Review request for hive and Xuefu Zhang. Repository: hive-git Description --- Improving the error-handling in ArrayIndexOutOfBoundsException of Beeline. Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 3482186 Diff: https://reviews.apache.org/r/19322/diff/ Testing --- Manual test. In this scenario it now returns a proper usage message, like: beeline -u Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as ... Thanks, Szehon Ho
Re: Review Request 19322: HIVE-6685 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/ --- (Updated March 17, 2014, 9:41 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-6685 https://issues.apache.org/jira/browse/HIVE-6685 Repository: hive-git Description --- Improving the error-handling in ArrayIndexOutOfBoundsException of Beeline. Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 3482186 Diff: https://reviews.apache.org/r/19322/diff/ Testing (updated) --- Manual test. Now, in this scenario it will display the usage like: beeline -u Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as ... Thanks, Szehon Ho
Re: Review Request 19322: HIVE-6685 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/#review37478 --- beeline/src/java/org/apache/hive/beeline/BeeLine.java https://reviews.apache.org/r/19322/#comment69011 I think we should prevent the exception from happening rather than capturing it after it happens, if this is possible. Also, I'm not sure if this will give enough error message to the user. - Xuefu Zhang On March 17, 2014, 9:41 p.m., Szehon Ho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/ --- (Updated March 17, 2014, 9:41 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-6685 https://issues.apache.org/jira/browse/HIVE-6685 Repository: hive-git Description --- Improving the error-handling in ArrayIndexOutOfBoundsException of Beeline. Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 3482186 Diff: https://reviews.apache.org/r/19322/diff/ Testing --- Manual test. Now, in this scenario it will display the usage like: beeline -u Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as ... Thanks, Szehon Ho
[jira] [Commented] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
[ https://issues.apache.org/jira/browse/HIVE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938433#comment-13938433 ] Xuefu Zhang commented on HIVE-6685: --- [~szehon] Thanks for working on this. I have some review comments on review board for your consideration. Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments -- Key: HIVE-6685 URL: https://issues.apache.org/jira/browse/HIVE-6685 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.12.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6685.patch Noticed that there is an ugly ArrayIndexOutOfBoundsException for mismatched arguments in beeline prompt. It would be nice to cleanup. Example: {noformat} beeline -u szehon -p Exception in thread main java.lang.ArrayIndexOutOfBoundsException: 3 at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:560) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:628) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:366) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:349) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6686) webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem.
[ https://issues.apache.org/jira/browse/HIVE-6686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6686: - Attachment: HIVE-6686.patch no precommit test webhcat does not honour -Dlog4j.configuration=$WEBHCAT_LOG4J of log4j.properties file on local filesystem. -- Key: HIVE-6686 URL: https://issues.apache.org/jira/browse/HIVE-6686 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6686.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6681: --- Status: Open (was: Patch Available) Describe table sometimes shows from deserializer for column comments -- Key: HIVE-6681 URL: https://issues.apache.org/jira/browse/HIVE-6681 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Affects Versions: 0.12.0, 0.11.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.4.patch, HIVE-6681.5.patch, HIVE-6681.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6681: --- Status: Patch Available (was: Open) Describe table sometimes shows from deserializer for column comments -- Key: HIVE-6681 URL: https://issues.apache.org/jira/browse/HIVE-6681 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Affects Versions: 0.12.0, 0.11.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.4.patch, HIVE-6681.5.patch, HIVE-6681.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6681) Describe table sometimes shows from deserializer for column comments
[ https://issues.apache.org/jira/browse/HIVE-6681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6681: --- Attachment: HIVE-6681.5.patch Describe table sometimes shows from deserializer for column comments -- Key: HIVE-6681 URL: https://issues.apache.org/jira/browse/HIVE-6681 Project: Hive Issue Type: Bug Components: Metastore, Serializers/Deserializers Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6681.2.patch, HIVE-6681.3.patch, HIVE-6681.4.patch, HIVE-6681.5.patch, HIVE-6681.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19322: HIVE-6685 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
On March 17, 2014, 9:44 p.m., Xuefu Zhang wrote: beeline/src/java/org/apache/hive/beeline/BeeLine.java, line 573 https://reviews.apache.org/r/19322/diff/1/?file=524852#file524852line573 I think we should prevent the exception from happening rather than capturing it after it happens, if this is possible. Also, I'm not sure if this will give enough error message to the user. Thanks Xuefu for looking at this. Yeah, that's a good point, but I couldn't think of a way to do it cleanly without going into each if statement that references args[i++..] and adding a case there. Is there any way you were thinking of? - Szehon --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/#review37478 --- On March 17, 2014, 9:41 p.m., Szehon Ho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/ --- (Updated March 17, 2014, 9:41 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-6685 https://issues.apache.org/jira/browse/HIVE-6685 Repository: hive-git Description --- Improving the error-handling in ArrayIndexOutOfBoundsException of Beeline. Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 3482186 Diff: https://reviews.apache.org/r/19322/diff/ Testing --- Manual test. Now, in this scenario it will display the usage like: beeline -u Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as ... Thanks, Szehon Ho
[jira] [Resolved] (HIVE-6351) Support Pluggable Authentication Modules for HiveServer2 running in http mode
[ https://issues.apache.org/jira/browse/HIVE-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta resolved HIVE-6351. Resolution: Fixed Changes went in with HIVE-4764 Support Pluggable Authentication Modules for HiveServer2 running in http mode - Key: HIVE-6351 URL: https://issues.apache.org/jira/browse/HIVE-6351 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Add support for Pluggable Authentication Modules for HiveServer2 running in http mode -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6306) HiveServer2 running in http mode should support doAs functionality
[ https://issues.apache.org/jira/browse/HIVE-6306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta resolved HIVE-6306. Resolution: Fixed Changes went in with HIVE-4764 HiveServer2 running in http mode should support doAs functionality -- Key: HIVE-6306 URL: https://issues.apache.org/jira/browse/HIVE-6306 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-6306.1.patch Currently http mode does not support doAs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6318) Document SSL support added to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-6318: --- Priority: Major (was: Blocker) Document SSL support added to HiveServer2 - Key: HIVE-6318 URL: https://issues.apache.org/jira/browse/HIVE-6318 Project: Hive Issue Type: Sub-task Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 SSL support is/will be added to HiveServer2 running in both binary and http mode, in unsecured auth modes. Need to document the usage and setup. Linking relevant jiras. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6350) Support LDAP authentication for HiveServer2 in http mode
[ https://issues.apache.org/jira/browse/HIVE-6350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta resolved HIVE-6350. Resolution: Fixed Changes went in with HIVE-4764 Support LDAP authentication for HiveServer2 in http mode Key: HIVE-6350 URL: https://issues.apache.org/jira/browse/HIVE-6350 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-6350.1.patch, HIVE-6350.2.patch HiveServer2 needs to support LDAP authentication when running in http mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5768) Beeline connection cannot be closed with !close command
[ https://issues.apache.org/jira/browse/HIVE-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938460#comment-13938460 ] Xuefu Zhang commented on HIVE-5768: --- Sorry for chiming in late, but I think there are two sides to the underlying issue: 1. prefix match is used for autocomplete (when tab is pressed); 2. exact match is used for execution. Currently #1 is used for both, which causes the problem seen. However, if we switch to #2, then we will lose #1. I tried the patch, and it seems that autocomplete stops working; additionally, somehow I got an exception for !command. There might be a bit more work required for this problem. Beeline connection cannot be closed with !close command --- Key: HIVE-5768 URL: https://issues.apache.org/jira/browse/HIVE-5768 Project: Hive Issue Type: Bug Components: CLI Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-5768.1.patch.txt, HIVE-5768.2.patch.txt, HIVE-5768.3.patch.txt, HIVE-5768.4.patch.txt NO PRECOMMIT TESTS {noformat} 0: jdbc:hive2://localhost:1/db2 !close Ambiguous command: [close, closeall] {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
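The two-mode matching Xuefu describes can be sketched as follows. This is an illustrative helper, not BeeLine's actual code: prefix matching serves tab-completion, while execution prefers an exact match before falling back to a unique prefix, so "!close" runs even though "closeall" shares its prefix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of command resolution with separate rules for
// autocomplete (prefix match) and execution (exact match first).
public class CommandMatcher {
    private final List<String> commands;

    public CommandMatcher(List<String> commands) {
        this.commands = commands;
    }

    // For autocomplete: every command sharing the typed prefix.
    public List<String> complete(String prefix) {
        List<String> out = new ArrayList<>();
        for (String c : commands) {
            if (c.startsWith(prefix)) {
                out.add(c);
            }
        }
        return out;
    }

    // For execution: prefer an exact match, fall back to a unique prefix.
    public String resolve(String name) {
        if (commands.contains(name)) {
            return name; // "close" resolves even though "closeall" exists
        }
        List<String> matches = complete(name);
        if (matches.size() == 1) {
            return matches.get(0);
        }
        throw new IllegalArgumentException(
            matches.isEmpty() ? "Unknown command: " + name
                              : "Ambiguous command: " + matches);
    }
}
```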
Re: Review Request 19219: Vectorization: Partition column names are not picked up.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19219/#review37487 --- ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java https://reviews.apache.org/r/19219/#comment69029 Why did this change? no new usage that would prompt that as far as I see, and the caller casts needlessly ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java https://reviews.apache.org/r/19219/#comment69027 nit: variable is not needed ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java https://reviews.apache.org/r/19219/#comment69030 maybe there should be a hashset of names - Sergey Shelukhin On March 14, 2014, 9:23 a.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19219/ --- (Updated March 14, 2014, 9:23 a.m.) Review request for hive. Bugs: HIVE-6639 https://issues.apache.org/jira/browse/HIVE-6639 Repository: hive-git Description --- Vectorization: Partition column names are not picked up. Diffs - common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java 409a13a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToTimestamp.java 32386fe ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java c26da37 Diff: https://reviews.apache.org/r/19219/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Updated] (HIVE-6649) Vectorization: some date expressions throw exception.
[ https://issues.apache.org/jira/browse/HIVE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6649: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Vectorization: some date expressions throw exception. - Key: HIVE-6649 URL: https://issues.apache.org/jira/browse/HIVE-6649 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6649.1.patch, HIVE-6649.2.patch, HIVE-6649.2.patch, HIVE-6649.2.patch Query ran with hive.vectorized.execution.enabled=true: {code} select dt, to_date(date_add(dt, 2)), to_date(date_sub(dt, 2)), datediff(dt, date_add(dt, 2)), datediff(dt, date_sub(dt, 2)), datediff(date_add(dt, 2), date_sub(dt, 2)) from vectortab10korc limit 1; {code} fails with the following error: {noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating datediff(date_add(dt, 2), date_sub(dt, 2)) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) ... 9 more Caused by: java.lang.NullPointerException at java.lang.String.checkBounds(String.java:400) at java.lang.String.init(String.java:569) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.setDays(VectorUDFDateDiffColCol.java:254) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.copySelected(VectorUDFDateDiffColCol.java:231) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.toDateArray(VectorUDFDateDiffColCol.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:72) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:115) ... 13 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
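The NullPointerException above comes from constructing a String out of a null byte array while copying selected rows. A simplified sketch of the null-handling gap (not Hive's actual VectorUDFDateDiffColCol code): entries whose isNull flag is set may carry a null byte[] and must be skipped rather than dereferenced.

```java
// Illustrative sketch of null-safe copying between bytes columns in a
// vectorized batch: honor the isNull flags instead of touching the
// payload, which may be null for null rows.
public class NullSafeCopy {
    public static void copySelected(byte[][] src, boolean[] srcIsNull,
                                    int[] selected, int size,
                                    byte[][] dst, boolean[] dstIsNull) {
        for (int j = 0; j < size; j++) {
            int i = selected[j];
            dstIsNull[i] = srcIsNull[i];
            if (srcIsNull[i]) {
                continue; // src[i] may be null; new String(src[i]) would NPE
            }
            dst[i] = src[i];
        }
    }
}
```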
Re: Review Request 19322: HIVE-6685 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
On March 17, 2014, 9:44 p.m., Xuefu Zhang wrote: beeline/src/java/org/apache/hive/beeline/BeeLine.java, line 573 https://reviews.apache.org/r/19322/diff/1/?file=524852#file524852line573 I think we should prevent the exception from happening rather than capturing it after it happens, if this is possible. Also, I'm not sure if this will give enough error message to the user. Szehon Ho wrote: Thanks Xuefu for looking at this. Yea thats a good point, but coulnd't think of a way to do it cleanly without going into each if statement that references args[i++..] and adding a case there. Is there any way you were thinking? There are standard commandline option processors. You may also refer to Hive CLI, which doesn't seem suffering from the same problem. - Xuefu --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/#review37478 --- On March 17, 2014, 9:41 p.m., Szehon Ho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/ --- (Updated March 17, 2014, 9:41 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-6685 https://issues.apache.org/jira/browse/HIVE-6685 Repository: hive-git Description --- Improving the error-handling in ArrayIndexOutOfBoundsException of Beeline. Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 3482186 Diff: https://reviews.apache.org/r/19322/diff/ Testing --- Manual test. Now, in this scenario it will display the usage like: beeline -u Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as ... Thanks, Szehon Ho
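The prevention approach discussed above (rather than catching the exception after the fact) can be sketched like this. The class and method names are hypothetical, not the actual BeeLine fix: before consuming the value for an option such as -u, check that one more argument actually exists.

```java
// Hedged sketch of defensive argument parsing: a bounds check replaces
// catching ArrayIndexOutOfBoundsException after the fact.
public class ArgParser {
    // Returns the value following the option at optionIndex, or null if
    // it is missing so the caller can print usage instead of crashing.
    public static String valueOf(String[] args, int optionIndex) {
        if (optionIndex + 1 >= args.length) {
            return null; // e.g. "beeline -u" with no URL supplied
        }
        return args[optionIndex + 1];
    }
}
```

Xuefu's suggestion of a standard command-line option processor would make such checks unnecessary, since those libraries report missing option arguments as parse errors.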
[jira] [Updated] (HIVE-6664) Vectorized variance computation differs from row mode computation.
[ https://issues.apache.org/jira/browse/HIVE-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6664: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Vectorized variance computation differs from row mode computation. -- Key: HIVE-6664 URL: https://issues.apache.org/jira/browse/HIVE-6664 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6664.1.patch, HIVE-6664.1.patch, HIVE-6664.1.patch The following query shows the difference: select var_samp(ss_sales_price), var_pop(ss_sales_price), stddev_pop(ss_sales_price), stddev_samp(ss_sales_price) from store_sales. The reason for the difference is that, when computing variance, row mode converts each decimal value to double up front to calculate the sum of values, whereas vector mode performs the local aggregate sum as a decimal and converts to double only at flush. -- This message was sent by Atlassian JIRA (v6.2#6252)
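The effect of converting to double before versus after accumulation can be shown with a toy example (the values are made up for illustration; this is not Hive code): summing in double accumulates rounding error, while summing exactly as a decimal and converting only at the end does not.

```java
import java.math.BigDecimal;

// Demonstrates why the two aggregation paths can disagree: the order of
// the decimal-to-double conversion relative to the summation matters.
public class SumOrderDemo {
    // Row-mode style: convert each value to double up front, then sum.
    public static double sumAsDouble(String[] values) {
        double sum = 0.0;
        for (String v : values) {
            sum += Double.parseDouble(v); // rounding error accumulates
        }
        return sum;
    }

    // Vector-mode style: sum exactly as a decimal, convert at the end.
    public static double sumAsDecimal(String[] values) {
        BigDecimal sum = BigDecimal.ZERO;
        for (String v : values) {
            sum = sum.add(new BigDecimal(v)); // exact decimal accumulation
        }
        return sum.doubleValue(); // single conversion "at flush"
    }
}
```

Summing "0.1" ten times gives exactly 1.0 on the decimal path, but not on the double path; a variance built on the two sums therefore differs slightly.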
[jira] [Updated] (HIVE-6680) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value.
[ https://issues.apache.org/jira/browse/HIVE-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6680: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- Key: HIVE-6680 URL: https://issues.apache.org/jira/browse/HIVE-6680 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.13.0 Attachments: HIVE-6680.1.patch, HIVE-6680.1.patch Decimal128#update(Decimal128 o, short scale) should adjust the unscaled value. -- This message was sent by Atlassian JIRA (v6.2#6252)
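What "adjust the unscaled value" means can be sketched with BigInteger (a simplification; Decimal128's internal representation differs): when update(o, scale) changes the scale, the unscaled value must be multiplied or divided by the corresponding power of ten, not copied as-is.

```java
import java.math.BigInteger;

// Hedged sketch of rescaling a decimal's unscaled value. For example,
// 1.23 is (unscaled 123, scale 2); re-expressed at scale 4 it must
// become (unscaled 12300, scale 4), or the numeric value changes.
public class RescaleDemo {
    public static BigInteger rescale(BigInteger unscaled, int fromScale, int toScale) {
        int diff = toScale - fromScale;
        if (diff >= 0) {
            return unscaled.multiply(BigInteger.TEN.pow(diff));
        }
        return unscaled.divide(BigInteger.TEN.pow(-diff)); // truncates
    }
}
```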
[jira] [Commented] (HIVE-6613) Control when specific Inputs / Outputs are started
[ https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938510#comment-13938510 ] Gunther Hagleitner commented on HIVE-6613: -- [~sseth] the second patch seems to be missing the definition of TezCacheAccess. Also, if it's not too much trouble, a review board link would be nice. Control when specific Inputs / Outputs are started - Key: HIVE-6613 URL: https://issues.apache.org/jira/browse/HIVE-6613 Project: Hive Issue Type: Improvement Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-6613.2.txt, TEZ-6613.1.txt When running with Tez, a couple of enhancements are possible: 1) Avoid re-fetching data in the case of MapJoins, since the data is likely to be cached after the first run (container re-use for the same query). 2) Start Outputs only after the required Inputs are ready - specifically useful in the case of Reduce, where the shuffle requires a large amount of memory and the Output (if it's a sorted output) also requires a fair amount of memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6613) Control when specific Inputs / Outputs are started
[ https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6613: - Status: Open (was: Patch Available) Control when specific Inputs / Outputs are started - Key: HIVE-6613 URL: https://issues.apache.org/jira/browse/HIVE-6613 Project: Hive Issue Type: Improvement Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-6613.2.txt, TEZ-6613.1.txt When running with Tez, a couple of enhancements are possible: 1) Avoid re-fetching data in the case of MapJoins, since the data is likely to be cached after the first run (container re-use for the same query). 2) Start Outputs only after the required Inputs are ready - specifically useful in the case of Reduce, where the shuffle requires a large amount of memory and the Output (if it's a sorted output) also requires a fair amount of memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6613) Control when specific Inputs / Outputs are started
[ https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-6613: - Attachment: HIVE-6613.3.patch Updated the patch to include the missing file, and renamed it to .patch for the pre-commit build. Control when specific Inputs / Outputs are started - Key: HIVE-6613 URL: https://issues.apache.org/jira/browse/HIVE-6613 Project: Hive Issue Type: Improvement Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-6613.2.txt, HIVE-6613.3.patch, TEZ-6613.1.txt When running with Tez, a couple of enhancements are possible: 1) Avoid re-fetching data in the case of MapJoins, since the data is likely to be cached after the first run (container re-use for the same query). 2) Start Outputs only after the required Inputs are ready - specifically useful in the case of Reduce, where the shuffle requires a large amount of memory and the Output (if it's a sorted output) also requires a fair amount of memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6613) Control when specific Inputs / Outputs are started
[ https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-6613: - Status: Patch Available (was: Open) Control when specific Inputs / Outputs are started - Key: HIVE-6613 URL: https://issues.apache.org/jira/browse/HIVE-6613 Project: Hive Issue Type: Improvement Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-6613.2.txt, HIVE-6613.3.patch, TEZ-6613.1.txt When running with Tez, a couple of enhancements are possible: 1) Avoid re-fetching data in the case of MapJoins, since the data is likely to be cached after the first run (container re-use for the same query). 2) Start Outputs only after the required Inputs are ready - specifically useful in the case of Reduce, where the shuffle requires a large amount of memory and the Output (if it's a sorted output) also requires a fair amount of memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6613) Control when specific Inputs / Outputs are started
[ https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938521#comment-13938521 ] Siddharth Seth commented on HIVE-6613: -- Review board - https://reviews.apache.org/r/19327/ Control when specific Inputs / Outputs are started - Key: HIVE-6613 URL: https://issues.apache.org/jira/browse/HIVE-6613 Project: Hive Issue Type: Improvement Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-6613.2.txt, HIVE-6613.3.patch, TEZ-6613.1.txt When running with Tez, a couple of enhancements are possible: 1) Avoid re-fetching data in the case of MapJoins, since the data is likely to be cached after the first run (container re-use for the same query). 2) Start Outputs only after the required Inputs are ready - specifically useful in the case of Reduce, where the shuffle requires a large amount of memory and the Output (if it's a sorted output) also requires a fair amount of memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
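Enhancement (2) above - starting an Output only after its required Inputs are ready - can be sketched as a simple readiness gate. This is an illustrative model only, not the Tez or Hive API: the point is that shuffle memory and sorted-output memory are not held simultaneously before both are needed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: defer starting an Output until every Input it
// depends on has signalled readiness.
public class DeferredOutputStarter {
    private final Set<String> requiredInputs;
    private final Set<String> readyInputs = new HashSet<>();
    private boolean outputStarted = false;

    public DeferredOutputStarter(Set<String> requiredInputs) {
        this.requiredInputs = requiredInputs;
    }

    // Called when an Input reports ready; returns true if this call
    // triggered the Output start.
    public boolean inputReady(String name) {
        readyInputs.add(name);
        if (!outputStarted && readyInputs.containsAll(requiredInputs)) {
            outputStarted = true;
            return true;
        }
        return false;
    }

    public boolean isOutputStarted() {
        return outputStarted;
    }
}
```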
Re: Review Request 19322: HIVE-6685 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/#review37497 --- beeline/src/java/org/apache/hive/beeline/BeeLine.java https://reviews.apache.org/r/19322/#comment69036 Yea thats a valid suggestion, I'll take a look and get back. Thanks. - Szehon Ho On March 17, 2014, 9:41 p.m., Szehon Ho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/ --- (Updated March 17, 2014, 9:41 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-6685 https://issues.apache.org/jira/browse/HIVE-6685 Repository: hive-git Description --- Improving the error-handling in ArrayIndexOutOfBoundsException of Beeline. Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java 3482186 Diff: https://reviews.apache.org/r/19322/diff/ Testing --- Manual test. Now, in this scenario it will display the usage like: beeline -u Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as ... Thanks, Szehon Ho
[jira] [Updated] (HIVE-6432) Remove deprecated methods in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6432: --- Attachment: HIVE-6432.2.patch Renaming 6432-full.patch to HIVE-6432.2.patch and setting as patch available now that the earlier patch has been reverted. Remove deprecated methods in HCatalog - Key: HIVE-6432 URL: https://issues.apache.org/jira/browse/HIVE-6432 Project: Hive Issue Type: Task Components: HCatalog Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.14.0 Attachments: 6432-addendum.patch, 6432-full.patch, HIVE-6432.2.patch, HIVE-6432.patch, HIVE-6432.wip.1.patch, HIVE-6432.wip.2.patch, hcat.6432.test.out There are a lot of methods in HCatalog that have been deprecated in HCatalog 0.5, and some that were recently deprecated in Hive 0.11 (joint release with HCatalog). The goal for HCatalog deprecation is that in general, after something has been deprecated, it is expected to stay around for 2 releases, which means hive-0.13 will be the last release to ship with all the methods that were deprecated in hive-0.11 (the org.apache.hcatalog.* files should all be removed afterwards), and it is also good for us to clean out and nuke all other older deprecated methods. We should take this on early in a dev/release cycle to allow us time to resolve all fallout, so I propose that we remove all HCatalog deprecated methods after we branch out 0.13 and 0.14 becomes trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)