Re: Review Request 19212: HIVE-6645: to_date()/to_unix_timestamp() fail with NPE if input is null

2014-03-18 Thread Mohammad Islam

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19212/#review37540
---

Ship it!


+1

- Mohammad Islam


On March 14, 2014, 9:23 p.m., Jason Dere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19212/
 ---
 
 (Updated March 14, 2014, 9:23 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6645
 https://issues.apache.org/jira/browse/HIVE-6645
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 - fix null inputs
 - allow char/varchar params
 - tests
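A minimal sketch of the null-input fix described above (plain Java; the class, method, and format handling here are illustrative, not the actual GenericUDFDate change):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Illustrative null-safe evaluate pattern: return SQL NULL for null or
// unparseable input instead of dereferencing the argument (the NPE case).
public class ToDateSketch {

    /** Returns the yyyy-MM-dd portion of a timestamp string, or null. */
    public static String toDate(String timestamp) {
        if (timestamp == null) {
            return null;  // propagate NULL instead of throwing NPE
        }
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        try {
            // parse() consumes only the leading yyyy-MM-dd portion
            return fmt.format(fmt.parse(timestamp));
        } catch (ParseException e) {
            return null;  // Hive-style behavior: NULL for unparseable dates
        }
    }
}
```

The same guard applies once char/varchar arguments are converted to strings.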
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java 
 c31174a 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java
  dc259c6 
   ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFDate.java 384ce4e 
   
 ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19212/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jason Dere
 




[jira] [Commented] (HIVE-6660) HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request

2014-03-18 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938884#comment-13938884
 ] 

Prasad Mujumdar commented on HIVE-6660:
---

Patch committed to trunk.

[~rhbutani] This should be a blocker for hive 0.13. Requesting approval to push 
the patch to 0.13 release branch.

 HiveServer2 running in non-http mode closes server socket for an SSL 
 connection after the 1st request
 -

 Key: HIVE-6660
 URL: https://issues.apache.org/jira/browse/HIVE-6660
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6660.1.patch, HIVE-6660.1.patch, hive-site.xml


 *Beeline connection string:*
 {code}
 !connect 
 jdbc:hive2://host:1/;ssl=true;sslTrustStore=/usr/share/doc/hive-0.13.0.2.1.1.0/examples/files/truststore.jks;trustStorePassword=HiveJdbc
  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver 
 {code}
 *Error:*
 {code}
 pool-7-thread-1, handling exception: java.net.SocketTimeoutException: Read 
 timed out
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 pool-7-thread-1, SEND TLSv1 ALERT:  warning, description = close_notify
 Padded plaintext before ENCRYPTION:  len = 32
 : 01 00 BE 72 AC 10 3B FA   4E 01 A5 DE 9B 14 16 AF  ...r..;.N...
 0010: 4E DD 7A 29 AD B4 09 09   09 09 09 09 09 09 09 09  N.z)
 pool-7-thread-1, WRITE: TLSv1 Alert, length = 32
 [Raw write]: length = 37
 : 15 03 01 00 20 6C 37 82   A8 52 40 DA FB 83 2D CD   l7..R@...-.
 0010: 96 9F F0 B7 22 17 E1 04   C1 D1 93 1B C4 39 5A B0  9Z.
 0020: A2 3F 5D 7D 2D .?].-
 pool-7-thread-1, called closeSocket(selfInitiated)
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 {code}
 *Subsequent queries fail:*
 {code}
 main, WRITE: TLSv1 Application Data, length = 144
 main, handling exception: java.net.SocketException: Broken pipe
 %% Invalidated:  [Session-1, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
 main, SEND TLSv1 ALERT:  fatal, description = unexpected_message
 Padded plaintext before ENCRYPTION:  len = 32
 : 02 0A 52 C3 18 B1 C1 38   DB 3F B6 D1 C5 CA 14 9C  ..R8.?..
 0010: A5 38 4C 01 31 69 09 09   09 09 09 09 09 09 09 09  .8L.1i..
 main, WRITE: TLSv1 Alert, length = 32
 main, Exception sending alert: java.net.SocketException: Broken pipe
 main, called closeSocket()
 Error: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe (state=08S01,code=0)
 java.sql.SQLException: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe
   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:226)
   at org.apache.hive.beeline.Commands.execute(Commands.java:736)
   at org.apache.hive.beeline.Commands.sql(Commands.java:657)
   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:796)
   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe
   at 
 org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
   at 
 org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:471)
   at 
 org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37)
   at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:219)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:211)
   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:220)
   ... 11 more
 Caused by: java.net.SocketException: Broken pipe
   at java.net.SocketOutputStream.socketWrite0(Native Method)
   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
   at 

Re: Review Request 19329: Make it configurable to have partition columns displayed separately or not.

2014-03-18 Thread Lefty Leverenz


 On March 17, 2014, 11:42 p.m., Jason Dere wrote:
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 837
  https://reviews.apache.org/r/19329/diff/1/?file=525742#file525742line837
 
  I'm sure Lefty will mention this too, I believe new config settings 
  also should have updated entry in conf/hive-default.xml.template.

Yes, but no.  It all depends on when HIVE-6037 gets committed because after 
that hive-default.xml.template will be generated from HiveConf.java, which will 
include descriptions in the parameter definitions.  Anyway, the new patch for 
this jira has a description in hive-default.xml.template so that can go into 
HiveConf.java when the time comes.


- Lefty


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19329/#review37501
---


On March 17, 2014, 11:57 p.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19329/
 ---
 
 (Updated March 17, 2014, 11:57 p.m.)
 
 
 Review request for hive and Jason Dere.
 
 
 Bugs: HIVE-6689
 https://issues.apache.org/jira/browse/HIVE-6689
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Make it configurable to have partition columns displayed separately or not.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b0f5c49 
   conf/hive-default.xml.template a8da2ca 
   
 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
  de04cca 
   
 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
  0c49250 
   ql/src/test/queries/clientpositive/desc_tbl_part_cols.q PRE-CREATION 
   ql/src/test/results/clientpositive/desc_tbl_part_cols.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19329/diff/
 
 
 Testing
 ---
 
 Added a test case.
 
 
 Thanks,
 
 Ashutosh Chauhan
 




[jira] [Commented] (HIVE-6658) Modify Alter_numbuckets* test to reflect hadoop2 changes

2014-03-18 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938916#comment-13938916
 ] 

Szehon Ho commented on HIVE-6658:
-

Yea, looks good, thanks

 Modify Alter_numbuckets* test to reflect hadoop2 changes
 

 Key: HIVE-6658
 URL: https://issues.apache.org/jira/browse/HIVE-6658
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-6658.2.patch


 Hadoop2 now honors the number-of-reducers config while running in local mode. 
 This affects bucketing tests, as the data gets properly bucketed in Hadoop2 
 (in Hadoop1, all data ended up in the same bucket in local mode).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6692) Location for new table or partition should be a write entity

2014-03-18 Thread Navis (JIRA)
Navis created HIVE-6692:
---

 Summary: Location for new table or partition should be a write 
entity
 Key: HIVE-6692
 URL: https://issues.apache.org/jira/browse/HIVE-6692
 Project: Hive
  Issue Type: Task
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor


Locations for create table and alter table add partition should be write 
entities.





Re: Review Request 19329: Make it configurable to have partition columns displayed separately or not.

2014-03-18 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19329/#review37543
---

Ship it!


- Jason Dere


On March 17, 2014, 11:57 p.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/19329/
 ---
 
 (Updated March 17, 2014, 11:57 p.m.)
 
 
 Review request for hive and Jason Dere.
 
 
 Bugs: HIVE-6689
 https://issues.apache.org/jira/browse/HIVE-6689
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Make it configurable to have partition columns displayed separately or not.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b0f5c49 
   conf/hive-default.xml.template a8da2ca 
   
 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java
  de04cca 
   
 ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java
  0c49250 
   ql/src/test/queries/clientpositive/desc_tbl_part_cols.q PRE-CREATION 
   ql/src/test/results/clientpositive/desc_tbl_part_cols.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/19329/diff/
 
 
 Testing
 ---
 
 Added a test case.
 
 
 Thanks,
 
 Ashutosh Chauhan
 




[jira] [Commented] (HIVE-6689) Provide an option to not display partition columns separately in describe table output

2014-03-18 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938925#comment-13938925
 ] 

Jason Dere commented on HIVE-6689:
--

+1

 Provide an option to not display partition columns separately in describe 
 table output 
 ---

 Key: HIVE-6689
 URL: https://issues.apache.org/jira/browse/HIVE-6689
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6689.1.patch, HIVE-6689.patch


 In ancient Hive, partition columns were not displayed differently; in newer 
 versions they are. This has resulted in a backward-incompatible change for 
 upgrade scenarios.





[jira] [Updated] (HIVE-6692) Location for new table or partition should be a write entity

2014-03-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6692:


Attachment: HIVE-6692.1.patch.txt

 Location for new table or partition should be a write entity
 

 Key: HIVE-6692
 URL: https://issues.apache.org/jira/browse/HIVE-6692
 Project: Hive
  Issue Type: Task
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6692.1.patch.txt


 Locations for create table and alter table add partition should be write 
 entities.





Review Request 19344: Location for new table or partition should be a write entity

2014-03-18 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19344/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-6692
https://issues.apache.org/jira/browse/HIVE-6692


Repository: hive-git


Description
---

Locations for create table and alter table add partition should be write 
entities.


Diffs
-

  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 16d7c80 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 7dbb8be 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java 2a38aad 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java 6d7c4f6 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/WriteEntity.java 44a3924 
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java db9fa74 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java e642919 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
92ec334 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 6c53447 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java e1e427f 
  ql/src/test/results/clientnegative/archive_multi7.q.out a8eee2f 
  ql/src/test/results/clientnegative/authorization_droppartition.q.out 1da250a 
  ql/src/test/results/clientnegative/authorization_uri_alterpart_loc.q.out 
39a4e4f 
  ql/src/test/results/clientnegative/authorization_uri_create_table1.q.out 
0b8182a 
  ql/src/test/results/clientnegative/authorization_uri_create_table_ext.q.out 
0b8182a 
  ql/src/test/results/clientnegative/deletejar.q.out 91560ee 
  
ql/src/test/results/clientnegative/exim_20_managed_location_over_existing.q.out 
fd4a418 
  ql/src/test/results/clientnegative/external1.q.out 696beaa 
  ql/src/test/results/clientnegative/external2.q.out a604885 
  ql/src/test/results/clientnegative/insertexternal1.q.out 3df5013 

Diff: https://reviews.apache.org/r/19344/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-6692) Location for new table or partition should be a write entity

2014-03-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6692:


Status: Patch Available  (was: Open)

 Location for new table or partition should be a write entity
 

 Key: HIVE-6692
 URL: https://issues.apache.org/jira/browse/HIVE-6692
 Project: Hive
  Issue Type: Task
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6692.1.patch.txt


 Locations for create table and alter table add partition should be write 
 entities.





[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-18 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6222:
---

Attachment: HIVE-6222.4.patch

.4.patch rebased to latest trunk and merges HIVE-6518

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch


 Row mode GBY is becoming a pass-through if not enough aggregation occurs on 
 the map side, relying on the shuffle+reduce side to do the work. Have VGBY do 
 the same.
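A hedged sketch of the requested behavior (names and thresholds are illustrative, not actual Hive configuration or VectorGroupByOperator code): track how well map-side hash aggregation is compressing the input, and abandon grouping (pass rows through to the reducer) when the distinct-key ratio stays too high.

```java
// Track rows seen vs. distinct grouping keys seen; once enough rows
// have arrived, decide whether map-side aggregation is paying off.
public class GroupByFlushPolicy {
    private long rowsIn;
    private long distinctKeys;
    private final long minRowsBeforeCheck;
    private final double maxKeyRatio;

    public GroupByFlushPolicy(long minRowsBeforeCheck, double maxKeyRatio) {
        this.minRowsBeforeCheck = minRowsBeforeCheck;
        this.maxKeyRatio = maxKeyRatio;
    }

    /** Record a batch: how many rows arrived and how many new keys appeared. */
    public void observe(long rows, long newDistinctKeys) {
        rowsIn += rows;
        distinctKeys += newDistinctKeys;
    }

    /** True once enough rows were seen and aggregation is not paying off. */
    public boolean shouldAbandonGrouping() {
        if (rowsIn < minRowsBeforeCheck) {
            return false;  // not enough evidence yet
        }
        return (double) distinctKeys / rowsIn > maxKeyRatio;
    }
}
```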





[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-18 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6222:
---

Status: Open  (was: Patch Available)

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch


 Row mode GBY is becoming a pass-through if not enough aggregation occurs on 
 the map side, relying on the shuffle+reduce side to do the work. Have VGBY do 
 the same.





[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-18 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6222:
---

Status: Patch Available  (was: Open)

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch


 Row mode GBY is becoming a pass-through if not enough aggregation occurs on 
 the map side, relying on the shuffle+reduce side to do the work. Have VGBY do 
 the same.





[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-18 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938967#comment-13938967
 ] 

Remus Rusanu commented on HIVE-6222:


[~gopalv] I've merged the HIVE-6518 fix into the refactoring of 
VectorGroupByOperator; see .4.patch. Everything GC-canary related is moved into 
the ProcessingModeHashAggregate class.

 Make Vector Group By operator abandon grouping if too many distinct keys
 

 Key: HIVE-6222
 URL: https://issues.apache.org/jira/browse/HIVE-6222
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: vectorization
 Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, 
 HIVE-6222.4.patch


 Row mode GBY is becoming a pass-through if not enough aggregation occurs on 
 the map side, relying on the shuffle+reduce side to do the work. Have VGBY do 
 the same.





[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead

2014-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939024#comment-13939024
 ] 

Hive QA commented on HIVE-6430:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635047/HIVE-6430.04.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5417 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1867/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1867/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635047

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
 HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
 for row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need to have java hash table there.  We can either use primitive-friendly 
 hashtable like the one from HPPC (Apache-licenced), or some variation, to map 
 primitive keys to single row storage structure without an object per row 
 (similar to vectorization).
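The "primitive-friendly hashtable" idea above can be sketched as follows (illustrative only, not the HIVE-6430 patch or the HPPC API): keys and values live in flat primitive arrays with open addressing, so there is no Entry object or boxed Long per row. Resizing is omitted for brevity, so the table must be sized for the expected key count up front.

```java
// Open-addressing long -> long map over primitive arrays: each entry costs
// two long[] slots plus a flag, with no per-entry object allocation.
public class PrimitiveLongMap {
    private final long[] keys;
    private final long[] values;
    private final boolean[] used;

    public PrimitiveLongMap(int expectedKeys) {
        // power-of-two capacity with headroom so linear probing terminates
        int cap = Integer.highestOneBit(Math.max(16, expectedKeys) * 4);
        keys = new long[cap];
        values = new long[cap];
        used = new boolean[cap];
    }

    private int slot(long key) {
        int mask = keys.length - 1;
        int i = (int) (key ^ (key >>> 32)) & mask;  // simple hash mix + mask
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;                     // linear probing
        }
        return i;
    }

    public void put(long key, long value) {
        int i = slot(key);
        used[i] = true;
        keys[i] = key;
        values[i] = value;
    }

    /** Returns the mapped value, or defaultValue when the key is absent. */
    public long get(long key, long defaultValue) {
        int i = slot(key);
        return used[i] ? values[i] : defaultValue;
    }
}
```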





Re: Review Request 18943: Make Vector Group By operator abandon grouping if too many distinct keys

2014-03-18 Thread Remus Rusanu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18943/
---

(Updated March 18, 2014, 10:09 a.m.)


Review request for hive, Eric Hanson and Jitendra Pandey.


Changes
---

.4.patch


Bugs: HIVE-6222
https://issues.apache.org/jira/browse/HIVE-6222


Repository: hive-git


Description
---

See HIVE-6222


Diffs (updated)
-

  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvg.txt 547a60a 
  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMax.txt dcc1dfb 
  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxDecimal.txt 37ce103 
  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxString.txt 1f8b28c 
  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFSum.txt cb0be33 
  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVar.txt 49b0edd 
  ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt c5af930 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java c4c85fa 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorAggregationBufferRow.java
 7aa4b11 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 
7fb007e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java 
a2a7266 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java
 bd6c24b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorUtilBatchObjectPool.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorAggregateExpression.java
 1836169 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFAvgDecimal.java
 5127107 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFCount.java
 086f91f 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFCountStar.java
 4926f6c 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFSumDecimal.java
 0089ad3 

Diff: https://reviews.apache.org/r/18943/diff/


Testing
---

Manually tested. I plan to add test cases in TestVGBy


Thanks,

Remus Rusanu



[jira] [Commented] (HIVE-6468) HS2 out of memory error when curl sends a get request

2014-03-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939085#comment-13939085
 ] 

Navis commented on HIVE-6468:
-

[~leftylev] "cannot" - "must not" would be better. You can call it, but that 
will make HiveServer die.

 HS2 out of memory error when curl sends a get request
 -

 Key: HIVE-6468
 URL: https://issues.apache.org/jira/browse/HIVE-6468
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
 Environment: Centos 6.3, hive 12, hadoop-2.2
Reporter: Abin Shahab
Assignee: Navis
 Attachments: HIVE-6468.1.patch.txt


 We see an out of memory error when we run simple beeline calls.
 (The hive.server2.transport.mode is binary)
 curl localhost:1
 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap 
 space
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
   at 
 org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
   at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
   at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
   at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
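One plausible mechanism for the OOM above, consistent with the receiveSaslMessage frame in the stack trace: length-prefixed Thrift transports read a 4-byte big-endian length from the wire (the SASL transport reads a status byte first, then the 4-byte payload length). When curl sends "GET / HTTP/1.1" to the binary port, ASCII bytes get interpreted as a huge allocation size. A small demonstration (class name is illustrative):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Decodes four ASCII characters the way a length-prefixed transport would:
// as a big-endian 32-bit integer.
public class FrameLengthDemo {
    public static int lengthOf(String fourAsciiBytes) {
        byte[] b = fourAsciiBytes.getBytes(StandardCharsets.US_ASCII);
        return ByteBuffer.wrap(b).getInt();  // ByteBuffer is big-endian by default
    }

    public static void main(String[] args) {
        // A framed transport would read "GET " as a ~1.1 GB frame:
        System.out.println(lengthOf("GET "));  // 1195725856
        // The SASL transport consumes 'G' as a status byte, then reads
        // "ET /" as the payload length, with a similarly huge result:
        System.out.println(lengthOf("ET /"));  // 1163141167
    }
}
```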





Indexes without Map/Reduce

2014-03-18 Thread Peter Marron
Hi,

Hope that this is to the correct list. Apologies if not.

I am using Hive 0.11.0 and Hadoop 1.0.4.

My goal is to get my Hive queries running without Map/Reduce
but using my custom indexes. To this end I have been building Hive version 13 
from source
and working through the sources to see what I can do.

I can see that the non-M/R path through Hive splits off really early.
I can see that in SemanticAnalyzer.java if it determines that a FetchTask
is sufficient for the query then the genMapRedTasks method returns really
early and it never gets near the code that uses indexes.

I have also followed the code through the index code and I can see that in
IndexWhereProcessor.java an index can insert a index query task
to run before the main query. (By also calling the
queryContext  setIndexInputFormat and setIndexIntermediateFile
methods it can redirect the main query to pick up the data generated by the 
index.)

So I can see two approaches to achieve my goal.


1)  I can modify the FetchTask path to support the use of indexes.

2)  I can allow the query to start down the Map/Reduce path and then
I can arrange for my index code to trash the original query completely and
replace it with a query that will run as a FetchTask that will do what I want.
Of course there are pros and cons to both of these approaches.


1)  This approach has the advantage that I don't need to change the
current index path at all, so it's much less likely that I will
damage it. However, I will probably end up replicating some of the
existing index code, which is not desirable. Also, I am not sufficiently
au fait with the Hive code to feel confident that I would make such
a major change in the way that a real Hive developer might.

2)  This approach has the advantage that I am building on top of the
existing index infrastructure, so I will probably end up writing
less code. However, it means that my queries will run once as
Map/Reduce and again as FetchTasks, which will make them slower than
I would like. The approach is also more complicated than I would like,
and I don't really know how cleanly I can abort the initial query and
replace it with a FetchTask (if, indeed, this is possible).

Obviously at some point I would like for my changes to get submitted
back into the main Hive source and so I want maximize the chances that
they will be viewed positively.

Does anyone have any opinions or advice to offer?

Regards,

Peter Marron
Senior Developer
Trillium Software, A Harte Hanks Company
Theale Court, 1st Floor, 11-13 High Street
Theale
RG7 5AH
+44 (0) 118 940 7609 office
+44 (0) 118 940 7699 fax



[jira] [Updated] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.

2014-03-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6668:


Attachment: HIVE-6668.3.patch.txt

 When auto join convert is on and noconditionaltask is off, 
 ConditionalResolverCommonJoin fails to resolve map joins.
 

 Key: HIVE-6668
 URL: https://issues.apache.org/jira/browse/HIVE-6668
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0
Reporter: Yin Huai
Assignee: Navis
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6668.1.patch.txt, HIVE-6668.2.patch.txt, 
 HIVE-6668.3.patch.txt


 I tried the following query today ...
 {code:sql}
 set mapred.job.map.memory.mb=2048;
 set mapred.job.reduce.memory.mb=2048;
 set mapred.map.child.java.opts=-server -Xmx3072m 
 -Djava.net.preferIPv4Stack=true;
 set mapred.reduce.child.java.opts=-server -Xmx3072m 
 -Djava.net.preferIPv4Stack=true;
 set mapred.reduce.tasks=60;
 set hive.stats.autogather=false;
 set hive.exec.parallel=false;
 set hive.enforce.bucketing=true;
 set hive.enforce.sorting=true;
 set hive.map.aggr=true;
 set hive.optimize.bucketmapjoin=true;
 set hive.optimize.bucketmapjoin.sortedmerge=true;
 set hive.mapred.reduce.tasks.speculative.execution=false;
 set hive.auto.convert.join=true;
 set hive.auto.convert.sortmerge.join=true;
 set hive.auto.convert.sortmerge.join.noconditionaltask=false;
 set hive.auto.convert.join.noconditionaltask=false;
 set hive.auto.convert.join.noconditionaltask.size=1;
 set hive.optimize.reducededuplication=true;
 set hive.optimize.reducededuplication.min.reducer=1;
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 set hive.mapjoin.smalltable.filesize=4500;
 set hive.optimize.index.filter=false;
 set hive.vectorized.execution.enabled=false;
 set hive.optimize.correlation=false;
 select
i_item_id,
s_state,
avg(ss_quantity) agg1,
avg(ss_list_price) agg2,
avg(ss_coupon_amt) agg3,
avg(ss_sales_price) agg4
 FROM store_sales
 JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
 JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
 JOIN customer_demographics on (store_sales.ss_cdemo_sk = 
 customer_demographics.cd_demo_sk)
 JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
 where
cd_gender = 'F' and
cd_marital_status = 'U' and
cd_education_status = 'Primary' and
d_year = 2002 and
s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
 group by i_item_id, s_state with rollup
 order by
i_item_id,
s_state
 limit 100;
 {code}
 The log shows ...
 {code}
 14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve 
 driver alias (threshold : 4500, length mapping : {store=94175, 
 store_sales=48713909726, item=39798667, customer_demographics=1660831, 
 date_dim=2275902})
 Stage-27 is filtered out by condition resolver.
 14/03/14 17:05:02 INFO exec.Task: Stage-27 is filtered out by condition 
 resolver.
 Stage-28 is filtered out by condition resolver.
 14/03/14 17:05:02 INFO exec.Task: Stage-28 is filtered out by condition 
 resolver.
 Stage-3 is selected by condition resolver.
 {code}
 Stage-3 is a reduce join. Actually, the resolver should have picked the map join.
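The "Failed to resolve driver alias" line can be read against a hedged sketch of the resolver's decision (illustrative names, not Hive's actual ConditionalResolverCommonJoin code): an alias can drive a map join only if every *other* table fits under the small-table threshold. With threshold 4500 and the sizes in the log, no alias qualifies, so the resolver falls back to the common (reduce-side) join.

```java
import java.util.Map;

// Pick a "big" (driver) alias such that the combined size of all other
// tables fits under the small-table threshold; otherwise give up and use
// the reduce-side join.
public class CommonJoinResolverSketch {

    /** Returns the chosen driver alias, or null to fall back to reduce join. */
    public static String resolveDriverAlias(Map<String, Long> aliasToSize,
                                            long smallTableThreshold) {
        long total = 0;
        for (long size : aliasToSize.values()) {
            total += size;
        }
        String best = null;
        long bestSize = -1;
        for (Map.Entry<String, Long> e : aliasToSize.entrySet()) {
            long others = total - e.getValue();
            // everything except the candidate must be small enough to hash
            if (others <= smallTableThreshold && e.getValue() > bestSize) {
                best = e.getKey();
                bestSize = e.getValue();
            }
        }
        return best;  // null reproduces the "Failed to resolve" case in the log
    }
}
```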





[jira] [Commented] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler

2014-03-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939421#comment-13939421
 ] 

Nick Dimiduk commented on HIVE-6650:


Can someone give me some context for this build error?

(cc [~sushanth], [~ashutoshc], [~brocknoland])

 hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
 --

 Key: HIVE-6650
 URL: https://issues.apache.org/jira/browse/HIVE-6650
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6650.0.patch, HIVE-6650.1.patch


 With the above enabled, where clauses including non-rowkey columns cannot be 
 used with the HBaseStorageHandler. Job fails to launch with the following 
 exception.
 {noformat}
 java.lang.RuntimeException: Unexpected residual predicate (s_address = '200 
 WEST 56TH STREET')
 at 
 org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.convertFilter(HiveHBaseTableInputFormat.java:292)
 at 
 org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplits(HiveHBaseTableInputFormat.java:495)
 at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:294)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:303)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:518)
 at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510)
 at 
 org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
 at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
 at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
 at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
 at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
 at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
 at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425)
 at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1437)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1215)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1043)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Job Submission failed with exception 'java.lang.RuntimeException(Unexpected 
 residual predicate (s_address = '200 WEST 56TH STREET'))'
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask
 {noformat}
 I believe this bug was introduced in HIVE-2036, see change to 
 OpProcFactory.java that always includes full predicate, even after storage 
 handler negotiates the predicates it can pushdown. Since this behavior is 
 divergent from input formats (they cannot negotiate), there's no harm in the 
 SH ignoring non-indexed predicates -- Hive respects all of them at a layer 
 above anyway. Might as well remove the check/exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler

2014-03-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939447#comment-13939447
 ] 

Ashutosh Chauhan commented on HIVE-6650:


It was not because of the patch. Trunk was broken in the interim. It's fixed now. 
Just re-upload your patch.

 hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
 --

 Key: HIVE-6650
 URL: https://issues.apache.org/jira/browse/HIVE-6650
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6650.0.patch, HIVE-6650.1.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler

2014-03-18 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-6650:
---

Attachment: HIVE-6650.2.patch

Same as patch v1.

 hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
 --

 Key: HIVE-6650
 URL: https://issues.apache.org/jira/browse/HIVE-6650
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6650.0.patch, HIVE-6650.1.patch, HIVE-6650.2.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6364) HiveServer2 - Request serving thread should get class loader from existing SessionState

2014-03-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939467#comment-13939467
 ] 

Ashutosh Chauhan commented on HIVE-6364:


Although HIVE-3969 is marked as a duplicate, I don't think it is a duplicate. 
This one fixes the problem of having the right class loader for a thread serving 
the query, whereas HIVE-3969 talks about unloading registered jars. So it seems 
there are two independent problems, both of which need to be fixed.
[~jaideepdhok] would you like to rebase your patch?

 HiveServer2 - Request serving thread should get class loader from existing 
 SessionState
 ---

 Key: HIVE-6364
 URL: https://issues.apache.org/jira/browse/HIVE-6364
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Jaideep Dhok
 Attachments: HIVE-6364.1.patch


 SessionState is created for each session in HS2. If we run any add jar 
 commands, a class loader is set in the SessionState's conf object. This class 
 loader should also be set in each thread that serves requests of the same session.
 Scenario (both requests are in the same session)-
 {noformat}
 // req 1
 add jar foo.jar // Served by thread th1, this updates class loader and sets 
 in SessionState.conf
 // req2 served by th2, such that th1 != th2
 CREATE TEMPORARY FUNCTION foo_udf AS 'some class in foo.jar' 
 // This can throw a class-not-found error, because although 
 // the new thread (th2) gets the same session state as th1,
 // its context class loader (Thread.currentThread().getContextClassLoader()) is different
 {noformat}
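
 The scenario above can be sketched with plain threads. This is a minimal
 standalone illustration, not HiveServer2 code: the empty URLClassLoader stands
 in for the child loader created by "add jar", and setContextClassLoader stands
 in for the proposed fix of propagating the SessionState's loader to whichever
 thread serves the next request:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ContextLoaderDemo {
    // Run a task in a fresh thread, optionally forcing its context class
    // loader first, and report which loader the task actually observed.
    static ClassLoader loaderSeenInThread(ClassLoader toSet) {
        final ClassLoader[] seen = new ClassLoader[1];
        Thread serving = new Thread(
                () -> seen[0] = Thread.currentThread().getContextClassLoader());
        if (toSet != null) {
            serving.setContextClassLoader(toSet);  // the propagation step HS2 needs
        }
        serving.start();
        try {
            serving.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        // Stands in for the child loader created by "add jar" in thread th1.
        ClassLoader sessionLoader =
                new URLClassLoader(new URL[0], ContextLoaderDemo.class.getClassLoader());
        // Without propagation, th2 sees whatever loader it inherited at creation.
        System.out.println(loaderSeenInThread(null) == sessionLoader);          // false
        // With propagation, th2 sees the session's loader and can find foo.jar classes.
        System.out.println(loaderSeenInThread(sessionLoader) == sessionLoader); // true
    }
}
```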



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-3969) Session state for hive server should be cleaned-up

2014-03-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939478#comment-13939478
 ] 

Ashutosh Chauhan commented on HIVE-3969:


Although HIVE-6364 is marked as a duplicate, I don't think it is a duplicate. 
This one fixes the problem of unloading registered jars, whereas HIVE-6364 
talks about setting the correct class loader for HS2. So it seems there are two 
independent problems, both of which need to be fixed.
[~navis] Although you raised the bug for HS1, I think the same exact problem 
exists on HS2. But the fix is in the same area, so it doesn't really matter. I 
think we should use sun.misc.ClassLoaderUtil for now and then switch over to 
JDK 7 in the near future for a clean solution. Would you like to rebase this patch?

 Session state for hive server should be cleaned-up
 --

 Key: HIVE-3969
 URL: https://issues.apache.org/jira/browse/HIVE-3969
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-3969.D8325.1.patch


 Currently, the add jar command from clients adds child ClassLoaders to the worker 
 thread cumulatively, causing various problems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler

2014-03-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6650:
---

Status: Open  (was: Patch Available)

 hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
 --

 Key: HIVE-6650
 URL: https://issues.apache.org/jira/browse/HIVE-6650
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6650.0.patch, HIVE-6650.1.patch, HIVE-6650.2.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler

2014-03-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6650:
---

Status: Patch Available  (was: Open)

 hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
 --

 Key: HIVE-6650
 URL: https://issues.apache.org/jira/browse/HIVE-6650
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6650.0.patch, HIVE-6650.1.patch, HIVE-6650.2.patch





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6693) CASE with INT and BIGINT fail

2014-03-18 Thread David Gayou (JIRA)
David Gayou created HIVE-6693:
-

 Summary: CASE with INT and BIGINT fail
 Key: HIVE-6693
 URL: https://issues.apache.org/jira/browse/HIVE-6693
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.12.0
Reporter: David Gayou


CREATE TABLE testCase (n BIGINT)

select case when (n > 3) then n else 0 end from testCase

fails with error: 
[Error 10016]: Line 1:36 Argument type mismatch '0': The expression after ELSE 
should have the same type as those after THEN: "bigint" is expected but "int" 
is found'.

bigint and int should be more compatible; at least, int should implicitly cast to 
bigint. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-18 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-6687:
--

Description: 
Getting value from result set using fully qualified name would throw exception. 
Only solution today is to use position of the column as opposed to column label.
{code}
String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
ResultSet res = stmt.executeQuery(sql);
res.getInt("r1.x");
{code}
res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
specifies it.

Fix is to fix resultsetschema in semantic analyzer.



  was:
Getting value from result set using fully qualified name would throw exception. 
Only solution today is to use position of the column as opposed to column label.

String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
ResultSet res = stmt.executeQuery(sql);
res.getInt("r1.x");

res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
specifies it.

Fix is to fix resultsetschema in semantic analyzer.
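
For illustration, here is a tiny standalone sketch of a label lookup that accepts
both the qualified label and the bare column name. This is a hypothetical helper,
not the actual Hive JDBC driver code; with the result set schema fixed to carry
the qualified labels, res.getInt("r1.x") can resolve to the right column:

```java
import java.util.Arrays;
import java.util.List;

public class LabelLookup {
    // Resolve a column label to its 1-based JDBC index: try the exact label
    // first, then fall back to the unqualified name ("r1.x" -> "x").
    static int findColumn(List<String> labels, String name) {
        int i = labels.indexOf(name);
        if (i < 0 && name.contains(".")) {
            i = labels.indexOf(name.substring(name.indexOf('.') + 1));
        }
        if (i < 0) {
            throw new IllegalArgumentException("unknown column: " + name);
        }
        return i + 1;  // JDBC column indexes are 1-based
    }

    public static void main(String[] args) {
        // The two projections from the example query above.
        List<String> labels = Arrays.asList("r1.x", "r2.x");
        System.out.println(findColumn(labels, "r1.x"));  // 1
        System.out.println(findColumn(labels, "r2.x"));  // 2
    }
}
```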




 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1


 Getting value from result set using fully qualified name would throw 
 exception. Only solution today is to use position of the column as opposed to 
 column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
 specifies it.
 Fix is to fix resultsetschema in semantic analyzer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6625) HiveServer2 running in http mode should support trusted proxy access

2014-03-18 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939542#comment-13939542
 ] 

Harish Butani commented on HIVE-6625:
-

+1 for 0.13

 HiveServer2 running in http mode should support trusted proxy access
 

 Key: HIVE-6625
 URL: https://issues.apache.org/jira/browse/HIVE-6625
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6625.1.patch, HIVE-6625.2.patch


 HIVE-5155 adds trusted proxy access to HiveServer2. This patch is a minor change 
 to have it used when running HiveServer2 in http mode. Patch to be applied on 
 top of HIVE-4764 and HIVE-5155.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6660) HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request

2014-03-18 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939539#comment-13939539
 ] 

Harish Butani commented on HIVE-6660:
-

+1 for porting to 0.13 branch

 HiveServer2 running in non-http mode closes server socket for an SSL 
 connection after the 1st request
 -

 Key: HIVE-6660
 URL: https://issues.apache.org/jira/browse/HIVE-6660
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6660.1.patch, HIVE-6660.1.patch, hive-site.xml


 *Beeline connection string:*
 {code}
 !connect 
 jdbc:hive2://host:1/;ssl=true;sslTrustStore=/usr/share/doc/hive-0.13.0.2.1.1.0/examples/files/truststore.jks;trustStorePassword=HiveJdbc
  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver 
 {code}
 *Error:*
 {code}
 pool-7-thread-1, handling exception: java.net.SocketTimeoutException: Read 
 timed out
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 pool-7-thread-1, SEND TLSv1 ALERT:  warning, description = close_notify
 Padded plaintext before ENCRYPTION:  len = 32
 : 01 00 BE 72 AC 10 3B FA   4E 01 A5 DE 9B 14 16 AF  ...r..;.N...
 0010: 4E DD 7A 29 AD B4 09 09   09 09 09 09 09 09 09 09  N.z)
 pool-7-thread-1, WRITE: TLSv1 Alert, length = 32
 [Raw write]: length = 37
 : 15 03 01 00 20 6C 37 82   A8 52 40 DA FB 83 2D CD   l7..R@...-.
 0010: 96 9F F0 B7 22 17 E1 04   C1 D1 93 1B C4 39 5A B0  9Z.
 0020: A2 3F 5D 7D 2D .?].-
 pool-7-thread-1, called closeSocket(selfInitiated)
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 {code}
 *Subsequent queries fail:*
 {code}
 main, WRITE: TLSv1 Application Data, length = 144
 main, handling exception: java.net.SocketException: Broken pipe
 %% Invalidated:  [Session-1, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
 main, SEND TLSv1 ALERT:  fatal, description = unexpected_message
 Padded plaintext before ENCRYPTION:  len = 32
 : 02 0A 52 C3 18 B1 C1 38   DB 3F B6 D1 C5 CA 14 9C  ..R8.?..
 0010: A5 38 4C 01 31 69 09 09   09 09 09 09 09 09 09 09  .8L.1i..
 main, WRITE: TLSv1 Alert, length = 32
 main, Exception sending alert: java.net.SocketException: Broken pipe
 main, called closeSocket()
 Error: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe (state=08S01,code=0)
 java.sql.SQLException: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe
   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:226)
   at org.apache.hive.beeline.Commands.execute(Commands.java:736)
   at org.apache.hive.beeline.Commands.sql(Commands.java:657)
   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:796)
   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe
   at 
 org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
   at 
 org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:471)
   at 
 org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37)
   at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:219)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:211)
   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:220)
   ... 11 more
 Caused by: java.net.SocketException: Broken pipe
   at java.net.SocketOutputStream.socketWrite0(Native Method)
   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
   at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
   at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:377)
   at 

[jira] [Commented] (HIVE-6682) nonstaged mapjoin table memory check may be broken

2014-03-18 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939581#comment-13939581
 ] 

Sergey Shelukhin commented on HIVE-6682:


https://reviews.apache.org/r/19363/ - although the patch is really small, it's 
just the q file and result.
What do you mean by the question? That is what is done, right? I added the 
config to make sure it's set, because if it's not, the job is going to fail on 
any real data.

 nonstaged mapjoin table memory check may be broken
 --

 Key: HIVE-6682
 URL: https://issues.apache.org/jira/browse/HIVE-6682
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6682.patch


 We are getting the below error from the task while the staged load works 
 correctly. 
 We don't set the memory threshold so low, so it seems the settings are just 
 not handled correctly. This seems to always trigger on the first check. Given 
 that a map task might have a bunch more stuff, not just the hashmap, we may 
 also need to adjust the memory check (e.g. have separate configs).
 {noformat}
 Error: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 
 2014-03-14 08:11:21 Processing rows:20  Hashtable size: 
 19  Memory usage:   204001888   percentage: 0.197
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 
 2014-03-14 08:11:21 Processing rows:20  Hashtable size: 
 19  Memory usage:   204001888   percentage: 0.197
   at 
 org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:104)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:150)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:165)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   ... 8 more
 Caused by: 
 org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 
 2014-03-14 08:11:21   Processing rows:20  Hashtable size: 
 19  Memory usage:   204001888   percentage: 0.197
   at 
 org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
   at 
 org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:248)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:346)
   at 
 org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:147)
   at 
 org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:82)
   ... 15 more
 {noformat}
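For reference, the "percentage" reported in the trace above is the ratio of used heap to max heap. A check in the spirit of the MapJoinMemoryExhaustionHandler.checkMemoryStatus frame in the stack trace can be sketched as follows; the method names mirror the trace, but the body is an illustrative assumption, not Hive's actual code:

```java
public class MemoryCheck {
    // Fraction of the configured max heap currently in use, as the
    // exhaustion check above reports it (e.g. "percentage: 0.197").
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    // Throws once usage crosses the configured threshold. The bug discussed
    // is a check that trips at 0.197, i.e. far below any sane threshold,
    // suggesting the threshold setting was not picked up correctly.
    static void checkMemoryStatus(double threshold) {
        double pct = usedFraction();
        if (pct > threshold) {
            throw new RuntimeException(
                String.format("Memory usage percentage: %.3f", pct));
        }
    }

    public static void main(String[] args) {
        double pct = usedFraction();
        assert pct >= 0.0 && pct <= 1.0;
        checkMemoryStatus(1.0); // a 100% threshold never trips
    }
}
```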



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6643) Add a check for cross products in plans and output a warning

2014-03-18 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6643:


Status: Open  (was: Patch Available)

 Add a check for cross products in plans and output a warning
 

 Key: HIVE-6643
 URL: https://issues.apache.org/jira/browse/HIVE-6643
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6643.1.patch, HIVE-6643.2.patch


 Now that we support old style join syntax, it is easy to write queries that 
 generate a plan with a cross product.
 For e.g. say you have A join B join C join D on A.x = B.x and A.y = D.y and 
 C.z = D.z
 So the JoinTree is:
 A — B
 |__  D — C
 Since we don't reorder join graphs, we will end up with a cross product 
 between (A join B) and C



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6643) Add a check for cross products in plans and output a warning

2014-03-18 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6643:


Status: Patch Available  (was: Open)

 Add a check for cross products in plans and output a warning
 

 Key: HIVE-6643
 URL: https://issues.apache.org/jira/browse/HIVE-6643
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
 Attachments: HIVE-6643.1.patch, HIVE-6643.2.patch


 Now that we support old style join syntax, it is easy to write queries that 
 generate a plan with a cross product.
 For e.g. say you have A join B join C join D on A.x = B.x and A.y = D.y and 
 C.z = D.z
 So the JoinTree is:
 A — B
 |__  D — C
 Since we don't reorder join graphs, we will end up with a cross product 
 between (A join B) and C



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6364) HiveServer2 - Request serving thread should get class loader from existing SessionState

2014-03-18 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939592#comment-13939592
 ] 

Jason Dere commented on HIVE-6364:
--

Hi Jaideep, when I tried debugging HiveServer2 due to HIVE-6672, it appeared 
that there was a thread running for each connection (session). Non-SQL 
commands (such as ADD JAR) were being run within this session thread, and so 
the classloader for the session thread had the JARs loaded. When a SQL command 
was executed, the session thread would start a new thread, and it appeared that 
this new thread was using the same classloader (and had the added JARs in the 
classloader's list of URLs). Were you seeing different behavior in your 
testing? (I was running this on Mac, I think with JDK 1.6; not sure if it 
would have been different.)

In the patch, the thread's classloader is getting set to the HiveConf's 
classloader... where is the HiveConf's classloader getting set from? Do we need 
to worry about making sure this classloader is updated whenever a JAR is added 
to the classpath?
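The pattern in question (handing the session's classloader, built after an ADD JAR, to a newly started request-serving thread) can be sketched as follows. This is a minimal illustration; `sessionLoader` is a hypothetical helper standing in for whatever builds the SessionState conf's classloader, not Hive's actual API:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClassLoaderPropagation {
    // Hypothetical stand-in for the loader a session builds after "ADD JAR";
    // in HiveServer2 this would come from the SessionState's conf object.
    static ClassLoader sessionLoader(URL[] sessionJars) {
        return new URLClassLoader(sessionJars,
                Thread.currentThread().getContextClassLoader());
    }

    public static void main(String[] args) {
        ClassLoader loader = sessionLoader(new URL[0]);
        Thread worker = new Thread(() -> {
            // Code here resolves classes through the context classloader, so
            // it sees the session's added JARs only if that loader was set.
        });
        // The step under discussion: propagate the session's loader to the
        // new thread before it serves the request.
        worker.setContextClassLoader(loader);
        assert worker.getContextClassLoader() == loader;
    }
}
```

Whether the new thread already inherits the right loader by default (as observed in the debugging described above) or needs this explicit call is exactly the open question in this thread.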

 HiveServer2 - Request serving thread should get class loader from existing 
 SessionState
 ---

 Key: HIVE-6364
 URL: https://issues.apache.org/jira/browse/HIVE-6364
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Jaideep Dhok
 Attachments: HIVE-6364.1.patch


 SessionState is created for each session in HS2. If we do any add jars, a 
 class loader is set in the SessionState's conf object. This class loader 
 should also be set in each thread that serves request of the same session.
 Scenario (both requests are in the same session)-
 {noformat}
 // req 1
 add jar foo.jar // Served by thread th1, this updates class loader and sets 
 in SessionState.conf
 // req2 served by th2, such that th1 != th2
 CREATE TEMPORARY FUNCTION foo_udf AS 'some class in foo.jar' 
 // This can throw class not found error, because although 
 // the new thread (th2) gets the same session state as th1,
 // the class loader is different (Thread.currentThread.getContextClassLoader()
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-18 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939608#comment-13939608
 ] 

Jitendra Nath Pandey commented on HIVE-6639:


I have committed this to trunk.

[~rhbutani] This bug affects hive-0.13 and fails queries having partitioned 
columns but no filters. This should be fixed in branch-0.13 as well.


 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk. The exception stacktrace is :
 {code}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116)
   ... 42 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-18 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939613#comment-13939613
 ] 

Harish Butani commented on HIVE-6639:
-

+1 for 0.13

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk. The exception stacktrace is :
 {code}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116)
   ... 42 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6641) optimized HashMap keys won't work correctly with decimals

2014-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6641:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

in trunk and 13

 optimized HashMap keys won't work correctly with decimals
 -

 Key: HIVE-6641
 URL: https://issues.apache.org/jira/browse/HIVE-6641
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.13.0

 Attachments: HIVE-6641.patch


 Decimal values can be equal while having different byte representations 
 (different precision/scale), so comparing bytes is not enough. For a quick 
 fix, we can disable this for decimals.
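The hazard can be reproduced with java.math.BigDecimal, where numerically equal values carry different scales and therefore different unscaled byte representations. This is a sketch of the general problem only, not of Hive's key serialization:

```java
import java.math.BigDecimal;
import java.util.Arrays;

public class DecimalKeyDemo {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");   // scale 1, unscaled value 10
        BigDecimal b = new BigDecimal("1.00");  // scale 2, unscaled value 100
        // Numerically equal...
        assert a.compareTo(b) == 0;
        // ...but the bytes a representation-wise key comparison would see differ.
        assert !Arrays.equals(a.unscaledValue().toByteArray(),
                              b.unscaledValue().toByteArray());
        // equals() is also scale-sensitive, so it disagrees with compareTo().
        assert !a.equals(b);
    }
}
```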



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6694) Beeline should provide a way to execute shell command as Hive CLI does

2014-03-18 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-6694:
-

 Summary: Beeline should provide a way to execute shell command as 
Hive CLI does
 Key: HIVE-6694
 URL: https://issues.apache.org/jira/browse/HIVE-6694
 Project: Hive
  Issue Type: Improvement
  Components: CLI, Clients
Affects Versions: 0.12.0, 0.11.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Hive CLI allows a user to execute a shell command using the ! notation; for 
instance, !cat myfile.txt. Being able to execute shell commands may be important 
for some users. As a replacement, however, Beeline provides no such capability, 
possibly because the ! notation is reserved for SQLLine commands. It's possible 
to provide this using a slight syntactic variation such as !sh cat myfile.txt.
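A hypothetical !sh handler could delegate to the operating system shell along these lines; the helper name and plumbing are assumptions for illustration (and assume a POSIX "sh" on the client machine), not Beeline code:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ShellCommand {
    // Run one shell command line and return its output, roughly what a
    // "!sh <cmd>" handler would do with the text after the command keyword.
    static String run(String commandLine) throws Exception {
        Process p = new ProcessBuilder("sh", "-c", commandLine)
                .redirectErrorStream(true)  // fold stderr into captured output
                .start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(run("echo hello")); // prints "hello"
    }
}
```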



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

I have committed this to trunk and branch-0.13.

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk. The exception stacktrace is :
 {code}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116)
   ... 42 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.

2014-03-18 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6639:
---

Affects Version/s: 0.13.0

 Vectorization: Partition column names are not picked up.
 

 Key: HIVE-6639
 URL: https://issues.apache.org/jira/browse/HIVE-6639
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.13.0

 Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, 
 HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch


 The vectorized plan generation finds the list of partitioning columns from 
 pruned-partition-list using table scan operator. In some cases the list is 
 coming as null. TPCDS query 27 can reproduce this issue if the store_sales 
 table is partitioned on ss_store_sk. The exception stacktrace is :
 {code}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116)
   ... 42 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6690) NPE in tez session state

2014-03-18 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6690:
-

Priority: Critical  (was: Major)

 NPE in tez session state
 

 Key: HIVE-6690
 URL: https://issues.apache.org/jira/browse/HIVE-6690
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Fix For: 0.13.0

 Attachments: HIVE-6690.patch


 If hive.jar.directory isn't set hive will throw NPE in startup with tez:
 Exception in thread "main" java.lang.RuntimeException: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:344)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:682)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createHiveExecLocalResource(TezSessionState.java:303)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:130)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:342)
 ... 7 more



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6690) NPE in tez session state

2014-03-18 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-6690:
-

Fix Version/s: 0.13.0

 NPE in tez session state
 

 Key: HIVE-6690
 URL: https://issues.apache.org/jira/browse/HIVE-6690
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Critical
 Fix For: 0.13.0

 Attachments: HIVE-6690.patch


 If hive.jar.directory isn't set hive will throw NPE in startup with tez:
 Exception in thread "main" java.lang.RuntimeException: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:344)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:682)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createHiveExecLocalResource(TezSessionState.java:303)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:130)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:342)
 ... 7 more



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6690) NPE in tez session state

2014-03-18 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939769#comment-13939769
 ] 

Gunther Hagleitner commented on HIVE-6690:
--

+1 LGTM

 NPE in tez session state
 

 Key: HIVE-6690
 URL: https://issues.apache.org/jira/browse/HIVE-6690
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.13.0

 Attachments: HIVE-6690.patch


 If hive.jar.directory isn't set hive will throw NPE in startup with tez:
 Exception in thread "main" java.lang.RuntimeException: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:344)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:682)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createHiveExecLocalResource(TezSessionState.java:303)
 at 
 org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:130)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:342)
 ... 7 more



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6331) HIVE-5279 deprecated UDAF class without explanation/documentation/alternative

2014-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939793#comment-13939793
 ] 

Hive QA commented on HIVE-6331:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635056/HIVE-6331.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5411 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1868/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1868/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635056

 HIVE-5279 deprecated UDAF class without explanation/documentation/alternative
 -

 Key: HIVE-6331
 URL: https://issues.apache.org/jira/browse/HIVE-6331
 Project: Hive
  Issue Type: Bug
Reporter: Lars Francke
Assignee: Lars Francke
Priority: Minor
 Attachments: HIVE-5279.1.patch, HIVE-6331.2.patch


 HIVE-5279 added a @Deprecated annotation to the {{UDAF}} class. The comment 
 in that class says {quote}UDAF classes are REQUIRED to inherit from this 
 class.{quote}
 One of these two needs to be updated. Either remove the annotation or 
 document why it was deprecated and what to use instead.
 Unfortunately [~navis] did not leave any documentation about his intentions.
 I'm happy to provide a patch once I know the intentions.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD

2014-03-18 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-4293:


Attachment: HIVE-4293.13.patch

 Predicates following UDTF operator are removed by PPD
 -

 Key: HIVE-4293
 URL: https://issues.apache.org/jira/browse/HIVE-4293
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Critical
 Attachments: D9933.6.patch, HIVE-4293.10.patch, 
 HIVE-4293.11.patch.txt, HIVE-4293.12.patch, HIVE-4293.13.patch, 
 HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, 
 HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, 
 HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch


 For example, 
 {noformat}
 explain SELECT value from (
   select explode(array(key, value)) as (value) from (
 select * FROM src WHERE key > 200
   ) A
 ) B WHERE value > 300
 ;
 {noformat}
 Makes plan like this, removing last predicates
 {noformat}
   TableScan
 alias: src
 Filter Operator
   predicate:
   expr: (key > 200.0)
   type: boolean
   Select Operator
 expressions:
   expr: array(key,value)
   type: array<string>
 outputColumnNames: _col0
 UDTF Operator
   function name: explode
   Select Operator
 expressions:
   expr: col
   type: string
 outputColumnNames: _col0
 File Output Operator
   compressed: false
   GlobalTableId: 0
   table:
   input format: org.apache.hadoop.mapred.TextInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD

2014-03-18 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-4293:


Status: Patch Available  (was: Open)

 Predicates following UDTF operator are removed by PPD
 -

 Key: HIVE-4293
 URL: https://issues.apache.org/jira/browse/HIVE-4293
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Critical
 Attachments: D9933.6.patch, HIVE-4293.10.patch, 
 HIVE-4293.11.patch.txt, HIVE-4293.12.patch, HIVE-4293.13.patch, 
 HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, 
 HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, 
 HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch


 For example, 
 {noformat}
 explain SELECT value from (
   select explode(array(key, value)) as (value) from (
 select * FROM src WHERE key > 200
   ) A
 ) B WHERE value > 300
 ;
 {noformat}
 Makes plan like this, removing last predicates
 {noformat}
   TableScan
 alias: src
 Filter Operator
   predicate:
   expr: (key > 200.0)
   type: boolean
   Select Operator
 expressions:
   expr: array(key,value)
   type: array<string>
 outputColumnNames: _col0
 UDTF Operator
   function name: explode
   Select Operator
 expressions:
   expr: col
   type: string
 outputColumnNames: _col0
 File Output Operator
   compressed: false
   GlobalTableId: 0
   table:
   input format: org.apache.hadoop.mapred.TextInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD

2014-03-18 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-4293:


Status: Open  (was: Patch Available)

 Predicates following UDTF operator are removed by PPD
 -

 Key: HIVE-4293
 URL: https://issues.apache.org/jira/browse/HIVE-4293
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Critical
 Attachments: D9933.6.patch, HIVE-4293.10.patch, 
 HIVE-4293.11.patch.txt, HIVE-4293.12.patch, HIVE-4293.13.patch, 
 HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, 
 HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, 
 HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch


 For example, 
 {noformat}
 explain SELECT value from (
   select explode(array(key, value)) as (value) from (
 select * FROM src WHERE key > 200
   ) A
 ) B WHERE value > 300
 ;
 {noformat}
 Makes a plan like this, removing the last predicate:
 {noformat}
   TableScan
 alias: src
 Filter Operator
   predicate:
    expr: (key > 200.0)
   type: boolean
   Select Operator
 expressions:
   expr: array(key,value)
    type: array<string>
 outputColumnNames: _col0
 UDTF Operator
   function name: explode
   Select Operator
 expressions:
   expr: col
   type: string
 outputColumnNames: _col0
 File Output Operator
   compressed: false
   GlobalTableId: 0
   table:
   input format: org.apache.hadoop.mapred.TextInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 {noformat}





[jira] [Commented] (HIVE-4293) Predicates following UDTF operator are removed by PPD

2014-03-18 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939816#comment-13939816
 ] 

Harish Butani commented on HIVE-4293:
-

ran tests locally, sq_notin_having.q.out has changed.

 Predicates following UDTF operator are removed by PPD
 -

 Key: HIVE-4293
 URL: https://issues.apache.org/jira/browse/HIVE-4293
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Critical
 Attachments: D9933.6.patch, HIVE-4293.10.patch, 
 HIVE-4293.11.patch.txt, HIVE-4293.12.patch, HIVE-4293.13.patch, 
 HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, 
 HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, 
 HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch


 For example, 
 {noformat}
 explain SELECT value from (
   select explode(array(key, value)) as (value) from (
 select * FROM src WHERE key > 200
   ) A
 ) B WHERE value > 300
 ;
 {noformat}
 Makes a plan like this, removing the last predicate:
 {noformat}
   TableScan
 alias: src
 Filter Operator
   predicate:
    expr: (key > 200.0)
   type: boolean
   Select Operator
 expressions:
   expr: array(key,value)
    type: array<string>
 outputColumnNames: _col0
 UDTF Operator
   function name: explode
   Select Operator
 expressions:
   expr: col
   type: string
 outputColumnNames: _col0
 File Output Operator
   compressed: false
   GlobalTableId: 0
   table:
   input format: org.apache.hadoop.mapred.TextInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
 {noformat}





Review Request 19373: Limit table partitions involved in a table scan

2014-03-18 Thread Selina Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19373/
---

Review request for hive and Gunther Hagleitner.


Bugs: HIVE-6492
https://issues.apache.org/jira/browse/HIVE-6492


Repository: hive-git


Description
---

Introduce a new configuration parameter to limit the number of table partitions involved in a 
table scan. It applies to select * queries and to any query that needs to issue MR jobs. 
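A guard of this kind is straightforward; the sketch below (Python, with a hypothetical constant standing in for the new configuration parameter; the real HiveConf key is defined in the patch) shows the intended failure mode:

```python
# Guard sketch: the constant stands in for the new configuration parameter
# (name hypothetical, not the actual HiveConf key).
MAX_SCANNED_PARTITIONS = 2  # zero or negative would mean "no limit"

def check_partition_limit(partitions):
    """Fail query compilation when a scan touches too many partitions."""
    if 0 < MAX_SCANNED_PARTITIONS < len(partitions):
        raise RuntimeError(
            "query scans %d partitions, exceeding the configured limit of %d"
            % (len(partitions), MAX_SCANNED_PARTITIONS))
    return partitions

print(check_partition_limit(["dt=2014-03-17", "dt=2014-03-18"]))
```

Failing at compile time, rather than launching the job, is what makes this useful as a safety valve on large partitioned tables.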


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java edc3d38 
  ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java ecd4c5d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ecce21e 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java
 7f2bb60 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 73603ab 
  ql/src/test/queries/clientnegative/limit_partition.q PRE-CREATION 
  ql/src/test/queries/clientnegative/limit_partition_stats.q PRE-CREATION 
  ql/src/test/queries/clientpositive/limit_partition_metadataonly.q 
PRE-CREATION 
  ql/src/test/results/clientnegative/limit_partition.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/limit_partition_stats.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/limit_partition_metadataonly.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/19373/diff/


Testing
---

3 tests are added


Thanks,

Selina Zhang



[jira] [Updated] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [CLONE of HCATALOG-621

2014-03-18 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6695:
---

Description: 
This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
seems to be in read-only mode. To quote Nick from the original bug:

I'm not sure how this fixes anything for the error listed above. The find 
command in the script we merged is broken, at least on linux. Maybe it worked 
with BSD find and we both tested on Macs?

From the patch we committed:
{noformat}
if [ -d ${HBASE_HOME} ] ; then
   for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
  HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
   done
   export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
fi
{noformat}
The find command syntax is wrong – it returns no jars ever.

{noformat}
$ find /usr/lib/hbase -name *.jar
$ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
$
{noformat}

What we need is more like:

{noformat}
$ find /usr/lib/hbase -name '*.jar'
... // prints lots of jars
$ find /usr/lib/hbase -name '*.jar' | grep thrift
/usr/lib/hbase/lib/libthrift-0.9.0.jar
$ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
$
{noformat}


  was:
This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
seems to be in read-only mode. To quote Nick from the original bug:

I'm not sure how this fixes anything for the error listed above. The find 
command in the script we merged is broken, at least on linux. Maybe it worked 
with BSD find and we both tested on Macs?

From the patch we committed:
{noformat}
if [ -d ${HBASE_HOME} ] ; then
   for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
  HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
   done
   export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
fi
{noformat}
The find command syntax is wrong – it returns no jars ever.

{noformat}
$ find /usr/lib/hbase -name *.jar
$ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
$
What we need is more like:
$ find /usr/lib/hbase -name '*.jar'
... // prints lots of jars
$ find /usr/lib/hbase -name '*.jar' | grep thrift
/usr/lib/hbase/lib/libthrift-0.9.0.jar
$ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
$
{noformat}



 bin/hcat should include hbase jar and dependencies in the classpath [CLONE of 
 HCATALOG-621
 --

 Key: HIVE-6695
 URL: https://issues.apache.org/jira/browse/HIVE-6695
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Nick Dimiduk

 This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
 seems to be in read-only mode. To quote Nick from the original bug:
 I'm not sure how this fixes anything for the error listed above. The find 
 command in the script we merged is broken, at least on linux. Maybe it worked 
 with BSD find and we both tested on Macs?
 From the patch we committed:
 {noformat}
 if [ -d ${HBASE_HOME} ] ; then
for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
   HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
done
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
 fi
 {noformat}
 The find command syntax is wrong – it returns no jars ever.
 {noformat}
 $ find /usr/lib/hbase -name *.jar
 $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
 $
 {noformat}
 What we need is more like:
 {noformat}
 $ find /usr/lib/hbase -name '*.jar'
 ... // prints lots of jars
 $ find /usr/lib/hbase -name '*.jar' | grep thrift
 /usr/lib/hbase/lib/libthrift-0.9.0.jar
 $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
 $
 {noformat}





[jira] [Created] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [CLONE of HCATALOG-621

2014-03-18 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-6695:
--

 Summary: bin/hcat should include hbase jar and dependencies in the 
classpath [CLONE of HCATALOG-621
 Key: HIVE-6695
 URL: https://issues.apache.org/jira/browse/HIVE-6695
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Nick Dimiduk


This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
seems to be in read-only mode. To quote Nick from the original bug:

I'm not sure how this fixes anything for the error listed above. The find 
command in the script we merged is broken, at least on linux. Maybe it worked 
with BSD find and we both tested on Macs?

From the patch we committed:
{noformat}
if [ -d ${HBASE_HOME} ] ; then
   for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
  HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
   done
   export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
fi
{noformat}
The find command syntax is wrong – it returns no jars ever.

{noformat}
$ find /usr/lib/hbase -name *.jar
$ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
$
What we need is more like:
$ find /usr/lib/hbase -name '*.jar'
... // prints lots of jars
$ find /usr/lib/hbase -name '*.jar' | grep thrift
/usr/lib/hbase/lib/libthrift-0.9.0.jar
$ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
$
{noformat}






[jira] [Commented] (HIVE-6613) Control when specific Inputs / Outputs are started

2014-03-18 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939919#comment-13939919
 ] 

Gunther Hagleitner commented on HIVE-6613:
--

+1 LGTM

 Control when specific Inputs / Outputs are started
 -

 Key: HIVE-6613
 URL: https://issues.apache.org/jira/browse/HIVE-6613
 Project: Hive
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: HIVE-6613.2.txt, HIVE-6613.3.patch, TEZ-6613.1.txt


 When running with Tez - a couple of enhancements are possible
 1) Avoid re-fetching data in case of MapJoins - since the data is likely to 
 be cached after the first run (container re-use for the same query)
 2) Start Outputs only after required Inputs are ready - specifically useful 
 in case of Reduce - where shuffle requires a large memory, and the Output (if 
 it's a sorted output) also requires a fair amount of memory.





[jira] [Updated] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]

2014-03-18 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6695:
---

Summary: bin/hcat should include hbase jar and dependencies in the 
classpath [followup/clone of HCATALOG-621]  (was: bin/hcat should include hbase 
jar and dependencies in the classpath [CLONE of HCATALOG-621)

 bin/hcat should include hbase jar and dependencies in the classpath 
 [followup/clone of HCATALOG-621]
 

 Key: HIVE-6695
 URL: https://issues.apache.org/jira/browse/HIVE-6695
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Nick Dimiduk

 This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
 seems to be in read-only mode. To quote Nick from the original bug:
 I'm not sure how this fixes anything for the error listed above. The find 
 command in the script we merged is broken, at least on linux. Maybe it worked 
 with BSD find and we both tested on Macs?
 From the patch we committed:
 {noformat}
 if [ -d ${HBASE_HOME} ] ; then
for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
   HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
done
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
 fi
 {noformat}
 The find command syntax is wrong – it returns no jars ever.
 {noformat}
 $ find /usr/lib/hbase -name *.jar
 $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
 $
 {noformat}
 What we need is more like:
 {noformat}
 $ find /usr/lib/hbase -name '*.jar'
 ... // prints lots of jars
 $ find /usr/lib/hbase -name '*.jar' | grep thrift
 /usr/lib/hbase/lib/libthrift-0.9.0.jar
 $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
 $
 {noformat}





[jira] [Updated] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]

2014-03-18 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6695:
---

Attachment: HIVE-6695.patch

(Attaching addendum patch that was uploaded to HCATALOG-621)

 bin/hcat should include hbase jar and dependencies in the classpath 
 [followup/clone of HCATALOG-621]
 

 Key: HIVE-6695
 URL: https://issues.apache.org/jira/browse/HIVE-6695
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Nick Dimiduk
 Attachments: HIVE-6695.patch


 This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
 seems to be in read-only mode. To quote Nick from the original bug:
 I'm not sure how this fixes anything for the error listed above. The find 
 command in the script we merged is broken, at least on linux. Maybe it worked 
 with BSD find and we both tested on Macs?
 From the patch we committed:
 {noformat}
 if [ -d ${HBASE_HOME} ] ; then
for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
   HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
done
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
 fi
 {noformat}
 The find command syntax is wrong – it returns no jars ever.
 {noformat}
 $ find /usr/lib/hbase -name *.jar
 $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
 $
 {noformat}
 What we need is more like:
 {noformat}
 $ find /usr/lib/hbase -name '*.jar'
 ... // prints lots of jars
 $ find /usr/lib/hbase -name '*.jar' | grep thrift
 /usr/lib/hbase/lib/libthrift-0.9.0.jar
 $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
 $
 {noformat}





[jira] [Commented] (HIVE-6468) HS2 out of memory error when curl sends a get request

2014-03-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939925#comment-13939925
 ] 

Lefty Leverenz commented on HIVE-6468:
--

Thanks, I put it in a warning box with this wording:  "In remote mode 
HiveServer2 only accepts valid Thrift calls – do not attempt to call it via 
http or telnet (HIVE-6468)."

Should it also explain that HS2 will die, or is that just until this jira's 
patch gets added?  Readers can click the link to this jira if they want to know 
the reason for the warning, but we could make it explicit if you think that's 
better.

By the way, *hive.server2.sasl.message.limit* needs some user doc.  It can go 
in a HiveConf.java comment for now, or in a release note, until we know when 
HIVE-6037 will get committed.

Quick ref:

* [new warning in Beeline – New Command Line Shell 
|https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Beeline–NewCommandLineShell]
* [page history:  new changes 
|https://cwiki.apache.org/confluence/pages/diffpages.action?pageId=30758725&originalId=40505296]

 HS2 out of memory error when curl sends a get request
 -

 Key: HIVE-6468
 URL: https://issues.apache.org/jira/browse/HIVE-6468
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
 Environment: Centos 6.3, hive 12, hadoop-2.2
Reporter: Abin Shahab
Assignee: Navis
 Attachments: HIVE-6468.1.patch.txt


 We see an out of memory error when we run simple beeline calls.
 (The hive.server2.transport.mode is binary)
 curl localhost:1
 Exception in thread "pool-2-thread-8" java.lang.OutOfMemoryError: Java heap 
 space
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
   at 
 org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
   at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
   at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
   at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
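What happens mechanically is that the binary transport interprets the first bytes of the HTTP request as a message length and then tries to allocate a buffer of that size. The sketch below illustrates that failure mode and the kind of cap a setting like *hive.server2.sasl.message.limit* implies; the limit value and function names are hypothetical, not Hive's actual code:

```python
import struct

# The binary transport reads a length prefix before the payload; the first
# four bytes of an HTTP request, read as a big-endian int, claim a huge
# message, and the server allocates a buffer of that size.
data = b"GET / HTTP/1.1\r\n"
claimed = struct.unpack(">i", data[:4])[0]  # b"GET " read as a 32-bit length

LIMIT = 100 * 1024 * 1024  # hypothetical cap on accepted message sizes

def checked_length(raw):
    # A sanity check on the claimed length prevents the huge allocation.
    n = struct.unpack(">i", raw[:4])[0]
    if n < 0 or n > LIMIT:
        raise ValueError("claimed message length %d exceeds limit" % n)
    return n

print(claimed)  # on the order of 1.2 billion bytes for b"GET "
```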





[jira] [Commented] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]

2014-03-18 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939932#comment-13939932
 ] 

Sushanth Sowmyan commented on HIVE-6695:


[~ndimiduk], I created this hive jira since I was not able to respond on 
HCATALOG-621, since that seems like it's been locked down.


+1 to the change, I'll go ahead and commit it.

I've experimented with both versions of the find command, and both work for me 
(with and without quotes, and in fact, I'm more used to the backslash 
notation). I'm using findutils-4.4.2-6.el6.x86_64. The main difference though, 
was that our offending jar is libthrift\*jar, not thrift\*jar.
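The behavioral difference between the quoting styles is easy to reproduce in isolation. This standalone Python demo (not bin/hcat itself; the jar names are made up) drives GNU find both ways:

```python
import os
import pathlib
import subprocess
import tempfile

# Create a directory containing two jars, mirroring an HBase lib dir.
base = pathlib.Path(tempfile.mkdtemp())
for name in ("hbase-client.jar", "libthrift-0.9.0.jar"):
    (base / name).touch()
os.chdir(base)

# Unquoted: the shell expands *.jar to both jar names before find runs,
# so find is invoked as `find . -name hbase-client.jar libthrift-0.9.0.jar`
# and rejects the extra path argument.
unquoted = subprocess.run("find . -name *.jar", shell=True,
                          capture_output=True, text=True)

# Quoted: find receives the literal pattern and does the matching itself.
quoted = subprocess.run("find . -name '*.jar' -not -name '*thrift*'",
                        shell=True, capture_output=True, text=True)
jars = sorted(pathlib.Path(p).name for p in quoted.stdout.split())

print(unquoted.returncode, jars)
```

Backslash-escaping (`\*.jar`) also prevents the expansion, which is consistent with both variants working when no matching files sit in the current directory.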


 bin/hcat should include hbase jar and dependencies in the classpath 
 [followup/clone of HCATALOG-621]
 

 Key: HIVE-6695
 URL: https://issues.apache.org/jira/browse/HIVE-6695
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Nick Dimiduk
 Attachments: HIVE-6695.patch


 This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
 seems to be in read-only mode. To quote Nick from the original bug:
 I'm not sure how this fixes anything for the error listed above. The find 
 command in the script we merged is broken, at least on linux. Maybe it worked 
 with BSD find and we both tested on Macs?
 From the patch we committed:
 {noformat}
 if [ -d ${HBASE_HOME} ] ; then
for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
   HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
done
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
 fi
 {noformat}
 The find command syntax is wrong – it returns no jars ever.
 {noformat}
 $ find /usr/lib/hbase -name *.jar
 $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
 $
 {noformat}
 What we need is more like:
 {noformat}
 $ find /usr/lib/hbase -name '*.jar'
 ... // prints lots of jars
 $ find /usr/lib/hbase -name '*.jar' | grep thrift
 /usr/lib/hbase/lib/libthrift-0.9.0.jar
 $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
 $
 {noformat}





[jira] [Created] (HIVE-6696) Implement DBMD.getIndexInfo()

2014-03-18 Thread Jonathan Seidman (JIRA)
Jonathan Seidman created HIVE-6696:
--

 Summary: Implement DBMD.getIndexInfo()
 Key: HIVE-6696
 URL: https://issues.apache.org/jira/browse/HIVE-6696
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.12.0
Reporter: Jonathan Seidman
Priority: Minor


HiveDatabaseMetaData.getIndexInfo() currently throws a not supported 
exception. There seems to be no technical obstacle to implementing this to 
return index info for tables with indexes defined, and probably an empty 
ResultSet for tables with no indexes.





[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-18 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6687:
-

Attachment: HIVE-6687.patch

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1

 Attachments: HIVE-6687.patch


 Getting a value from the result set using a fully qualified column name throws an 
 exception. The only workaround today is to use the column position instead of 
 the column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
 specifies it.
 The fix is to correct the result-set schema in the semantic analyzer.
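As a rough illustration of what a corrected result-set schema buys the client, the sketch below resolves qualified labels against a schema that keeps them. It is a hypothetical Python model, not the JDBC driver's actual lookup code:

```python
# Hypothetical model of label resolution once the schema preserves
# qualified names ("r1.x", "r2.x") instead of collapsing them.
schema = ["r1.x", "r2.x"]

def column_index(label):
    # Accept the qualified label directly, or an unqualified name when it
    # is unambiguous; mirrors what res.getInt("r1.x") needs to work.
    if label in schema:
        return schema.index(label)
    matches = [i for i, col in enumerate(schema) if col.split(".")[-1] == label]
    if len(matches) != 1:
        raise KeyError("unknown or ambiguous column: " + label)
    return matches[0]

print(column_index("r1.x"), column_index("r2.x"))
```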





[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name

2014-03-18 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-6687:
-

Status: Patch Available  (was: Open)

 JDBC ResultSet fails to get value by qualified projection name
 --

 Key: HIVE-6687
 URL: https://issues.apache.org/jira/browse/HIVE-6687
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.12.1

 Attachments: HIVE-6687.patch


 Getting a value from the result set using a fully qualified column name throws an 
 exception. The only workaround today is to use the column position instead of 
 the column label.
 {code}
 String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y";
 ResultSet res = stmt.executeQuery(sql);
 res.getInt("r1.x");
 {code}
 res.getInt("r1.x") would throw an "unknown column" exception even though the sql 
 specifies it.
 The fix is to correct the result-set schema in the semantic analyzer.





[jira] [Resolved] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]

2014-03-18 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan resolved HIVE-6695.


Resolution: Fixed

Committed. Thanks, Nick!

 bin/hcat should include hbase jar and dependencies in the classpath 
 [followup/clone of HCATALOG-621]
 

 Key: HIVE-6695
 URL: https://issues.apache.org/jira/browse/HIVE-6695
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Nick Dimiduk
 Attachments: HIVE-6695.patch


 This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
 seems to be in read-only mode. To quote Nick from the original bug:
 I'm not sure how this fixes anything for the error listed above. The find 
 command in the script we merged is broken, at least on linux. Maybe it worked 
 with BSD find and we both tested on Macs?
 From the patch we committed:
 {noformat}
 if [ -d ${HBASE_HOME} ] ; then
for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
   HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
done
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
 fi
 {noformat}
 The find command syntax is wrong – it returns no jars ever.
 {noformat}
 $ find /usr/lib/hbase -name *.jar
 $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
 $
 {noformat}
 What we need is more like:
 {noformat}
 $ find /usr/lib/hbase -name '*.jar'
 ... // prints lots of jars
 $ find /usr/lib/hbase -name '*.jar' | grep thrift
 /usr/lib/hbase/lib/libthrift-0.9.0.jar
 $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
 $
 {noformat}





[jira] [Commented] (HIVE-6645) to_date()/to_unix_timestamp() fail with NPE if input is null

2014-03-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939981#comment-13939981
 ] 

Ashutosh Chauhan commented on HIVE-6645:


+1

 to_date()/to_unix_timestamp() fail with NPE if input is null
 

 Key: HIVE-6645
 URL: https://issues.apache.org/jira/browse/HIVE-6645
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6645.1.patch, HIVE-6645.2.patch, HIVE-6645.2.patch


 {noformat}
 hive> describe tab2;
 Query ID = jdere_20140312185454_e3ed213e-8b3a-4963-b815-19965edad587
 OK
 c1      timestamp       None
 Time taken: 0.155 seconds, Fetched: 1 row(s)
 hive> select * from tab2;
 Query ID = jdere_20140312185454_8a009070-df79-45de-8642-e85668a378d7
 OK
 NULL
 NULL
 NULL
 NULL
 NULL
 Time taken: 0.067 seconds, Fetched: 5 row(s)
 hive> select to_unix_timestamp(c1) from tab2;
 hive> select to_date(c1) from tab2;
 {noformat}
 Fails with errors like:
 {noformat}
 java.lang.Exception: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {c1:null}
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:401)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {c1:null}
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:233)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
 at java.lang.Thread.run(Thread.java:680)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row {c1:null}
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
 at 
 org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
 ... 10 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
 to_date(c1)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
 ... 11 more
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDFDate.evaluate(GenericUDFDate.java:106)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
 at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
 ... 15 more
 {noformat}
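The null-propagation pattern the patch applies is the standard one for UDFs: return NULL on NULL input rather than dereferencing it. A minimal sketch in Python (illustrative only; the real fix lives in GenericUDFDate.evaluate()):

```python
import datetime

def to_date(ts):
    # Null propagation: the pre-patch evaluate() skipped this check and
    # dereferenced the null input, causing the NullPointerException above.
    if ts is None:
        return None
    return ts.date()

print(to_date(None))
print(to_date(datetime.datetime(2014, 3, 12, 18, 54)))
```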





[jira] [Commented] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]

2014-03-18 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939986#comment-13939986
 ] 

Nick Dimiduk commented on HIVE-6695:


Thanks [~sushanth]!

 bin/hcat should include hbase jar and dependencies in the classpath 
 [followup/clone of HCATALOG-621]
 

 Key: HIVE-6695
 URL: https://issues.apache.org/jira/browse/HIVE-6695
 Project: Hive
  Issue Type: Bug
Reporter: Sushanth Sowmyan
Assignee: Nick Dimiduk
 Attachments: HIVE-6695.patch


 This is to address the addendum of HCATALOG-621, now that the HCatalog jira 
 seems to be in read-only mode. To quote Nick from the original bug:
 I'm not sure how this fixes anything for the error listed above. The find 
 command in the script we merged is broken, at least on linux. Maybe it worked 
 with BSD find and we both tested on Macs?
 From the patch we committed:
 {noformat}
 if [ -d ${HBASE_HOME} ] ; then
for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do
   HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar}
done
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH}
 fi
 {noformat}
 The find command syntax is wrong – it returns no jars ever.
 {noformat}
 $ find /usr/lib/hbase -name *.jar
 $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar
 $
 {noformat}
 What we need is more like:
 {noformat}
 $ find /usr/lib/hbase -name '*.jar'
 ... // prints lots of jars
 $ find /usr/lib/hbase -name '*.jar' | grep thrift
 /usr/lib/hbase/lib/libthrift-0.9.0.jar
 $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift
 $
 {noformat}





[jira] [Updated] (HIVE-6660) HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request

2014-03-18 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-6660:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to trunk and 0.13 release branch.
Thanks Thejas and Vaibhav for review

 HiveServer2 running in non-http mode closes server socket for an SSL 
 connection after the 1st request
 -

 Key: HIVE-6660
 URL: https://issues.apache.org/jira/browse/HIVE-6660
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6660.1.patch, HIVE-6660.1.patch, hive-site.xml


 *Beeline connection string:*
 {code}
 !connect 
 jdbc:hive2://host:1/;ssl=true;sslTrustStore=/usr/share/doc/hive-0.13.0.2.1.1.0/examples/files/truststore.jks;trustStorePassword=HiveJdbc
  vgumashta vgumashta org.apache.hive.jdbc.HiveDriver 
 {code}
 *Error:*
 {code}
 pool-7-thread-1, handling exception: java.net.SocketTimeoutException: Read 
 timed out
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 pool-7-thread-1, SEND TLSv1 ALERT:  warning, description = close_notify
 Padded plaintext before ENCRYPTION:  len = 32
 : 01 00 BE 72 AC 10 3B FA   4E 01 A5 DE 9B 14 16 AF  ...r..;.N...
 0010: 4E DD 7A 29 AD B4 09 09   09 09 09 09 09 09 09 09  N.z)
 pool-7-thread-1, WRITE: TLSv1 Alert, length = 32
 [Raw write]: length = 37
 : 15 03 01 00 20 6C 37 82   A8 52 40 DA FB 83 2D CD   l7..R@...-.
 0010: 96 9F F0 B7 22 17 E1 04   C1 D1 93 1B C4 39 5A B0  9Z.
 0020: A2 3F 5D 7D 2D .?].-
 pool-7-thread-1, called closeSocket(selfInitiated)
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 pool-7-thread-1, called close()
 pool-7-thread-1, called closeInternal(true)
 {code}
 *Subsequent queries fail:*
 {code}
 main, WRITE: TLSv1 Application Data, length = 144
 main, handling exception: java.net.SocketException: Broken pipe
 %% Invalidated:  [Session-1, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
 main, SEND TLSv1 ALERT:  fatal, description = unexpected_message
 Padded plaintext before ENCRYPTION:  len = 32
 : 02 0A 52 C3 18 B1 C1 38   DB 3F B6 D1 C5 CA 14 9C  ..R8.?..
 0010: A5 38 4C 01 31 69 09 09   09 09 09 09 09 09 09 09  .8L.1i..
 main, WRITE: TLSv1 Alert, length = 32
 main, Exception sending alert: java.net.SocketException: Broken pipe
 main, called closeSocket()
 Error: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe (state=08S01,code=0)
 java.sql.SQLException: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe
   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:226)
   at org.apache.hive.beeline.Commands.execute(Commands.java:736)
   at org.apache.hive.beeline.Commands.sql(Commands.java:657)
   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:796)
   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:601)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Broken pipe
   at 
 org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
   at 
 org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:471)
   at 
 org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37)
   at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:219)
   at 
 org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:211)
   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:220)
   ... 11 more
 Caused by: java.net.SocketException: Broken pipe
   at java.net.SocketOutputStream.socketWrite0(Native Method)
   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
   at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
   at 

[jira] [Commented] (HIVE-3969) Session state for hive server should be cleaned-up

2014-03-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940009#comment-13940009
 ] 

Navis commented on HIVE-3969:
-

Ok, I'll take a look. I wish HIVE-3907 would be considered, too.

 Session state for hive server should be cleaned-up
 --

 Key: HIVE-3969
 URL: https://issues.apache.org/jira/browse/HIVE-3969
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-3969.D8325.1.patch


 Currently, add jar commands from clients cumulatively add child ClassLoaders 
 to worker threads, causing various problems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6364) HiveServer2 - Request serving thread should get class loader from existing SessionState

2014-03-18 Thread Jaideep Dhok (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940018#comment-13940018
 ] 

Jaideep Dhok commented on HIVE-6364:


[~ashutoshc] I will put up a new patch.
[~jdere] Add jar will always update the class loader; that's the current 
behaviour. I think the first class loader is set using the conf.getClassLoader 
method; if nothing is set, it returns the default class loader.
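The class-loader propagation being discussed can be sketched in miniature as follows. This is an illustrative toy under stated assumptions, not the actual HS2 code: `Session`, `addJar`, and `serve` are invented names standing in for SessionState's conf class loader and the request-serving thread.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class SessionLoaderSketch {
    // Hypothetical sketch (not the actual HIVE-6364 patch): "add jar" wraps
    // the session's loader in a child URLClassLoader, and every request-serving
    // thread installs that loader as its context class loader before doing
    // work, so a jar added on thread th1 is visible on thread th2.
    static class Session {
        volatile ClassLoader loader = SessionLoaderSketch.class.getClassLoader();

        void addJar(URL[] jarUrls) {
            // Mirrors the effect of "add jar": the current loader becomes the
            // parent of a new child loader that also sees the added jars.
            loader = new URLClassLoader(jarUrls, loader);
        }
    }

    static void serve(Session session, Runnable work) {
        ClassLoader previous = Thread.currentThread().getContextClassLoader();
        // Pick up the loader stored in the session, not the thread's own.
        Thread.currentThread().setContextClassLoader(session.loader);
        try {
            work.run();
        } finally {
            Thread.currentThread().setContextClassLoader(previous);
        }
    }
}
```

Without the install step in `serve`, a second thread would resolve classes against its own stale context loader, reproducing the ClassNotFoundException scenario from the issue description.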



 HiveServer2 - Request serving thread should get class loader from existing 
 SessionState
 ---

 Key: HIVE-6364
 URL: https://issues.apache.org/jira/browse/HIVE-6364
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Jaideep Dhok
 Attachments: HIVE-6364.1.patch


 SessionState is created for each session in HS2. If we do any add jars, a 
 class loader is set in the SessionState's conf object. This class loader 
 should also be set in each thread that serves requests of the same session.
 Scenario (both requests are in the same session)-
 {noformat}
 // req 1
 add jar foo.jar // Served by thread th1, this updates class loader and sets 
 in SessionState.conf
 // req2 served by th2, such that th1 != th2
 CREATE TEMPORARY FUNCTION foo_udf AS 'some class in foo.jar' 
 // This can throw class not found error, because although 
 // the new thread (th2) gets the same session state as th1,
 // the class loader is different (Thread.currentThread.getContextClassLoader()
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego

2014-03-18 Thread Dilli Arumugam (JIRA)
Dilli Arumugam created HIVE-6697:


 Summary: HiveServer2 secure thrift/http authentication needs to 
support SPNego 
 Key: HIVE-6697
 URL: https://issues.apache.org/jira/browse/HIVE-6697
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Dilli Arumugam


Looking to integrate Apache Knox with HiveServer2 secure thrift/http.

Found that thrift/http uses a form of Kerberos authentication that is not 
SPNego. Since it goes over the HTTP protocol, I expected it to use the SPNego 
protocol.

Apache Knox is already integrated with WebHDFS, WebHCat, Oozie and HBase 
Stargate using SPNego for authentication.

Requesting that HiveServer2 secure thrift/http authentication support SPNego.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego

2014-03-18 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-6697:
--

Assignee: Vaibhav Gumashta

 HiveServer2 secure thrift/http authentication needs to support SPNego 
 --

 Key: HIVE-6697
 URL: https://issues.apache.org/jira/browse/HIVE-6697
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Dilli Arumugam
Assignee: Vaibhav Gumashta

 Looking to integrate Apache Knox with HiveServer2 secure 
 thrift/http.
 Found that thrift/http uses a form of Kerberos authentication that is not 
 SPNego. Since it goes over the HTTP protocol, I expected it to use the SPNego 
 protocol.
 Apache Knox is already integrated with WebHDFS, WebHCat, Oozie and HBase 
 Stargate using SPNego for authentication.
 Requesting that HiveServer2 secure thrift/http authentication support SPNego.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6676) hcat cli fails to run when running with hive on tez

2014-03-18 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-6676:


Fix Version/s: 0.13.0

 hcat cli fails to run when running with hive on tez
 ---

 Key: HIVE-6676
 URL: https://issues.apache.org/jira/browse/HIVE-6676
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Blocker
 Fix For: 0.13.0

 Attachments: HIVE-6676.patch


 HIVE_CLASSPATH should be added to HADOOP_CLASSPATH before launching hcat CLI



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 19322: HIVE-6685 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments

2014-03-18 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19322/
---

(Updated March 19, 2014, 1:25 a.m.)


Review request for hive and Xuefu Zhang.


Changes
---

Refactored the arg-passing from manual list iteration to a simple extension of 
GNUParser, mostly borrowing the code from HiveCLI.

Extending the GNUParser is needed because it doesn't support unknown 
arguments. In Beeline's case, these are the 'property-files' and the 
reflectively-set BeelineOpts like 'autoCommit', etc.

Adding a unit test to verify the parsing doesn't break anything.
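The two behaviours the parser extension needs can be illustrated without commons-cli; this is a hedged, self-contained sketch (the names `parse` and `unrecognized` are invented, and Beeline's real code extends GnuParser instead): tolerate unrecognized tokens, and report a missing option value cleanly instead of throwing ArrayIndexOutOfBoundsException.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LenientArgsSketch {
    // Toy parser: known options consume the next token as their value;
    // anything unrecognized is collected instead of causing a failure.
    static Map<String, String> parse(Set<String> known, String[] args,
                                     List<String> unrecognized) {
        Map<String, String> opts = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            String a = args[i];
            if (known.contains(a)) {
                if (i + 1 >= args.length) {            // e.g. "beeline -u" with no URL
                    throw new IllegalArgumentException("Missing value for " + a);
                }
                opts.put(a, args[++i]);
            } else {
                unrecognized.add(a);                   // e.g. reflectively-set BeelineOpts
            }
        }
        return opts;
    }
}
```

The bounds check before `args[++i]` is the essential difference from indexing blindly, which is what produced the original exception.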


Bugs: HIVE-6685
https://issues.apache.org/jira/browse/HIVE-6685


Repository: hive-git


Description
---

Improving the error-handling in ArrayIndexOutOfBoundsException of Beeline.


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 3482186 
  beeline/src/test/org/apache/hive/beeline/TestBeelineArgParsing.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/19322/diff/


Testing
---

Manual test.  Now, in this scenario it will display the usage like:

beeline -u
Usage: java org.apache.hive.cli.beeline.BeeLine 
   -u database url   the JDBC URL to connect to
   -n username   the username to connect as
   -p password   the password to connect as
...


Thanks,

Szehon Ho



[jira] [Updated] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments

2014-03-18 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6685:


Attachment: HIVE-6685.2.patch

Thanks for the review and suggestion.  I refactored Beeline to use the GNU 
Parser; it is a much cleaner solution.

 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
 --

 Key: HIVE-6685
 URL: https://issues.apache.org/jira/browse/HIVE-6685
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.12.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-6685.2.patch, HIVE-6685.patch


 Noticed that there is an ugly ArrayIndexOutOfBoundsException for mismatched 
 arguments in the beeline prompt.  It would be nice to clean up.
 Example:
 {noformat}
 beeline -u szehon -p
 Exception in thread main java.lang.ArrayIndexOutOfBoundsException: 3
   at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:560)
   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:628)
   at 
 org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:366)
   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:349)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 19387: Session state for hive server should be cleaned-up

2014-03-18 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19387/
---

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-3969
https://issues.apache.org/jira/browse/HIVE-3969


Repository: hive-git


Description
---

Currently, add jar commands from clients cumulatively add child ClassLoaders 
to worker threads, causing various problems.


Diffs
-

  common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 0dba331 
  ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java a249d74 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/LineageCtx.java 
8d6ebaa 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 5546d03 
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java 
32e78ac 
  
ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPNumeric.java 
2ada2ff 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
ace791a 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
ef5b5c6 

Diff: https://reviews.apache.org/r/19387/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Commented] (HIVE-3969) Session state for hive server should be cleaned-up

2014-03-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940052#comment-13940052
 ] 

Navis commented on HIVE-3969:
-

[~ashutoshc] I just fixed the hiveserver2 case. Let's deprecate the old hiveserver.

 Session state for hive server should be cleaned-up
 --

 Key: HIVE-3969
 URL: https://issues.apache.org/jira/browse/HIVE-3969
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-3969.1.patch.txt, HIVE-3969.D8325.1.patch


 Currently, add jar commands from clients cumulatively add child ClassLoaders 
 to worker threads, causing various problems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-3969) Session state for hive server should be cleaned-up

2014-03-18 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3969:


Attachment: HIVE-3969.1.patch.txt

 Session state for hive server should be cleaned-up
 --

 Key: HIVE-3969
 URL: https://issues.apache.org/jira/browse/HIVE-3969
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-3969.1.patch.txt, HIVE-3969.D8325.1.patch


 Currently, add jar commands from clients cumulatively add child ClassLoaders 
 to worker threads, causing various problems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6677) HBaseSerDe needs to be refactored

2014-03-18 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-6677:
--

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks to Prasad for the review.

 HBaseSerDe needs to be refactored
 -

 Key: HIVE-6677
 URL: https://issues.apache.org/jira/browse/HIVE-6677
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.14.0

 Attachments: HIVE-6677.1.patch, HIVE-6677.2.patch, HIVE-6677.3.patch, 
 HIVE-6677.patch


 The code in HBaseSerDe seems very complex and hard to extend to support 
 new features such as adding a generic compound key (HIVE-6411) and compound key 
 filter (HIVE-6290), especially when handling key/field serialization. Hopefully 
 this task will clean up the code a bit and make it ready for new extensions. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead

2014-03-18 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6430:
---

Attachment: HIVE-6430.05.patch

rebase, incorporate not enabling for decimal

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
 HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
 for row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need to have a Java hash table there.  We can either use a primitive-friendly 
 hashtable like the one from HPPC (Apache-licensed), or some variation, to map 
 primitive keys to a single row storage structure without an object per row 
 (similar to vectorization).
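The primitive-friendly hashtable idea can be sketched as a small open-addressing map from long keys to offsets into some shared row buffer. This is illustrative only (fixed capacity of 16, no resizing or deletion), and is not the actual BytesBytesMultiHashMap from the patch:

```java
public class LongOffsetMap {
    // Open addressing with linear probing over parallel primitive arrays:
    // no Entry objects, no boxing, one int offset per key. Capacity is a
    // fixed power of two here purely to keep the sketch short.
    private final long[] keys = new long[16];
    private final int[] offsets = new int[16];
    private final boolean[] used = new boolean[16];

    public void put(long key, int offset) {
        int i = slot(key);
        keys[i] = key;
        offsets[i] = offset;
        used[i] = true;
    }

    public int get(long key) {
        int i = slot(key);
        return used[i] && keys[i] == key ? offsets[i] : -1;   // -1 = not found
    }

    private int slot(long key) {
        // Fold the long into an int hash, mask to table size, probe linearly.
        int i = (int) (key ^ (key >>> 32)) & (keys.length - 1);
        while (used[i] && keys[i] != key) {
            i = (i + 1) & (keys.length - 1);
        }
        return i;
    }
}
```

The memory win over `java.util.HashMap<Long, Row>` comes from the absence of per-entry objects: each mapping costs a handful of bytes in flat arrays rather than a boxed key, an Entry node, and a row object.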



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead

2014-03-18 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18936/
---

(Updated March 19, 2014, 2:40 a.m.)


Review request for hive, Gopal V and Gunther Hagleitner.


Repository: hive-git


Description
---

See JIRA


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b0f5c49 
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 2cd65cb 
  
hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
 704fcb9 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 7dbb8be 
  ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 170e8c0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 3ea9c96 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
 8854b19 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
9df425b 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
64f0be2 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 
008a8db 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
 988959f 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
 55b7415 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 79af08d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
eef7656 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java 
0fd4983 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java
 PRE-CREATION 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
 65e3779 
  
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
 755d783 
  ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be 
  ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 
  ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out d79b984 
  ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 3c55b5c 
  ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 284cc03 
  serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 
  serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION 
  serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 
5870884 
  
serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java
 bab505e 
  serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 
6f344bb 
  serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd 
  serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java 
a99c7b4 
  serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java 
435d6c6 
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 
82c1263 
  serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java 
b188c3f 
  serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryUtils.java 
6c14081 
  
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java
 06d5c5e 
  serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java 
868dd4c 
  
serde/src/test/org/apache/hadoop/hive/serde2/thrift_test/CreateSequenceFile.java
 1fb49e5 

Diff: https://reviews.apache.org/r/18936/diff/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead

2014-03-18 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18936/#review37681
---



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java
https://reviews.apache.org/r/18936/#comment69316

should have been changed, will do


- Sergey Shelukhin


On March 19, 2014, 2:40 a.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18936/
 ---
 
 (Updated March 19, 2014, 2:40 a.m.)
 
 
 Review request for hive, Gopal V and Gunther Hagleitner.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 See JIRA
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b0f5c49 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 2cd65cb 
   
 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
  704fcb9 
   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 7dbb8be 
   ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 
 170e8c0 
   ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 3ea9c96 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
  8854b19 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 
 9df425b 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java
  PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 
 64f0be2 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java
  008a8db 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
  988959f 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
  55b7415 
   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 79af08d 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java 
 eef7656 
   
 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java
  0fd4983 
   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java
  PRE-CREATION 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
  65e3779 
   
 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
  755d783 
   ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be 
   ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 
   ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out d79b984 
   ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 3c55b5c 
   ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 284cc03 
   serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 
   serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION 
   serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 
 5870884 
   
 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java
  bab505e 
   serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 
 6f344bb 
   serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd 
   serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java 
 a99c7b4 
   serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java 
 435d6c6 
   serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 
 82c1263 
   
 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java 
 b188c3f 
   
 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryUtils.java 
 6c14081 
   
 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java
  06d5c5e 
   serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java 
 868dd4c 
   
 serde/src/test/org/apache/hadoop/hive/serde2/thrift_test/CreateSequenceFile.java
  1fb49e5 
 
 Diff: https://reviews.apache.org/r/18936/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Commented] (HIVE-5998) Add vectorized reader for Parquet files

2014-03-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940130#comment-13940130
 ] 

Hive QA commented on HIVE-5998:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12635061/HIVE-5998.10.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5412 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1869/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1869/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12635061

 Add vectorized reader for Parquet files
 ---

 Key: HIVE-5998
 URL: https://issues.apache.org/jira/browse/HIVE-5998
 Project: Hive
  Issue Type: Sub-task
  Components: Serializers/Deserializers, Vectorization
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Priority: Minor
  Labels: Parquet, vectorization
 Attachments: HIVE-5998.1.patch, HIVE-5998.10.patch, 
 HIVE-5998.2.patch, HIVE-5998.3.patch, HIVE-5998.4.patch, HIVE-5998.5.patch, 
 HIVE-5998.6.patch, HIVE-5998.7.patch, HIVE-5998.8.patch, HIVE-5998.9.patch


 HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar 
 format, it makes sense to provide a vectorized reader, similar to what the RC 
 and ORC formats have, to benefit from the vectorized execution engine.
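The payoff of a vectorized reader is that a column is materialized into a primitive array (one batch of ~1024 values plus a null mask) and processed in a tight loop, instead of one object-wrapped row at a time. A hedged sketch of that shape, not the actual Parquet or Hive vectorization code:

```java
public class VectorBatchSketch {
    static final int BATCH_SIZE = 1024;   // typical batch size; illustrative

    // Operate on a whole column batch at once: a branch-light, cache-friendly
    // loop over primitive arrays, with nulls tracked in a parallel mask.
    static long sumBatch(long[] col, boolean[] isNull, int size) {
        long sum = 0;
        for (int i = 0; i < size; i++) {
            if (!isNull[i]) {
                sum += col[i];
            }
        }
        return sum;
    }
}
```

A columnar format like Parquet already stores values contiguously per column, so filling such a batch is a near-sequential copy, which is why the combination is attractive.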



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6579) HiveLockObjectData constructor makes too many queryStr instance causing oom

2014-03-18 Thread xieyuchen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xieyuchen updated HIVE-6579:


Attachment: HIVE-6579.02.patch

 HiveLockObjectData constructor makes too many queryStr instance causing oom
 ---

 Key: HIVE-6579
 URL: https://issues.apache.org/jira/browse/HIVE-6579
 Project: Hive
  Issue Type: Improvement
Reporter: xieyuchen
 Attachments: HIVE-6579.02.patch, HIVE-6579.1.patch.txt


 We have a huge SQL statement which full outer joins 10+ partitioned tables; 
 each table has at least 1k partitions. The SQL is 300kb in length (it is 
 constructed automatically, of course).
 So when we run this SQL, there are over 10k HiveLockObjectData instances. 
 Because the HiveLockObjectData constructor trims the queryStr, there will be 
 10k individual String instances, each 300kb in length! The Hive client then 
 gets an OOM exception.
 Trying to trim the queryStr in the Driver.compile function instead of doing it 
 in the HiveLockObjectData constructor to reduce memory waste.
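A hedged sketch of the fix idea (class and field names here are illustrative, not the actual Hive classes): trim the query string once up front and let every lock object share that single instance, instead of calling trim() per constructor, which may allocate a fresh 300kb copy each time.

```java
public class QueryStrSketch {
    // Stand-in for HiveLockObjectData: stores the query string as given,
    // assuming the caller already trimmed it exactly once.
    static final class LockData {
        final String queryStr;

        LockData(String alreadyTrimmedQueryStr) {
            this.queryStr = alreadyTrimmedQueryStr;   // no per-instance trim/copy
        }
    }

    static String compileOnce(String rawQuery) {
        // Done a single time, at the point where Driver.compile would run.
        return rawQuery.trim();
    }
}
```

With 10k lock objects, this changes the footprint from 10k independent 300kb strings to one shared string referenced 10k times.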



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6579) HiveLockObjectData constructor makes too many queryStr instance causing oom

2014-03-18 Thread xieyuchen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xieyuchen updated HIVE-6579:


Status: Open  (was: Patch Available)

 HiveLockObjectData constructor makes too many queryStr instance causing oom
 ---

 Key: HIVE-6579
 URL: https://issues.apache.org/jira/browse/HIVE-6579
 Project: Hive
  Issue Type: Improvement
Reporter: xieyuchen
 Attachments: HIVE-6579.02.patch, HIVE-6579.1.patch.txt


 We have a huge SQL statement which full outer joins 10+ partitioned tables; 
 each table has at least 1k partitions. The SQL is 300kb in length (it is 
 constructed automatically, of course).
 So when we run this SQL, there are over 10k HiveLockObjectData instances. 
 Because the HiveLockObjectData constructor trims the queryStr, there will be 
 10k individual String instances, each 300kb in length! The Hive client then 
 gets an OOM exception.
 Trying to trim the queryStr in the Driver.compile function instead of doing it 
 in the HiveLockObjectData constructor to reduce memory waste.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6579) HiveLockObjectData constructor makes too many queryStr instance causing oom

2014-03-18 Thread xieyuchen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xieyuchen updated HIVE-6579:


Status: Patch Available  (was: Open)

 HiveLockObjectData constructor makes too many queryStr instance causing oom
 ---

 Key: HIVE-6579
 URL: https://issues.apache.org/jira/browse/HIVE-6579
 Project: Hive
  Issue Type: Improvement
Reporter: xieyuchen
 Attachments: HIVE-6579.02.patch, HIVE-6579.1.patch.txt


 We have a huge SQL statement which full outer joins 10+ partitioned tables; 
 each table has at least 1k partitions. The SQL is 300kb in length (it is 
 constructed automatically, of course).
 So when we run this SQL, there are over 10k HiveLockObjectData instances. 
 Because the HiveLockObjectData constructor trims the queryStr, there will be 
 10k individual String instances, each 300kb in length! The Hive client then 
 gets an OOM exception.
 Trying to trim the queryStr in the Driver.compile function instead of doing it 
 in the HiveLockObjectData constructor to reduce memory waste.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-3969) Session state for hive server should be cleaned-up

2014-03-18 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940208#comment-13940208
 ] 

Navis commented on HIVE-3969:
-

Didn't verify fd leakage; just verified that newly created loaders are released 
when the session is closed.
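The cleanup idea can be sketched as follows. This is a hedged illustration, not the actual HIVE-3969 patch: on session close, walk the chain of child class loaders created by add jar and close each one until the original (pre-session) loader is reached.

```java
import java.io.Closeable;
import java.io.IOException;

public class LoaderCleanupSketch {
    // Walk from the session's current loader up the parent chain to the base
    // loader, closing each intermediate loader. URLClassLoader implements
    // Closeable since Java 7, so added-jar loaders can release their files.
    static int closeAddedLoaders(ClassLoader current, ClassLoader base) {
        int closed = 0;
        while (current != null && current != base) {
            ClassLoader parent = current.getParent();
            if (current instanceof Closeable) {
                try {
                    ((Closeable) current).close();
                    closed++;
                } catch (IOException ignored) {
                    // best-effort cleanup on session close
                }
            }
            current = parent;
        }
        return closed;
    }
}
```

Returning the count is purely for observability in this sketch; the point is that without such a walk, each add jar leaves another loader (and its open jar files) pinned for the life of the worker thread.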

 Session state for hive server should be cleaned-up
 --

 Key: HIVE-3969
 URL: https://issues.apache.org/jira/browse/HIVE-3969
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-3969.1.patch.txt, HIVE-3969.D8325.1.patch


 Currently, add jar commands from clients cumulatively add child ClassLoaders 
 to worker threads, causing various problems.



--
This message was sent by Atlassian JIRA
(v6.2#6252)