Re: Review Request 19212: HIVE-6645: to_date()/to_unix_timestamp() fail with NPE if input is null
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19212/#review37540 --- Ship it! +1 - Mohammad Islam On March 14, 2014, 9:23 p.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19212/ --- (Updated March 14, 2014, 9:23 p.m.) Review request for hive. Bugs: HIVE-6645 https://issues.apache.org/jira/browse/HIVE-6645 Repository: hive-git Description --- - fix null inputs - allow char/varchar params - tests Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDate.java c31174a ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java dc259c6 ql/src/test/org/apache/hadoop/hive/ql/udf/TestGenericUDFDate.java 384ce4e ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java PRE-CREATION Diff: https://reviews.apache.org/r/19212/diff/ Testing --- Thanks, Jason Dere
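The null-guard pattern this patch describes ("fix null inputs") can be illustrated with a minimal sketch. This is a hypothetical simplification for readers of the digest, not the actual GenericUDFDate diff: a date-extraction function should return null for a null argument instead of letting it propagate into a NullPointerException.

```java
// Minimal sketch of the null-guard idea in HIVE-6645 (hypothetical
// simplification, not the actual GenericUDFDate patch): null in, null out,
// matching SQL semantics, instead of an NPE.
public class ToDateSketch {

    // Extracts the "yyyy-MM-dd" prefix of a "yyyy-MM-dd HH:mm:ss" timestamp.
    public static String toDate(String timestamp) {
        if (timestamp == null || timestamp.length() < 10) {
            return null; // guard: never dereference a null/short input
        }
        return timestamp.substring(0, 10);
    }
}
```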
[jira] [Commented] (HIVE-6660) HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request
[ https://issues.apache.org/jira/browse/HIVE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938884#comment-13938884 ] Prasad Mujumdar commented on HIVE-6660: --- Patch committed to trunk. [~rhbutani] This should be a blocker for hive 0.13. Requesting approval to push the patch to 0.13 release branch. HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request - Key: HIVE-6660 URL: https://issues.apache.org/jira/browse/HIVE-6660 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Prasad Mujumdar Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6660.1.patch, HIVE-6660.1.patch, hive-site.xml *Beeline connection string:* {code} !connect jdbc:hive2://host:1/;ssl=true;sslTrustStore=/usr/share/doc/hive-0.13.0.2.1.1.0/examples/files/truststore.jks;trustStorePassword=HiveJdbc vgumashta vgumashta org.apache.hive.jdbc.HiveDriver {code} *Error:* {code} pool-7-thread-1, handling exception: java.net.SocketTimeoutException: Read timed out pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) pool-7-thread-1, SEND TLSv1 ALERT: warning, description = close_notify Padded plaintext before ENCRYPTION: len = 32 : 01 00 BE 72 AC 10 3B FA 4E 01 A5 DE 9B 14 16 AF ...r..;.N... 0010: 4E DD 7A 29 AD B4 09 09 09 09 09 09 09 09 09 09 N.z) pool-7-thread-1, WRITE: TLSv1 Alert, length = 32 [Raw write]: length = 37 : 15 03 01 00 20 6C 37 82 A8 52 40 DA FB 83 2D CD l7..R@...-. 0010: 96 9F F0 B7 22 17 E1 04 C1 D1 93 1B C4 39 5A B0 9Z. 
0020: A2 3F 5D 7D 2D .?].- pool-7-thread-1, called closeSocket(selfInitiated) pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) {code} *Subsequent queries fail:* {code} main, WRITE: TLSv1 Application Data, length = 144 main, handling exception: java.net.SocketException: Broken pipe %% Invalidated: [Session-1, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA] main, SEND TLSv1 ALERT: fatal, description = unexpected_message Padded plaintext before ENCRYPTION: len = 32 : 02 0A 52 C3 18 B1 C1 38 DB 3F B6 D1 C5 CA 14 9C ..R8.?.. 0010: A5 38 4C 01 31 69 09 09 09 09 09 09 09 09 09 09 .8L.1i.. main, WRITE: TLSv1 Alert, length = 32 main, Exception sending alert: java.net.SocketException: Broken pipe main, called closeSocket() Error: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe (state=08S01,code=0) java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:226) at org.apache.hive.beeline.Commands.execute(Commands.java:736) at org.apache.hive.beeline.Commands.sql(Commands.java:657) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:796) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe at 
org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161) at org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:471) at org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37) at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65) at org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:219) at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:211) at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:220) ... 11 more Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) at
Re: Review Request 19329: Make it configurable to have partition columns displayed separately or not.
On March 17, 2014, 11:42 p.m., Jason Dere wrote: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 837 https://reviews.apache.org/r/19329/diff/1/?file=525742#file525742line837 I'm sure Lefty will mention this too, I believe new config settings also should have updated entry in conf/hive-default.xml.template. Yes, but no. It all depends on when HIVE-6037 gets committed because after that hive-default.xml.template will be generated from HiveConf.java, which will include descriptions in the parameter definitions. Anyway, the new patch for this jira has a description in hive-default.xml.template so that can go into HiveConf.java when the time comes. - Lefty --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19329/#review37501 --- On March 17, 2014, 11:57 p.m., Ashutosh Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19329/ --- (Updated March 17, 2014, 11:57 p.m.) Review request for hive and Jason Dere. Bugs: HIVE-6689 https://issues.apache.org/jira/browse/HIVE-6689 Repository: hive-git Description --- Make it configurable to have partition columns displayed separately or not. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b0f5c49 conf/hive-default.xml.template a8da2ca ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java de04cca ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java 0c49250 ql/src/test/queries/clientpositive/desc_tbl_part_cols.q PRE-CREATION ql/src/test/results/clientpositive/desc_tbl_part_cols.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19329/diff/ Testing --- Added a test case. Thanks, Ashutosh Chauhan
[jira] [Commented] (HIVE-6658) Modify Alter_numbuckets* test to reflect hadoop2 changes
[ https://issues.apache.org/jira/browse/HIVE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938916#comment-13938916 ] Szehon Ho commented on HIVE-6658: - Yea, looks good, thanks Modify Alter_numbuckets* test to reflect hadoop2 changes Key: HIVE-6658 URL: https://issues.apache.org/jira/browse/HIVE-6658 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Attachments: HIVE-6658.2.patch Hadoop2 now honors the number-of-reducers config while running in local mode. This affects bucketing tests, as the data gets properly bucketed in Hadoop2 (in Hadoop1, all data ended up in the same bucket in local mode). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6692) Location for new table or partition should be a write entity
Navis created HIVE-6692: --- Summary: Location for new table or partition should be a write entity Key: HIVE-6692 URL: https://issues.apache.org/jira/browse/HIVE-6692 Project: Hive Issue Type: Task Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Locations for create table and alter table add partition should be write entities. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19329: Make it configurable to have partition columns displayed separately or not.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19329/#review37543 --- Ship it! - Jason Dere On March 17, 2014, 11:57 p.m., Ashutosh Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19329/ --- (Updated March 17, 2014, 11:57 p.m.) Review request for hive and Jason Dere. Bugs: HIVE-6689 https://issues.apache.org/jira/browse/HIVE-6689 Repository: hive-git Description --- Make it configurable to have partition columns displayed separately or not. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b0f5c49 conf/hive-default.xml.template a8da2ca ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java de04cca ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java 0c49250 ql/src/test/queries/clientpositive/desc_tbl_part_cols.q PRE-CREATION ql/src/test/results/clientpositive/desc_tbl_part_cols.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19329/diff/ Testing --- Added a test case. Thanks, Ashutosh Chauhan
[jira] [Commented] (HIVE-6689) Provide an option to not display partition columns separately in describe table output
[ https://issues.apache.org/jira/browse/HIVE-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938925#comment-13938925 ] Jason Dere commented on HIVE-6689: -- +1 Provide an option to not display partition columns separately in describe table output --- Key: HIVE-6689 URL: https://issues.apache.org/jira/browse/HIVE-6689 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-6689.1.patch, HIVE-6689.patch In older versions of Hive, partition columns were not displayed separately; in newer versions they are. This has resulted in a backward-incompatible change for upgrade scenarios. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6692) Location for new table or partition should be a write entity
[ https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6692: Attachment: HIVE-6692.1.patch.txt Location for new table or partition should be a write entity Key: HIVE-6692 URL: https://issues.apache.org/jira/browse/HIVE-6692 Project: Hive Issue Type: Task Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6692.1.patch.txt Locations for create table and alter table add partition should be write entities. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 19344: Location for new table or partition should be a write entity
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19344/ --- Review request for hive and Thejas Nair. Bugs: HIVE-6692 https://issues.apache.org/jira/browse/HIVE-6692 Repository: hive-git Description --- Locations for create table and alter table add partition should be write entities. Diffs - common/src/java/org/apache/hadoop/hive/common/FileUtils.java 16d7c80 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 7dbb8be ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java 2a38aad ql/src/java/org/apache/hadoop/hive/ql/hooks/ReadEntity.java 6d7c4f6 ql/src/java/org/apache/hadoop/hive/ql/hooks/WriteEntity.java 44a3924 ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java db9fa74 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java e642919 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 92ec334 ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 6c53447 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java e1e427f ql/src/test/results/clientnegative/archive_multi7.q.out a8eee2f ql/src/test/results/clientnegative/authorization_droppartition.q.out 1da250a ql/src/test/results/clientnegative/authorization_uri_alterpart_loc.q.out 39a4e4f ql/src/test/results/clientnegative/authorization_uri_create_table1.q.out 0b8182a ql/src/test/results/clientnegative/authorization_uri_create_table_ext.q.out 0b8182a ql/src/test/results/clientnegative/deletejar.q.out 91560ee ql/src/test/results/clientnegative/exim_20_managed_location_over_existing.q.out fd4a418 ql/src/test/results/clientnegative/external1.q.out 696beaa ql/src/test/results/clientnegative/external2.q.out a604885 ql/src/test/results/clientnegative/insertexternal1.q.out 3df5013 Diff: https://reviews.apache.org/r/19344/diff/ Testing --- Thanks, Navis Ryu
[jira] [Updated] (HIVE-6692) Location for new table or partition should be a write entity
[ https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6692: Status: Patch Available (was: Open) Location for new table or partition should be a write entity Key: HIVE-6692 URL: https://issues.apache.org/jira/browse/HIVE-6692 Project: Hive Issue Type: Task Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6692.1.patch.txt Locations for create table and alter table add partition should be write entities. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-6222: --- Attachment: HIVE-6222.4.patch .4.patch rebased to latest trunk and merges HIVE-6518 Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-6222: --- Status: Open (was: Patch Available) Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-6222: --- Status: Patch Available (was: Open) Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6222) Make Vector Group By operator abandon grouping if too many distinct keys
[ https://issues.apache.org/jira/browse/HIVE-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938967#comment-13938967 ] Remus Rusanu commented on HIVE-6222: [~gopalv] I've merged the HIVE-6518 fix into the refactoring of VectorGroupByOperator, see .4.patch. Everything GCCanary-related is moved into the ProcessingModeHashAggregate class. Make Vector Group By operator abandon grouping if too many distinct keys Key: HIVE-6222 URL: https://issues.apache.org/jira/browse/HIVE-6222 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.0 Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: vectorization Attachments: HIVE-6222.1.patch, HIVE-6222.2.patch, HIVE-6222.3.patch, HIVE-6222.4.patch Row mode GBY is becoming a pass-through if not enough aggregation occurs on the map side, relying on the shuffle+reduce side to do the work. Have VGBY do the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
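The "abandon grouping" heuristic the HIVE-6222 description alludes to can be sketched as follows. This is my own illustrative model, not the actual VectorGroupByOperator logic: after observing a batch of rows, if nearly every row produced a distinct key, map-side hash aggregation is buying little reduction, so the operator should stop hashing and stream rows through to the shuffle/reduce side.

```java
// Illustrative heuristic (an assumption, not the actual Hive code) for
// abandoning map-side hash aggregation when there are too many distinct keys.
public class HashAggregationGate {
    private final long checkInterval;     // rows between ratio checks
    private final double maxDistinctRatio; // above this, hashing is pointless
    private long rowsSeen;
    private long distinctKeys;
    private boolean abandoned;

    public HashAggregationGate(long checkInterval, double maxDistinctRatio) {
        this.checkInterval = checkInterval;
        this.maxDistinctRatio = maxDistinctRatio;
    }

    // Call once per input row; newKey is true when the row created a new
    // entry in the aggregation hash table.
    public void observe(boolean newKey) {
        rowsSeen++;
        if (newKey) {
            distinctKeys++;
        }
        if (!abandoned && rowsSeen % checkInterval == 0
                && (double) distinctKeys / rowsSeen > maxDistinctRatio) {
            abandoned = true; // become a pass-through; reducers do the work
        }
    }

    public boolean abandoned() {
        return abandoned;
    }
}
```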
[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead
[ https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939024#comment-13939024 ] Hive QA commented on HIVE-6430: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12635047/HIVE-6430.04.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5417 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1867/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1867/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12635047 MapJoin hash table has large memory overhead Key: HIVE-6430 URL: https://issues.apache.org/jira/browse/HIVE-6430 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.patch Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 for row) can take several hundred bytes, which is ridiculous. I am reducing the size of MJKey and MJRowContainer in other jiras, but in general we don't need to have java hash table there. We can either use primitive-friendly hashtable like the one from HPPC (Apache-licenced), or some variation, to map primitive keys to single row storage structure without an object per row (similar to vectorization). 
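The "primitive-friendly hashtable" idea from the HIVE-6430 discussion can be sketched with a toy open-addressing map. This is an illustration of the approach, not the HPPC library and not the eventual Hive implementation: parallel primitive arrays avoid a boxed key object and a container object per row.

```java
// Toy open-addressing hash map from long keys to int values, illustrating
// the primitive-keyed approach discussed in HIVE-6430 (not HPPC and not the
// real Hive code). No resizing; capacity must be a power of two and the
// table must not be filled completely.
public class LongToIntOpenMap {
    private final long[] keys;
    private final int[] values;
    private final boolean[] used;
    private final int mask;

    public LongToIntOpenMap(int capacity) {
        if (Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        keys = new long[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
        mask = capacity - 1;
    }

    public void put(long key, int value) {
        int i = slot(key);
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask; // linear probing on collision
        }
        keys[i] = key;
        values[i] = value;
        used[i] = true;
    }

    // Returns the mapped value, or 'missing' when the key is absent.
    public int get(long key, int missing) {
        int i = slot(key);
        while (used[i]) {
            if (keys[i] == key) {
                return values[i];
            }
            i = (i + 1) & mask;
        }
        return missing;
    }

    private int slot(long key) {
        long h = key * 0x9E3779B97F4A7C15L; // Fibonacci-style bit mixing
        return (int) (h >>> 32) & mask;
    }
}
```

The memory win is the point: four ints per entry live in three flat arrays, versus the "several hundred bytes" per row of object-per-entry storage described in the issue.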
Re: Review Request 18943: Make Vector Group By operator abandon grouping if too many distinct keys
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18943/ --- (Updated March 18, 2014, 10:09 a.m.) Review request for hive, Eric Hanson and Jitendra Pandey. Changes --- .4.patch Bugs: HIVE-6222 https://issues.apache.org/jira/browse/HIVE-6222 Repository: hive-git Description --- See HIVE-6222 Diffs (updated) - ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvg.txt 547a60a ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMax.txt dcc1dfb ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxDecimal.txt 37ce103 ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxString.txt 1f8b28c ql/src/gen/vectorization/UDAFTemplates/VectorUDAFSum.txt cb0be33 ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVar.txt 49b0edd ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt c5af930 ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java c4c85fa ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorAggregationBufferRow.java 7aa4b11 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java 7fb007e ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java a2a7266 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java bd6c24b ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorUtilBatchObjectPool.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorAggregateExpression.java 1836169 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFAvgDecimal.java 5127107 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFCount.java 086f91f ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFCountStar.java 4926f6c ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFSumDecimal.java 0089ad3 Diff: https://reviews.apache.org/r/18943/diff/ Testing --- Manually tested. 
I plan to add test cases in TestVGBy. Thanks, Remus Rusanu
[jira] [Commented] (HIVE-6468) HS2 out of memory error when curl sends a get request
[ https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939085#comment-13939085 ] Navis commented on HIVE-6468: - [~leftylev] "cannot" - "must not" would be better. You can call it, but that will make HiveServer2 die. HS2 out of memory error when curl sends a get request - Key: HIVE-6468 URL: https://issues.apache.org/jira/browse/HIVE-6468 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Environment: Centos 6.3, hive 12, hadoop-2.2 Reporter: Abin Shahab Assignee: Navis Attachments: HIVE-6468.1.patch.txt We see an out of memory error when we run simple beeline calls. (The hive.server2.transport.mode is binary) curl localhost:1 Exception in thread "pool-2-thread-8" java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.2#6252)
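A plausible mechanism behind this OutOfMemoryError (my inference from the stack trace, not traced through Thrift's actual framing code) is that when raw HTTP bytes arrive on the SASL transport, four ASCII characters get interpreted as a big-endian message length, and the server then tries to allocate a buffer of that size.

```java
import java.nio.ByteBuffer;

// Sketch of the suspected failure mode in HIVE-6468 (an assumption about
// Thrift's exact framing, not verified code): reading ASCII HTTP bytes as a
// big-endian 32-bit frame length yields a gigantic allocation request.
public class FrameLengthSketch {

    public static int asFrameLength(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).getInt(); // big-endian by default
    }
}
```

For example, the bytes of `"GET "` (0x47 0x45 0x54 0x20) read as 1195725856, i.e. an attempted ~1.1 GB buffer, which would blow the heap exactly as shown in receiveSaslMessage.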
Indexes without Map/Reduce
Hi, I hope this is the correct list. Apologies if not. I am using Hive 0.11.0 and Hadoop 1.0.4. My goal is to get my Hive queries running without Map/Reduce but using my custom indexes. To this end I have been building Hive version 13 from source and working through the sources to see what I can do.

I can see that the non-M/R path through Hive splits off really early. In SemanticAnalyzer.java, if it determines that a FetchTask is sufficient for the query, then the genMapRedTasks method returns really early and never gets near the code that uses indexes. I have also followed the code through the index code, and I can see that in IndexWhereProcessor.java an index can insert an index query task to run before the main query. (By also calling the queryContext setIndexInputFormat and setIndexIntermediateFile methods, it can redirect the main query to pick up the data generated by the index.)

So I can see two approaches to achieve my goal. 1) I can modify the FetchTask path to support the use of indexes. 2) I can allow the query to start down the Map/Reduce path and then arrange for my index code to discard the original query completely and replace it with a query that will run as a FetchTask that does what I want.

Of course there are pros and cons to both of these approaches. 1) This approach has the advantage that I don't need to change the current index path at all, so it's much less likely that I will damage it. However, I will probably end up replicating some of the existing index code, which is not desirable. Also, I am not sufficiently au fait with the Hive code to feel confident that I would make such a major change in the way that a real Hive developer might. 2) This approach has the advantage that I am building on top of the existing index infrastructure, so I will probably end up writing less code. However, it means that my queries will run once as Map/Reduce and again as FetchTasks, which will make them slower than I would like.
The approach is also more complicated than I would like. And I don't really know how cleanly I can abort the initial query and replace it with a FetchTask (if, indeed, this is possible). Obviously at some point I would like my changes to be submitted back into the main Hive source, so I want to maximize the chances that they will be viewed positively. Does anyone have any opinions or advice to offer? Regards, Peter Marron Senior Developer Trillium Software, A Harte Hanks Company Theale Court, 1st Floor, 11-13 High Street Theale RG7 5AH +44 (0) 118 940 7609 office +44 (0) 118 940 7699 fax
[jira] [Updated] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.
[ https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6668: Attachment: HIVE-6668.3.patch.txt When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins. Key: HIVE-6668 URL: https://issues.apache.org/jira/browse/HIVE-6668 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Yin Huai Assignee: Navis Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6668.1.patch.txt, HIVE-6668.2.patch.txt, HIVE-6668.3.patch.txt I tried the following query today ... {code:sql} set mapred.job.map.memory.mb=2048; set mapred.job.reduce.memory.mb=2048; set mapred.map.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true; set mapred.reduce.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true; set mapred.reduce.tasks=60; set hive.stats.autogather=false; set hive.exec.parallel=false; set hive.enforce.bucketing=true; set hive.enforce.sorting=true; set hive.map.aggr=true; set hive.optimize.bucketmapjoin=true; set hive.optimize.bucketmapjoin.sortedmerge=true; set hive.mapred.reduce.tasks.speculative.execution=false; set hive.auto.convert.join=true; set hive.auto.convert.sortmerge.join=true; set hive.auto.convert.sortmerge.join.noconditionaltask=false; set hive.auto.convert.join.noconditionaltask=false; set hive.auto.convert.join.noconditionaltask.size=1; set hive.optimize.reducededuplication=true; set hive.optimize.reducededuplication.min.reducer=1; set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; set hive.mapjoin.smalltable.filesize=4500; set hive.optimize.index.filter=false; set hive.vectorized.execution.enabled=false; set hive.optimize.correlation=false; select i_item_id, s_state, avg(ss_quantity) agg1, avg(ss_list_price) agg2, avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4 FROM store_sales JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk) JOIN item on 
(store_sales.ss_item_sk = item.i_item_sk) JOIN customer_demographics on (store_sales.ss_cdemo_sk = customer_demographics.cd_demo_sk) JOIN store on (store_sales.ss_store_sk = store.s_store_sk) where cd_gender = 'F' and cd_marital_status = 'U' and cd_education_status = 'Primary' and d_year = 2002 and s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL') group by i_item_id, s_state with rollup order by i_item_id, s_state limit 100; {code} The log shows ... {code} 14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve driver alias (threshold : 4500, length mapping : {store=94175, store_sales=48713909726, item=39798667, customer_demographics=1660831, date_dim=2275902}) Stage-27 is filtered out by condition resolver. 14/03/14 17:05:02 INFO exec.Task: Stage-27 is filtered out by condition resolver. Stage-28 is filtered out by condition resolver. 14/03/14 17:05:02 INFO exec.Task: Stage-28 is filtered out by condition resolver. Stage-3 is selected by condition resolver. {code} Stage-3 is a reduce join. Actually, the resolver should pick the map join -- This message was sent by Atlassian JIRA (v6.2#6252)
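The "Failed to resolve driver alias" line in that log can be reproduced with a simplified model of the resolution step. This is a hypothetical sketch, not the real ConditionalResolverCommonJoin (which is considerably more involved, and per the JIRA is mis-computing this decision): an alias can drive a map join only if all of the other tables together fit under hive.mapjoin.smalltable.filesize.

```java
import java.util.Map;

// Simplified model of common-join conditional resolution (hypothetical; not
// Hive's actual ConditionalResolverCommonJoin). The driver alias is the big
// streamed table; every other table must be hash-loadable, approximated here
// as their combined size fitting under the small-table threshold.
public class CommonJoinResolverSketch {

    public static String resolveDriverAlias(Map<String, Long> aliasToSize, long threshold) {
        long total = 0;
        for (long size : aliasToSize.values()) {
            total += size;
        }
        for (Map.Entry<String, Long> e : aliasToSize.entrySet()) {
            if (total - e.getValue() <= threshold) {
                return e.getKey(); // stream this alias, hash-load the rest
            }
        }
        return null; // no viable map join: fall back to the reduce-side join
    }
}
```

Plugging in the sizes from the log (store=94175, store_sales=48713909726, item=39798667, customer_demographics=1660831, date_dim=2275902) with threshold 4500 yields no viable alias, matching the logged fallback to the Stage-3 reduce join.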
[jira] [Commented] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939421#comment-13939421 ] Nick Dimiduk commented on HIVE-6650: Can someone give me some context for this build error? (cc [~sushanth], [~ashutoshc], [~brocknoland]) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler -- Key: HIVE-6650 URL: https://issues.apache.org/jira/browse/HIVE-6650 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.12.0 Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: HIVE-6650.0.patch, HIVE-6650.1.patch With the above enabled, where clauses including non-rowkey columns cannot be used with the HBaseStorageHandler. Job fails to launch with the following exception. {noformat} java.lang.RuntimeException: Unexpected residual predicate (s_address = '200 WEST 56TH STREET') at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.convertFilter(HiveHBaseTableInputFormat.java:292) at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getSplits(HiveHBaseTableInputFormat.java:495) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:294) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:303) at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:518) at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:510) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at 
org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:425) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1437) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1215) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1043) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Job Submission failed with exception 'java.lang.RuntimeException(Unexpected residual predicate (s_address = '200 WEST 56TH STREET'))' FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask {noformat} I believe this bug was introduced in 
HIVE-2036, see change to OpProcFactory.java that always includes full predicate, even after storage handler negotiates the predicates it can pushdown. Since this behavior is divergent from input formats (they cannot negotiate), there's no harm in the SH ignoring non-indexed predicates -- Hive respects all of them at a layer above anyway. Might as well remove the check/exception. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939447#comment-13939447 ] Ashutosh Chauhan commented on HIVE-6650: It was not because of your patch; trunk was broken in the interim. It's fixed now. Just reupload your patch.
[jira] [Updated] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-6650: --- Attachment: HIVE-6650.2.patch Same as patch v1.
[jira] [Commented] (HIVE-6364) HiveServer2 - Request serving thread should get class loader from existing SessionState
[ https://issues.apache.org/jira/browse/HIVE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939467#comment-13939467 ] Ashutosh Chauhan commented on HIVE-6364: Although HIVE-3969 is marked as a duplicate, I don't think it is one. This issue fixes the problem of having the right class loader on the thread serving the query, whereas HIVE-3969 is about unloading registered jars. So there seem to be two independent problems, both of which need to be fixed. [~jaideepdhok] would you like to rebase your patch? HiveServer2 - Request serving thread should get class loader from existing SessionState --- Key: HIVE-6364 URL: https://issues.apache.org/jira/browse/HIVE-6364 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Jaideep Dhok Attachments: HIVE-6364.1.patch SessionState is created for each session in HS2. If we do any add jar, a class loader is set in the SessionState's conf object. This class loader should also be set in each thread that serves a request of the same session. Scenario (both requests are in the same session): {noformat} // req 1, served by thread th1: updates the class loader and sets it in SessionState.conf add jar foo.jar // req 2, served by thread th2, such that th1 != th2 CREATE TEMPORARY FUNCTION foo_udf AS 'some class in foo.jar' // This can throw a class-not-found error: although the new thread (th2) // gets the same session state as th1, its context class loader // (Thread.currentThread().getContextClassLoader()) is different. {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-3969) Session state for hive server should be cleaned-up
[ https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939478#comment-13939478 ] Ashutosh Chauhan commented on HIVE-3969: Although HIVE-6364 is marked as a duplicate, I don't think it is one. This issue fixes the problem of unloading registered jars, whereas HIVE-6364 is about setting the correct class loader for HS2. So there seem to be two independent problems, both of which need to be fixed. [~navis] Although you raised the bug for HS1, I think the exact same problem also exists on HS2; the fix is in the same area, though, so it doesn't really matter. I think we should use sun.misc.ClassLoaderUtil for now and then switch over to JDK 7 in the near future for a clean solution. Would you like to rebase this patch? Session state for hive server should be cleaned-up -- Key: HIVE-3969 URL: https://issues.apache.org/jira/browse/HIVE-3969 Project: Hive Issue Type: Bug Components: Server Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3969.D8325.1.patch Currently, the add jar command from clients cumulatively adds child ClassLoaders to the worker thread, causing various problems. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6650: --- Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-6650) hive.optimize.index.filter breaks non-index where with HBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6650: --- Status: Patch Available (was: Open)
[jira] [Created] (HIVE-6693) CASE with INT and BIGINT fail
David Gayou created HIVE-6693: - Summary: CASE with INT and BIGINT fail Key: HIVE-6693 URL: https://issues.apache.org/jira/browse/HIVE-6693 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.12.0 Reporter: David Gayou {code} CREATE TABLE testCase (n BIGINT); select case when (n > 3) then n else 0 end from testCase; {code} fails with the error: [Error 10016]: Line 1:36 Argument type mismatch '0': The expression after ELSE should have the same type as those after THEN: bigint is expected but int is found. bigint and int should be more compatible; at the least, int should implicitly cast to bigint. -- This message was sent by Atlassian JIRA (v6.2#6252)
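Until the coercion is fixed, the type mismatch can be avoided by making the ELSE branch explicitly BIGINT so both branches of the CASE agree on a type. A hedged workaround sketch (the comparison operator is illustrative; any predicate on n shows the same behavior):

{code}
-- workaround sketch: CAST the ELSE constant so both CASE branches are BIGINT
select case when (n > 3) then n else cast(0 as bigint) end from testCase;
{code}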
[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name
[ https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6687: -- Description: Getting a value from the result set by fully qualified name throws an exception; the only workaround today is to use the position of the column instead of the column label. {code} String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y"; ResultSet res = stmt.executeQuery(sql); res.getInt("r1.x"); {code} res.getInt("r1.x") throws an unknown-column exception even though the SQL specifies that column. The fix is to correct the result set schema in the semantic analyzer. was: Getting a value from the result set by fully qualified name throws an exception; the only workaround today is to use the position of the column instead of the column label. String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y"; ResultSet res = stmt.executeQuery(sql); res.getInt("r1.x"); res.getInt("r1.x") throws an unknown-column exception even though the SQL specifies that column. The fix is to correct the result set schema in the semantic analyzer. JDBC ResultSet fails to get value by qualified projection name -- Key: HIVE-6687 URL: https://issues.apache.org/jira/browse/HIVE-6687 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.12.1 Getting a value from the result set by fully qualified name throws an exception; the only workaround today is to use the position of the column instead of the column label. {code} String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y"; ResultSet res = stmt.executeQuery(sql); res.getInt("r1.x"); {code} res.getInt("r1.x") throws an unknown-column exception even though the SQL specifies that column. The fix is to correct the result set schema in the semantic analyzer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6625) HiveServer2 running in http mode should support trusted proxy access
[ https://issues.apache.org/jira/browse/HIVE-6625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939542#comment-13939542 ] Harish Butani commented on HIVE-6625: - +1 for 0.13 HiveServer2 running in http mode should support trusted proxy access Key: HIVE-6625 URL: https://issues.apache.org/jira/browse/HIVE-6625 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-6625.1.patch, HIVE-6625.2.patch HIVE-5155 adds trusted proxy access to HiveServer2. This patch is a minor change to have it used when running HiveServer2 in http mode. Patch to be applied on top of HIVE-4764 and HIVE-5155. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6660) HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request
[ https://issues.apache.org/jira/browse/HIVE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939539#comment-13939539 ] Harish Butani commented on HIVE-6660: - +1 for porting to 0.13 branch HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request - Key: HIVE-6660 URL: https://issues.apache.org/jira/browse/HIVE-6660 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Prasad Mujumdar Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6660.1.patch, HIVE-6660.1.patch, hive-site.xml *Beeline connection string:* {code} !connect jdbc:hive2://host:1/;ssl=true;sslTrustStore=/usr/share/doc/hive-0.13.0.2.1.1.0/examples/files/truststore.jks;trustStorePassword=HiveJdbc vgumashta vgumashta org.apache.hive.jdbc.HiveDriver {code} *Error:* {code} pool-7-thread-1, handling exception: java.net.SocketTimeoutException: Read timed out pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) pool-7-thread-1, SEND TLSv1 ALERT: warning, description = close_notify Padded plaintext before ENCRYPTION: len = 32 : 01 00 BE 72 AC 10 3B FA 4E 01 A5 DE 9B 14 16 AF ...r..;.N... 0010: 4E DD 7A 29 AD B4 09 09 09 09 09 09 09 09 09 09 N.z) pool-7-thread-1, WRITE: TLSv1 Alert, length = 32 [Raw write]: length = 37 : 15 03 01 00 20 6C 37 82 A8 52 40 DA FB 83 2D CD l7..R@...-. 0010: 96 9F F0 B7 22 17 E1 04 C1 D1 93 1B C4 39 5A B0 9Z. 
0020: A2 3F 5D 7D 2D .?].- pool-7-thread-1, called closeSocket(selfInitiated) pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) {code} *Subsequent queries fail:* {code} main, WRITE: TLSv1 Application Data, length = 144 main, handling exception: java.net.SocketException: Broken pipe %% Invalidated: [Session-1, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA] main, SEND TLSv1 ALERT: fatal, description = unexpected_message Padded plaintext before ENCRYPTION: len = 32 : 02 0A 52 C3 18 B1 C1 38 DB 3F B6 D1 C5 CA 14 9C ..R8.?.. 0010: A5 38 4C 01 31 69 09 09 09 09 09 09 09 09 09 09 .8L.1i.. main, WRITE: TLSv1 Alert, length = 32 main, Exception sending alert: java.net.SocketException: Broken pipe main, called closeSocket() Error: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe (state=08S01,code=0) java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:226) at org.apache.hive.beeline.Commands.execute(Commands.java:736) at org.apache.hive.beeline.Commands.sql(Commands.java:657) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:796) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe at 
org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161) at org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:471) at org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37) at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65) at org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:219) at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:211) at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:220) ... 11 more Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) at java.net.SocketOutputStream.write(SocketOutputStream.java:153) at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:377) at
[jira] [Commented] (HIVE-6682) nonstaged mapjoin table memory check may be broken
[ https://issues.apache.org/jira/browse/HIVE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939581#comment-13939581 ] Sergey Shelukhin commented on HIVE-6682: https://reviews.apache.org/r/19363/ - although the patch is really small; it's just the q file and result. What do you mean by the question? That is what is done, right? I added a config to make sure it's set, because if it's not, the job is going to fail on any real data. nonstaged mapjoin table memory check may be broken -- Key: HIVE-6682 URL: https://issues.apache.org/jira/browse/HIVE-6682 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6682.patch We are getting the error below from the task, while the staged load works correctly. We don't set the memory threshold this low, so it seems the settings are just not handled correctly; this always seems to trigger on the first check. Given that the map task might hold a bunch more stuff, not just the hashmap, we may also need to adjust the memory check (e.g. have separate configs). 
{noformat} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197 at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:104) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:150) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:165) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030) at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 
8 more Caused by: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:20 Hashtable size: 19 Memory usage: 204001888 percentage: 0.197 at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91) at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:248) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:375) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:346) at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:147) at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:82) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
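The failure above comes out of a percentage-based heap check (MapJoinMemoryExhaustionHandler.checkMemoryStatus). As a rough illustration only, not the actual Hive code, a check of that shape looks like the sketch below; the usage number is taken from the trace above, while the max-heap value and thresholds are assumed:

```java
// Illustrative sketch of a percentage-based memory check; class and method
// names are hypothetical, not the actual Hive implementation.
public class MemoryCheckSketch {
    // Returns true when used/max exceeds the configured threshold.
    static boolean exceedsThreshold(long usedBytes, long maxBytes, double maxPercentage) {
        return ((double) usedBytes / maxBytes) > maxPercentage;
    }

    public static void main(String[] args) {
        // The usage from the trace (204001888 bytes, ~0.197 of an assumed ~1 GB heap)
        // should NOT trip a sane threshold like 0.9:
        System.out.println(exceedsThreshold(204001888L, 1035468800L, 0.9));
        // ...yet the task failed, which suggests the effective threshold was far lower:
        System.out.println(exceedsThreshold(204001888L, 1035468800L, 0.1));
    }
}
```

That a usage fraction of 0.197 aborted the load is consistent with the report's suspicion that the configured threshold is not being picked up correctly for the nonstaged path.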
[jira] [Updated] (HIVE-6643) Add a check for cross products in plans and output a warning
[ https://issues.apache.org/jira/browse/HIVE-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6643: Status: Open (was: Patch Available) Add a check for cross products in plans and output a warning Key: HIVE-6643 URL: https://issues.apache.org/jira/browse/HIVE-6643 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-6643.1.patch, HIVE-6643.2.patch Now that we support old style join syntax, it is easy to write queries that generate a plan with a cross product. For example, say you have A join B join C join D on A.x = B.x and A.y = D.y and C.z = D.z So the JoinTree is: A — B |__ D — C Since we don't reorder join graphs, we will end up with a cross product between (A join B) and C -- This message was sent by Atlassian JIRA (v6.2#6252)
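A warning of the kind proposed here boils down to detecting a join whose two input branches are linked by no join condition. A minimal sketch of that detection, with hypothetical names and simplified alias-pair conditions (this is not the patch itself):

```java
import java.util.*;

// Hypothetical sketch of a cross-product check: a join step is a cross
// product when no condition references an alias from each side.
public class CrossProductCheck {
    static boolean isCrossProduct(Set<String> left, Set<String> right, List<String[]> conds) {
        for (String[] c : conds) {
            boolean links = (left.contains(c[0]) && right.contains(c[1]))
                         || (left.contains(c[1]) && right.contains(c[0]));
            if (links) {
                return false; // this condition ties the two sides together
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // From the example above: conditions A.x=B.x, A.y=D.y, C.z=D.z,
        // checked for the join of (A join B) with C.
        Set<String> left = new HashSet<>(Arrays.asList("A", "B"));
        Set<String> right = Collections.singleton("C");
        List<String[]> conds = Arrays.asList(
                new String[]{"A", "B"},
                new String[]{"A", "D"},
                new String[]{"C", "D"});
        System.out.println(isCrossProduct(left, right, conds)); // no A/B-to-C condition
    }
}
```

In the quoted example no condition connects {A, B} to C, so the (A join B) × C step is flagged, which matches why the unreordered join tree degenerates into a cross product.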
[jira] [Updated] (HIVE-6643) Add a check for cross products in plans and output a warning
[ https://issues.apache.org/jira/browse/HIVE-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6643: Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-6364) HiveServer2 - Request serving thread should get class loader from existing SessionState
[ https://issues.apache.org/jira/browse/HIVE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939592#comment-13939592 ] Jason Dere commented on HIVE-6364: -- Hi Jaideep, when I tried debugging hiveserver2 due to HIVE-6672, it appeared that there was a thread running for each connection (session). Non-SQL commands (such as ADD JAR) were being run within this session thread, and so the classloader for the session thread had the JARs loaded. When a SQL command was executed, the session thread would start a new thread, and it appeared that this new thread was using the same classloader (and had the added JARs in the classloader's list of URLs). Were you seeing different behavior in your testing? (I was running this on Mac, I think with JDK 1.6; not sure if it would have been different.) In the patch, the thread's classloader is getting set to the HiveConf's classloader. Where is the HiveConf's classloader getting set from? Do we need to worry about making sure this classloader is updated whenever a JAR is added to the classpath? HiveServer2 - Request serving thread should get class loader from existing SessionState --- Key: HIVE-6364 URL: https://issues.apache.org/jira/browse/HIVE-6364 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Jaideep Dhok Attachments: HIVE-6364.1.patch SessionState is created for each session in HS2. If we do any add jars, a class loader is set in the SessionState's conf object. This class loader should also be set in each thread that serves requests of the same session. 
Scenario (both requests are in the same session): {noformat} // req 1 add jar foo.jar // Served by thread th1; this updates the class loader and sets it in SessionState.conf // req 2 served by th2, such that th1 != th2 CREATE TEMPORARY FUNCTION foo_udf AS 'some class in foo.jar' // This can throw a class-not-found error, because although // the new thread (th2) gets the same session state as th1, // the class loader (Thread.currentThread().getContextClassLoader()) is different {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
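The scenario above reduces to a classloader-propagation question: a freshly spawned request-serving thread only sees the session's added JARs if its context classloader is set from the session's. A self-contained sketch of that propagation (no Hive classes involved, illustrative only):

```java
// Illustrative sketch: propagating a "session" classloader to a new
// request-serving thread, as the discussion above describes.
public class ClassLoaderPropagation {
    public static void main(String[] args) throws InterruptedException {
        final ClassLoader sessionLoader = ClassLoaderPropagation.class.getClassLoader();
        Thread worker = new Thread(new Runnable() {
            public void run() {
                // Without this call, the worker keeps whatever context loader
                // it inherited, which may lack the session's added JARs:
                Thread.currentThread().setContextClassLoader(sessionLoader);
                System.out.println(
                        Thread.currentThread().getContextClassLoader() == sessionLoader);
            }
        });
        worker.start();
        worker.join();
    }
}
```

In HS2 terms, th2 would perform the equivalent of that setContextClassLoader call using the loader stashed in the session's conf before executing the query.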
[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939608#comment-13939608 ] Jitendra Nath Pandey commented on HIVE-6639: I have committed this to trunk. [~rhbutani] This bug affects hive-0.13 and fails queries having partitioned columns but no filters. This should be fixed in branch-0.13 as well. Vectorization: Partition column names are not picked up. Key: HIVE-6639 URL: https://issues.apache.org/jira/browse/HIVE-6639 Project: Hive Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6639.2.patch, HIVE-6639.3.patch, HIVE-6639.4.patch, HIVE-6639.5.patch, HIVE-6639.5.patch, HIVE-6639.6.patch The vectorized plan generation finds the list of partitioning columns from pruned-partition-list using table scan operator. In some cases the list is coming as null. TPCDS query 27 can reproduce this issue if the store_sales table is partitioned on ss_store_sk. The exception stacktrace is : {code} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getInputColumnIndex(VectorizationContext.java:166) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getColumnVectorExpression(VectorizationContext.java:240) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:287) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:267) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressions(VectorizationContext.java:255) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.init(VectorMapJoinOperator.java:116) ... 42 more {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939613#comment-13939613 ] Harish Butani commented on HIVE-6639: - +1 for 0.13
[jira] [Updated] (HIVE-6641) optimized HashMap keys won't work correctly with decimals
[ https://issues.apache.org/jira/browse/HIVE-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6641: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) in trunk and 0.13 optimized HashMap keys won't work correctly with decimals - Key: HIVE-6641 URL: https://issues.apache.org/jira/browse/HIVE-6641 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.13.0 Attachments: HIVE-6641.patch Decimal values can be equal while having different byte representations (different precision/scale), so comparing bytes is not enough. For a quick fix, we can disable this for decimals. -- This message was sent by Atlassian JIRA (v6.2#6252)
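The underlying pitfall is easy to demonstrate with plain java.math.BigDecimal: two values that compare equal numerically can still carry different scale, so any byte-level encoding of (unscaled value, scale) differs, and byte comparison misjudges them as unequal keys:

```java
import java.math.BigDecimal;

// Two numerically equal decimals with different precision/scale, as
// described in the issue above.
public class DecimalBytes {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");   // unscaled value 10, scale 1
        BigDecimal b = new BigDecimal("1.00");  // unscaled value 100, scale 2
        System.out.println(a.compareTo(b) == 0); // numerically equal
        System.out.println(a.equals(b));         // but representations differ
    }
}
```

This is why the quick fix disables the optimized byte-comparison keys for decimal columns rather than trying to normalize the representation.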
[jira] [Created] (HIVE-6694) Beeline should provide a way to execute shell command as Hive CLI does
Xuefu Zhang created HIVE-6694: - Summary: Beeline should provide a way to execute shell command as Hive CLI does Key: HIVE-6694 URL: https://issues.apache.org/jira/browse/HIVE-6694 Project: Hive Issue Type: Improvement Components: CLI, Clients Affects Versions: 0.12.0, 0.11.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Hive CLI allows a user to execute a shell command using the ! notation, for instance, !cat myfile.txt. Being able to execute a shell command may be important for some users. As a replacement, however, Beeline provides no such capability, possibly because the ! notation is reserved for SQLLine commands. It's possible to provide this using a slight syntactic variation such as !sh cat myfile.txt. -- This message was sent by Atlassian JIRA (v6.2#6252)
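The proposed !sh variation is simple to dispatch: strip the prefix and hand the remainder to a shell. A hypothetical sketch of just the parsing step (Beeline's real command plumbing differs):

```java
// Hypothetical dispatch for the proposed "!sh <command>" syntax; this is
// not Beeline code, just an illustration of the parsing involved.
public class ShellCommandDispatch {
    // Returns the shell command, or null if the line is not a !sh command.
    static String extractShellCommand(String line) {
        String prefix = "!sh ";
        return line.startsWith(prefix) ? line.substring(prefix.length()).trim() : null;
    }

    public static void main(String[] args) {
        // The extracted string could then be executed via e.g. ProcessBuilder.
        System.out.println(extractShellCommand("!sh cat myfile.txt"));
    }
}
```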
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) I have committed this to trunk and branch-0.13.
[jira] [Updated] (HIVE-6639) Vectorization: Partition column names are not picked up.
[ https://issues.apache.org/jira/browse/HIVE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6639: --- Affects Version/s: 0.13.0
[jira] [Updated] (HIVE-6690) NPE in tez session state
[ https://issues.apache.org/jira/browse/HIVE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6690: - Priority: Critical (was: Major) NPE in tez session state Key: HIVE-6690 URL: https://issues.apache.org/jira/browse/HIVE-6690 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Fix For: 0.13.0 Attachments: HIVE-6690.patch If hive.jar.directory isn't set hive will throw NPE in startup with tez: Exception in thread main java.lang.RuntimeException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:344) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:682) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.createHiveExecLocalResource(TezSessionState.java:303) at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:130) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:342) ... 7 more -- This message was sent by Atlassian JIRA (v6.2#6252)
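The NPE means an unset hive.jar.directory flows into createHiveExecLocalResource as null. The fix presumably guards that lookup; a minimal sketch of the shape such a guard takes (names and fallback path are assumed, not the actual patch):

```java
// Illustrative null-guard for an optional directory setting such as
// hive.jar.directory; the method name and fallback are hypothetical.
public class JarDirGuard {
    static String resolveJarDir(String configured, String fallback) {
        // Fall back instead of letting null propagate into resource setup.
        return (configured == null || configured.isEmpty()) ? fallback : configured;
    }

    public static void main(String[] args) {
        // Unset config resolves to the fallback rather than throwing later.
        System.out.println(resolveJarDir(null, "/tmp/hive-jars"));
    }
}
```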
[jira] [Updated] (HIVE-6690) NPE in tez session state
[ https://issues.apache.org/jira/browse/HIVE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6690: - Fix Version/s: 0.13.0
[jira] [Commented] (HIVE-6690) NPE in tez session state
[ https://issues.apache.org/jira/browse/HIVE-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939769#comment-13939769 ] Gunther Hagleitner commented on HIVE-6690: -- +1 LGTM
[jira] [Commented] (HIVE-6331) HIVE-5279 deprecated UDAF class without explanation/documentation/alternative
[ https://issues.apache.org/jira/browse/HIVE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939793#comment-13939793 ] Hive QA commented on HIVE-6331: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12635056/HIVE-6331.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5411 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1868/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1868/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12635056 HIVE-5279 deprecated UDAF class without explanation/documentation/alternative - Key: HIVE-6331 URL: https://issues.apache.org/jira/browse/HIVE-6331 Project: Hive Issue Type: Bug Reporter: Lars Francke Assignee: Lars Francke Priority: Minor Attachments: HIVE-5279.1.patch, HIVE-6331.2.patch HIVE-5279 added a @Deprecated annotation to the {{UDAF}} class. The comment in that class says {quote}UDAF classes are REQUIRED to inherit from this class.{quote} One of these two needs to be updated. Either remove the annotation or document why it was deprecated and what to use instead. Unfortunately [~navis] did not leave any documentation about his intentions. I'm happy to provide a patch once I know the intentions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-4293: Attachment: HIVE-4293.13.patch Predicates following UDTF operator are removed by PPD - Key: HIVE-4293 URL: https://issues.apache.org/jira/browse/HIVE-4293 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Critical Attachments: D9933.6.patch, HIVE-4293.10.patch, HIVE-4293.11.patch.txt, HIVE-4293.12.patch, HIVE-4293.13.patch, HIVE-4293.7.patch.txt, HIVE-4293.8.patch.txt, HIVE-4293.9.patch.txt, HIVE-4293.D9933.1.patch, HIVE-4293.D9933.2.patch, HIVE-4293.D9933.3.patch, HIVE-4293.D9933.4.patch, HIVE-4293.D9933.5.patch For example, {noformat} explain SELECT value from ( select explode(array(key, value)) as (value) from ( select * FROM src WHERE key 200 ) A ) B WHERE value 300 ; {noformat} Makes plan like this, removing last predicates {noformat} TableScan alias: src Filter Operator predicate: expr: (key 200.0) type: boolean Select Operator expressions: expr: array(key,value) type: array<string> outputColumnNames: _col0 UDTF Operator function name: explode Select Operator expressions: expr: col type: string outputColumnNames: _col0 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-4293: Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-4293: Status: Open (was: Patch Available)
[jira] [Commented] (HIVE-4293) Predicates following UDTF operator are removed by PPD
[ https://issues.apache.org/jira/browse/HIVE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939816#comment-13939816 ] Harish Butani commented on HIVE-4293: - ran tests locally, sq_notin_having.q.out has changed.
Review Request 19373: Limit table partitions involved in a table scan
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19373/ --- Review request for hive and Gunther Hagleitner. Bugs: HIVE-6492 https://issues.apache.org/jira/browse/HIVE-6492 Repository: hive-git Description --- Introduce a new configuration parameter to limit the number of table partitions involved in a table scan. It applies to select * queries and any queries that need to issue MR jobs. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java edc3d38 ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java ecd4c5d ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ecce21e ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java 7f2bb60 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 73603ab ql/src/test/queries/clientnegative/limit_partition.q PRE-CREATION ql/src/test/queries/clientnegative/limit_partition_stats.q PRE-CREATION ql/src/test/queries/clientpositive/limit_partition_metadataonly.q PRE-CREATION ql/src/test/results/clientnegative/limit_partition.q.out PRE-CREATION ql/src/test/results/clientnegative/limit_partition_stats.q.out PRE-CREATION ql/src/test/results/clientpositive/limit_partition_metadataonly.q.out PRE-CREATION Diff: https://reviews.apache.org/r/19373/diff/ Testing --- 3 tests are added. Thanks, Selina Zhang
[jira] [Updated] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [CLONE of HCATALOG-621]
[ https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6695: --- Description: This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. {noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... // prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} was: This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. 
{noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... // prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} bin/hcat should include hbase jar and dependencies in the classpath [CLONE of HCATALOG-621 -- Key: HIVE-6695 URL: https://issues.apache.org/jira/browse/HIVE-6695 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Nick Dimiduk This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. {noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... // prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [CLONE of HCATALOG-621
Sushanth Sowmyan created HIVE-6695: -- Summary: bin/hcat should include hbase jar and dependencies in the classpath [CLONE of HCATALOG-621 Key: HIVE-6695 URL: https://issues.apache.org/jira/browse/HIVE-6695 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Nick Dimiduk This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. {noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... // prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6613) Control when specific Inputs / Outputs are started
[ https://issues.apache.org/jira/browse/HIVE-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939919#comment-13939919 ] Gunther Hagleitner commented on HIVE-6613: -- +1 LGTM Control when specific Inputs / Outputs are started - Key: HIVE-6613 URL: https://issues.apache.org/jira/browse/HIVE-6613 Project: Hive Issue Type: Improvement Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-6613.2.txt, HIVE-6613.3.patch, TEZ-6613.1.txt When running with Tez, a couple of enhancements are possible: 1) Avoid re-fetching data in case of MapJoins - since the data is likely to be cached after the first run (container re-use for the same query) 2) Start Outputs only after required Inputs are ready - specifically useful in case of Reduce, where shuffle requires a large amount of memory and the Output (if it's a sorted output) also requires a fair amount of memory. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]
[ https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6695: --- Summary: bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621] (was: bin/hcat should include hbase jar and dependencies in the classpath [CLONE of HCATALOG-621) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621] Key: HIVE-6695 URL: https://issues.apache.org/jira/browse/HIVE-6695 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Nick Dimiduk This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. {noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... // prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]
[ https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6695: --- Attachment: HIVE-6695.patch (Attaching addendum patch that was uploaded to HCATALOG-621) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621] Key: HIVE-6695 URL: https://issues.apache.org/jira/browse/HIVE-6695 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Nick Dimiduk Attachments: HIVE-6695.patch This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. {noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... // prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6468) HS2 out of memory error when curl sends a get request
[ https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939925#comment-13939925 ] Lefty Leverenz commented on HIVE-6468: -- Thanks, I put it in a warning box with this wording: In remote mode HiveServer2 only accepts valid Thrift calls – do not attempt to call it via http or telnet (HIVE-6468). Should it also explain that HS2 will die, or is that just until this jira's patch gets added? Readers can click the link to this jira if they want to know the reason for the warning, but we could make it explicit if you think that's better. By the way, *hive.server2.sasl.message.limit* needs some user doc. It can go in a HiveConf.java comment for now, or in a release note, until we know when HIVE-6037 will get committed. Quick ref: * [new warning in Beeline – New Command Line Shell |https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Beeline–NewCommandLineShell] * [page history: new changes |https://cwiki.apache.org/confluence/pages/diffpages.action?pageId=30758725originalId=40505296] HS2 out of memory error when curl sends a get request - Key: HIVE-6468 URL: https://issues.apache.org/jira/browse/HIVE-6468 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Environment: Centos 6.3, hive 12, hadoop-2.2 Reporter: Abin Shahab Assignee: Navis Attachments: HIVE-6468.1.patch.txt We see an out of memory error when we run simple beeline calls. 
(The hive.server2.transport.mode is binary) curl localhost:1 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]
[ https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939932#comment-13939932 ] Sushanth Sowmyan commented on HIVE-6695: [~ndimiduk], I created this hive jira since I was not able to respond on HCATALOG-621, since that seems like it's been locked down. +1 to the change, I'll go ahead and commit it. I've experimented with both versions of the find command, and both work for me (with and without quotes, and in fact, I'm more used to the backslash notation). I'm using findutils-4.4.2-6.el6.x86_64. The main difference though, was that our offending jar is libthrift\*jar, not thrift\*jar. bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621] Key: HIVE-6695 URL: https://issues.apache.org/jira/browse/HIVE-6695 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Nick Dimiduk Attachments: HIVE-6695.patch This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. {noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... 
// prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6696) Implement DBMD.getIndexInfo()
Jonathan Seidman created HIVE-6696: -- Summary: Implement DBMD.getIndexInfo() Key: HIVE-6696 URL: https://issues.apache.org/jira/browse/HIVE-6696 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.12.0 Reporter: Jonathan Seidman Priority: Minor HiveDatabaseMetaData.getIndexInfo() currently throws a not supported exception. There seems to be no technical obstacle to implementing this to return index info for tables with indexes defined, and probably an empty ResultSet for tables with no indexes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name
[ https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-6687: - Attachment: HIVE-6687.patch JDBC ResultSet fails to get value by qualified projection name -- Key: HIVE-6687 URL: https://issues.apache.org/jira/browse/HIVE-6687 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.12.1 Attachments: HIVE-6687.patch Getting a value from the result set using a fully qualified column name throws an exception. The only workaround today is to use the column's position rather than its label. {code} String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y"; ResultSet res = stmt.executeQuery(sql); res.getInt("r1.x"); {code} res.getInt("r1.x") throws an "unknown column" exception even though the SQL specifies that column. The fix is to correct the result set schema in the semantic analyzer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6687) JDBC ResultSet fails to get value by qualified projection name
[ https://issues.apache.org/jira/browse/HIVE-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-6687: - Status: Patch Available (was: Open) JDBC ResultSet fails to get value by qualified projection name -- Key: HIVE-6687 URL: https://issues.apache.org/jira/browse/HIVE-6687 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.12.1 Attachments: HIVE-6687.patch Getting a value from the result set using a fully qualified column name throws an exception. The only workaround today is to use the column's position rather than its label. {code} String sql = "select r1.x, r2.x from r1 join r2 on r1.y=r2.y"; ResultSet res = stmt.executeQuery(sql); res.getInt("r1.x"); {code} res.getInt("r1.x") throws an "unknown column" exception even though the SQL specifies that column. The fix is to correct the result set schema in the semantic analyzer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]
[ https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan resolved HIVE-6695. Resolution: Fixed Committed. Thanks, Nick! bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621] Key: HIVE-6695 URL: https://issues.apache.org/jira/browse/HIVE-6695 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Nick Dimiduk Attachments: HIVE-6695.patch This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. {noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... // prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6645) to_date()/to_unix_timestamp() fail with NPE if input is null
[ https://issues.apache.org/jira/browse/HIVE-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939981#comment-13939981 ] Ashutosh Chauhan commented on HIVE-6645: +1 to_date()/to_unix_timestamp() fail with NPE if input is null Key: HIVE-6645 URL: https://issues.apache.org/jira/browse/HIVE-6645 Project: Hive Issue Type: Bug Components: UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6645.1.patch, HIVE-6645.2.patch, HIVE-6645.2.patch {noformat} hive> describe tab2; Query ID = jdere_20140312185454_e3ed213e-8b3a-4963-b815-19965edad587 OK c1 timestamp None Time taken: 0.155 seconds, Fetched: 1 row(s) hive> select * from tab2; Query ID = jdere_20140312185454_8a009070-df79-45de-8642-e85668a378d7 OK NULL NULL NULL NULL NULL Time taken: 0.067 seconds, Fetched: 5 row(s) hive> select to_unix_timestamp(c1) from tab2; hive> select to_date(c1) from tab2; {noformat} Fails with errors like: {noformat} java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {c1:null} at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:401) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {c1:null} at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:233) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:680) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {c1:null} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177) ... 10 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating to_date(c1) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524) ... 11 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.udf.generic.GenericUDFDate.evaluate(GenericUDFDate.java:106) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
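The fix pattern for this class of NPE is to make the UDF's evaluate() short-circuit on a null argument instead of dereferencing it. A minimal, hypothetical sketch of that pattern (the real GenericUDFDate operates on DeferredObject/ObjectInspector values, not plain Strings, and the class and method names below are illustrative only):

```java
// Hypothetical stand-in for the null-handling fix, not the actual Hive code.
public class NullSafeToDate {
    public static String evaluate(String timestampValue) {
        if (timestampValue == null) {
            return null;  // NULL in, NULL out -- previously this path hit an NPE
        }
        // a timestamp like "2014-03-12 18:54:00" reduces to its date part
        return timestampValue.substring(0, 10);
    }
}
```

With this guard in place, `select to_date(c1)` over all-NULL rows returns NULL for each row rather than failing the task.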
[jira] [Commented] (HIVE-6695) bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621]
[ https://issues.apache.org/jira/browse/HIVE-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939986#comment-13939986 ] Nick Dimiduk commented on HIVE-6695: Thanks [~sushanth]! bin/hcat should include hbase jar and dependencies in the classpath [followup/clone of HCATALOG-621] Key: HIVE-6695 URL: https://issues.apache.org/jira/browse/HIVE-6695 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Nick Dimiduk Attachments: HIVE-6695.patch This is to address the addendum of HCATALOG-621, now that the HCatalog jira seems to be in read-only mode. To quote Nick from the original bug: I'm not sure how this fixes anything for the error listed above. The find command in the script we merged is broken, at least on linux. Maybe it worked with BSD find and we both tested on Macs? From the patch we committed: {noformat} if [ -d ${HBASE_HOME} ] ; then for jar in $(find $HBASE_HOME -name *.jar -not -name thrift\*.jar); do HBASE_CLASSPATH=$HBASE_CLASSPATH:${jar} done export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:${HBASE_CLASSPATH} fi {noformat} The find command syntax is wrong – it returns no jars ever. {noformat} $ find /usr/lib/hbase -name *.jar $ find /usr/lib/hbase -name *.jar -not -name thrift\*.jar $ {noformat} What we need is more like: {noformat} $ find /usr/lib/hbase -name '*.jar' ... // prints lots of jars $ find /usr/lib/hbase -name '*.jar' | grep thrift /usr/lib/hbase/lib/libthrift-0.9.0.jar $ find /usr/lib/hbase -name '*.jar' -not -name '*thrift*' | grep thrift $ {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6660) HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request
[ https://issues.apache.org/jira/browse/HIVE-6660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-6660: -- Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to trunk and 0.13 release branch. Thanks Thejas and Vaibhav for review HiveServer2 running in non-http mode closes server socket for an SSL connection after the 1st request - Key: HIVE-6660 URL: https://issues.apache.org/jira/browse/HIVE-6660 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Prasad Mujumdar Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6660.1.patch, HIVE-6660.1.patch, hive-site.xml *Beeline connection string:* {code} !connect jdbc:hive2://host:1/;ssl=true;sslTrustStore=/usr/share/doc/hive-0.13.0.2.1.1.0/examples/files/truststore.jks;trustStorePassword=HiveJdbc vgumashta vgumashta org.apache.hive.jdbc.HiveDriver {code} *Error:* {code} pool-7-thread-1, handling exception: java.net.SocketTimeoutException: Read timed out pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) pool-7-thread-1, SEND TLSv1 ALERT: warning, description = close_notify Padded plaintext before ENCRYPTION: len = 32 : 01 00 BE 72 AC 10 3B FA 4E 01 A5 DE 9B 14 16 AF ...r..;.N... 0010: 4E DD 7A 29 AD B4 09 09 09 09 09 09 09 09 09 09 N.z) pool-7-thread-1, WRITE: TLSv1 Alert, length = 32 [Raw write]: length = 37 : 15 03 01 00 20 6C 37 82 A8 52 40 DA FB 83 2D CD l7..R@...-. 0010: 96 9F F0 B7 22 17 E1 04 C1 D1 93 1B C4 39 5A B0 9Z. 
0020: A2 3F 5D 7D 2D .?].- pool-7-thread-1, called closeSocket(selfInitiated) pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) pool-7-thread-1, called close() pool-7-thread-1, called closeInternal(true) {code} *Subsequent queries fail:* {code} main, WRITE: TLSv1 Application Data, length = 144 main, handling exception: java.net.SocketException: Broken pipe %% Invalidated: [Session-1, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA] main, SEND TLSv1 ALERT: fatal, description = unexpected_message Padded plaintext before ENCRYPTION: len = 32 : 02 0A 52 C3 18 B1 C1 38 DB 3F B6 D1 C5 CA 14 9C ..R8.?.. 0010: A5 38 4C 01 31 69 09 09 09 09 09 09 09 09 09 09 .8L.1i.. main, WRITE: TLSv1 Alert, length = 32 main, Exception sending alert: java.net.SocketException: Broken pipe main, called closeSocket() Error: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe (state=08S01,code=0) java.sql.SQLException: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:226) at org.apache.hive.beeline.Commands.execute(Commands.java:736) at org.apache.hive.beeline.Commands.sql(Commands.java:657) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:796) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:659) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:368) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:351) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe at 
org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161) at org.apache.thrift.transport.TSaslTransport.flush(TSaslTransport.java:471) at org.apache.thrift.transport.TSaslClientTransport.flush(TSaslClientTransport.java:37) at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65) at org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:219) at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:211) at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:220) ... 11 more Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109) at java.net.SocketOutputStream.write(SocketOutputStream.java:153) at
[jira] [Commented] (HIVE-3969) Session state for hive server should be cleaned-up
[ https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940009#comment-13940009 ] Navis commented on HIVE-3969: - Ok, I'll take a look. I wish HIVE-3907 would be considered, too. Session state for hive server should be cleaned-up -- Key: HIVE-3969 URL: https://issues.apache.org/jira/browse/HIVE-3969 Project: Hive Issue Type: Bug Components: Server Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3969.D8325.1.patch Currently add jar command by clients are adding child ClassLoader to worker thread cumulatively, causing various problems. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6364) HiveServer2 - Request serving thread should get class loader from existing SessionState
[ https://issues.apache.org/jira/browse/HIVE-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940018#comment-13940018 ] Jaideep Dhok commented on HIVE-6364: [~ashutoshc] I will put up a new patch. [~jdere] Add jar will always update the class loader. That's the current behaviour. I think the first class loader is set using the conf.getClassLoader method; if nothing is set, it will return the default class loader. HiveServer2 - Request serving thread should get class loader from existing SessionState --- Key: HIVE-6364 URL: https://issues.apache.org/jira/browse/HIVE-6364 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Jaideep Dhok Attachments: HIVE-6364.1.patch SessionState is created for each session in HS2. If we do any add jars, a class loader is set in the SessionState's conf object. This class loader should also be set in each thread that serves requests of the same session. Scenario (both requests are in the same session): {noformat} // req 1 add jar foo.jar // Served by thread th1, this updates class loader and sets in SessionState.conf // req2 served by th2, such that th1 != th2 CREATE TEMPORARY FUNCTION foo_udf AS 'some class in foo.jar' // This can throw class not found error, because although // the new thread (th2) gets the same session state as th1, // the class loader is different (Thread.currentThread().getContextClassLoader()) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
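The propagation the issue asks for can be sketched in plain Java (names here are illustrative, not the actual SessionState API): whichever loader the session last installed must become the serving thread's context class loader before the request runs.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative sketch only: "sessionLoader" stands in for the class loader
// stored in SessionState's conf after an "add jar".
public class SessionLoaderDemo {
    static volatile ClassLoader sessionLoader;

    public static boolean serveOnFreshThread() {
        // th1 ran "add jar foo.jar": a new child loader was installed
        sessionLoader = new URLClassLoader(new URL[0],
                SessionLoaderDemo.class.getClassLoader());

        final boolean[] sawSessionLoader = new boolean[1];
        Thread th2 = new Thread(() -> {
            // without this line th2 keeps its default loader, and
            // "CREATE TEMPORARY FUNCTION" cannot see foo.jar's classes
            Thread.currentThread().setContextClassLoader(sessionLoader);
            sawSessionLoader[0] =
                    Thread.currentThread().getContextClassLoader() == sessionLoader;
        });
        th2.start();
        try {
            th2.join();
        } catch (InterruptedException e) {
            return false;
        }
        return sawSessionLoader[0];
    }
}
```

The key design point is that the loader travels with the session, not the thread: any thread pool worker picked to serve a request must adopt the session's loader first.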
[jira] [Created] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego
Dilli Arumugam created HIVE-6697: Summary: HiveServer2 secure thrift/http authentication needs to support SPNego Key: HIVE-6697 URL: https://issues.apache.org/jira/browse/HIVE-6697 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Dilli Arumugam Looking to integrate Apache Knox with HiveServer2 secure thrift/http, we found that thrift/http uses some form of Kerberos authentication that is not SPNego. Considering it goes over the HTTP protocol, we expected it to use the SPNego protocol. Apache Knox is already integrated with WebHDFS, WebHCat, Oozie and HBase Stargate using SPNego for authentication. Requesting that HiveServer2 secure thrift/http authentication support SPNego. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-6697) HiveServer2 secure thrift/http authentication needs to support SPNego
[ https://issues.apache.org/jira/browse/HIVE-6697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta reassigned HIVE-6697: -- Assignee: Vaibhav Gumashta HiveServer2 secure thrift/http authentication needs to support SPNego -- Key: HIVE-6697 URL: https://issues.apache.org/jira/browse/HIVE-6697 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Dilli Arumugam Assignee: Vaibhav Gumashta Looking to integrate Apache Knox with HiveServer2 secure thrift/http, we found that thrift/http uses some form of Kerberos authentication that is not SPNego. Considering it goes over the HTTP protocol, we expected it to use the SPNego protocol. Apache Knox is already integrated with WebHDFS, WebHCat, Oozie and HBase Stargate using SPNego for authentication. Requesting that HiveServer2 secure thrift/http authentication support SPNego. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6676) hcat cli fails to run when running with hive on tez
[ https://issues.apache.org/jira/browse/HIVE-6676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-6676: Fix Version/s: 0.13.0 hcat cli fails to run when running with hive on tez --- Key: HIVE-6676 URL: https://issues.apache.org/jira/browse/HIVE-6676 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Blocker Fix For: 0.13.0 Attachments: HIVE-6676.patch HIVE_CLASSPATH should be added to HADOOP_CLASSPATH before launching hcat CLI -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 19322: HIVE-6685 Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19322/ --- (Updated March 19, 2014, 1:25 a.m.) Review request for hive and Xuefu Zhang. Changes --- Refactored the arg-passing from manual list iteration to a simple extension of GNUParser, mostly borrowing the code from HiveCLI. Extending the GNUParser is needed because it doesn't support unknown arguments; in Beeline's case, these are the 'property-files' and the reflectively-set BeelineOpts like 'autoCommit', etc. Added a unit test to verify the parsing doesn't break anything. Bugs: HIVE-6685 https://issues.apache.org/jira/browse/HIVE-6685 Repository: hive-git Description --- Improving the error-handling in ArrayIndexOutOfBoundsException of Beeline. Diffs (updated) - beeline/src/java/org/apache/hive/beeline/BeeLine.java 3482186 beeline/src/test/org/apache/hive/beeline/TestBeelineArgParsing.java PRE-CREATION Diff: https://reviews.apache.org/r/19322/diff/ Testing --- Manual test. Now, in this scenario it will display the usage like: beeline -u Usage: java org.apache.hive.cli.beeline.BeeLine -u database url the JDBC URL to connect to -n username the username to connect as -p password the password to connect as ... Thanks, Szehon Ho
[jira] [Updated] (HIVE-6685) Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments
[ https://issues.apache.org/jira/browse/HIVE-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6685: Attachment: HIVE-6685.2.patch Thanks for the review and suggestion. I refactored Beeline to use the GNU Parser; it is a much cleaner solution. Beeline throws ArrayIndexOutOfBoundsException for mismatched arguments -- Key: HIVE-6685 URL: https://issues.apache.org/jira/browse/HIVE-6685 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.12.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-6685.2.patch, HIVE-6685.patch Noticed that there is an ugly ArrayIndexOutOfBoundsException for mismatched arguments in the beeline prompt. It would be nice to clean it up. Example: {noformat} beeline -u szehon -p Exception in thread main java.lang.ArrayIndexOutOfBoundsException: 3 at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:560) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:628) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:366) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:349) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 19387: Session state for hive server should be cleaned-up
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19387/ --- Review request for hive and Ashutosh Chauhan. Bugs: HIVE-3969 https://issues.apache.org/jira/browse/HIVE-3969 Repository: hive-git Description --- Currently, 'add jar' commands issued by clients add child ClassLoaders to the worker thread cumulatively, causing various problems. Diffs - common/src/java/org/apache/hadoop/hive/common/JavaUtils.java 0dba331 ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java a249d74 ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/LineageCtx.java 8d6ebaa ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 5546d03 ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/TableFunctionEvaluator.java 32e78ac ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPNumeric.java 2ada2ff service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java ace791a service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java ef5b5c6 Diff: https://reviews.apache.org/r/19387/diff/ Testing --- Thanks, Navis Ryu
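The leak and its cleanup can be sketched in a few lines (hypothetical names; the real patch touches SessionState and JavaUtils, not these classes). Each 'add jar' wraps the thread's context ClassLoader in a new child URLClassLoader, so closing a session should restore the original loader and release the children it created:

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: track child loaders created by "add jar" so they can
// be released when the session closes, instead of accumulating on the
// worker thread forever.
class SessionLoaders {
    private final ClassLoader original = Thread.currentThread().getContextClassLoader();
    private final Deque<URLClassLoader> children = new ArrayDeque<>();

    void addJar(URL jar) {
        URLClassLoader child =
            new URLClassLoader(new URL[]{jar}, Thread.currentThread().getContextClassLoader());
        children.push(child);
        Thread.currentThread().setContextClassLoader(child);
    }

    void close() throws IOException {
        // Restore the pre-session loader and release every child we created.
        Thread.currentThread().setContextClassLoader(original);
        while (!children.isEmpty()) {
            children.pop().close(); // URLClassLoader.close() requires Java 7+
        }
    }
}
```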
[jira] [Commented] (HIVE-3969) Session state for hive server should be cleaned-up
[ https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940052#comment-13940052 ] Navis commented on HIVE-3969: - [~ashutoshc] I just fixed the HiveServer2 case. Let's deprecate the old HiveServer. Session state for hive server should be cleaned-up -- Key: HIVE-3969 URL: https://issues.apache.org/jira/browse/HIVE-3969 Project: Hive Issue Type: Bug Components: Server Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3969.1.patch.txt, HIVE-3969.D8325.1.patch Currently, 'add jar' commands issued by clients add child ClassLoaders to the worker thread cumulatively, causing various problems. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-3969) Session state for hive server should be cleaned-up
[ https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3969: Attachment: HIVE-3969.1.patch.txt Session state for hive server should be cleaned-up -- Key: HIVE-3969 URL: https://issues.apache.org/jira/browse/HIVE-3969 Project: Hive Issue Type: Bug Components: Server Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3969.1.patch.txt, HIVE-3969.D8325.1.patch Currently, 'add jar' commands issued by clients add child ClassLoaders to the worker thread cumulatively, causing various problems. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6677) HBaseSerDe needs to be refactored
[ https://issues.apache.org/jira/browse/HIVE-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6677: -- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks to Prasad for the review. HBaseSerDe needs to be refactored - Key: HIVE-6677 URL: https://issues.apache.org/jira/browse/HIVE-6677 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0, 0.11.0, 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 0.14.0 Attachments: HIVE-6677.1.patch, HIVE-6677.2.patch, HIVE-6677.3.patch, HIVE-6677.patch The code in HBaseSerDe seems very complex and hard to extend to support new features such as a generic compound key (HIVE-6411) and a compound key filter (HIVE-6290), especially when handling key/field serialization. Hopefully this task will clean up the code a bit and make it ready for new extensions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead
[ https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6430: --- Attachment: HIVE-6430.05.patch rebase, incorporate not enabling for decimal MapJoin hash table has large memory overhead Key: HIVE-6430 URL: https://issues.apache.org/jira/browse/HIVE-6430 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, HIVE-6430.patch Right now, in some queries, I see that storing e.g. 4 ints (2 for the key and 2 for the row) can take several hundred bytes, which is ridiculous. I am reducing the size of MJKey and MJRowContainer in other jiras, but in general we don't need a Java hash table there. We can either use a primitive-friendly hashtable like the one from HPPC (Apache-licensed), or some variation, to map primitive keys to a single row storage structure without an object per row (similar to vectorization). -- This message was sent by Atlassian JIRA (v6.2#6252)
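As a rough illustration of the direction described above (not the actual MapJoin code, whose real counterpart is BytesBytesMultiHashMap in the patch), an open-addressing table keyed by primitive longs stores each entry in flat arrays with no per-entry object, avoiding the HashMap.Entry plus boxed-key overhead that inflates a few ints into hundreds of bytes:

```java
// Minimal open-addressing hash map from primitive long keys to long values.
// One flat slot per entry (two longs and a flag) instead of a HashMap.Entry
// object plus two boxed Longs per row.
class LongLongMap {
    private final long[] keys;
    private final long[] values;
    private final boolean[] used;

    LongLongMap(int capacity) {
        keys = new long[capacity];
        values = new long[capacity];
        used = new boolean[capacity];
    }

    void put(long key, long value) {
        int i = slot(key);
        keys[i] = key;
        values[i] = value;
        used[i] = true;
    }

    long get(long key, long missing) {
        int i = slot(key);
        return used[i] && keys[i] == key ? values[i] : missing;
    }

    // Linear probing; this toy version assumes the table is never full.
    private int slot(long key) {
        int i = (int) ((key ^ (key >>> 32)) & 0x7fffffff) % keys.length;
        while (used[i] && keys[i] != key) {
            i = (i + 1) % keys.length;
        }
        return i;
    }
}
```

A real implementation (like HPPC's, or the one in the patch) adds resizing and variable-length row storage, but the memory argument is the same: the per-entry cost is a fixed number of primitive slots rather than several heap objects.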
Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/ --- (Updated March 19, 2014, 2:40 a.m.) Review request for hive, Gopal V and Gunther Hagleitner. Repository: hive-git Description --- See JIRA Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b0f5c49 hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 2cd65cb hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 704fcb9 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 7dbb8be ql/src/java/org/apache/hadoop/hive/ql/debug/Utils.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java 170e8c0 ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 3ea9c96 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java 8854b19 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java 9df425b ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java 64f0be2 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinPersistableTableContainer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java 008a8db ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 988959f ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java 55b7415 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 79af08d ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java eef7656 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java 0fd4983 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java 118b339 
ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestBytesBytesMultiHashMap.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java 65e3779 ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java 755d783 ql/src/test/queries/clientpositive/mapjoin_decimal.q b65a7be ql/src/test/queries/clientpositive/mapjoin_mapjoin.q 1eb95f6 ql/src/test/results/clientpositive/mapjoin_mapjoin.q.out d79b984 ql/src/test/results/clientpositive/tez/mapjoin_decimal.q.out 3c55b5c ql/src/test/results/clientpositive/tez/mapjoin_mapjoin.q.out 284cc03 serde/src/java/org/apache/hadoop/hive/serde2/ByteStream.java 73d9b29 serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/columnar/ColumnarSerDe.java 5870884 serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java bab505e serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 6f344bb serde/src/java/org/apache/hadoop/hive/serde2/io/DateWritable.java 1f4ccdd serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java a99c7b4 serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java 435d6c6 serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 82c1263 serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java b188c3f serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryUtils.java 6c14081 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 06d5c5e serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java 868dd4c serde/src/test/org/apache/hadoop/hive/serde2/thrift_test/CreateSequenceFile.java 1fb49e5 Diff: https://reviews.apache.org/r/18936/diff/ Testing --- Thanks, Sergey Shelukhin
Re: Review Request 18936: HIVE-6430 MapJoin hash table has large memory overhead
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/#review37681 --- ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java https://reviews.apache.org/r/18936/#comment69316 should have been changed, will do - Sergey Shelukhin On March 19, 2014, 2:40 a.m., Sergey Shelukhin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18936/ --- (Updated March 19, 2014, 2:40 a.m.) Review request for hive, Gopal V and Gunther Hagleitner. Repository: hive-git Description --- See JIRA Diff: https://reviews.apache.org/r/18936/diff/ Testing --- Thanks, Sergey Shelukhin
[jira] [Commented] (HIVE-5998) Add vectorized reader for Parquet files
[ https://issues.apache.org/jira/browse/HIVE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940130#comment-13940130 ] Hive QA commented on HIVE-5998: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12635061/HIVE-5998.10.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5412 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1869/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1869/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12635061 Add vectorized reader for Parquet files --- Key: HIVE-5998 URL: https://issues.apache.org/jira/browse/HIVE-5998 Project: Hive Issue Type: Sub-task Components: Serializers/Deserializers, Vectorization Reporter: Remus Rusanu Assignee: Remus Rusanu Priority: Minor Labels: Parquet, vectorization Attachments: HIVE-5998.1.patch, HIVE-5998.10.patch, HIVE-5998.2.patch, HIVE-5998.3.patch, HIVE-5998.4.patch, HIVE-5998.5.patch, HIVE-5998.6.patch, HIVE-5998.7.patch, HIVE-5998.8.patch, HIVE-5998.9.patch HIVE-5783 is adding native Parquet support in Hive. As Parquet is a columnar format, it makes sense to provide a vectorized reader, similar to how RC and ORC formats have, to benefit from vectorized execution engine. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6579) HiveLockObjectData constructor makes too many queryStr instance causing oom
[ https://issues.apache.org/jira/browse/HIVE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xieyuchen updated HIVE-6579: Attachment: HIVE-6579.02.patch HiveLockObjectData constructor makes too many queryStr instance causing oom --- Key: HIVE-6579 URL: https://issues.apache.org/jira/browse/HIVE-6579 Project: Hive Issue Type: Improvement Reporter: xieyuchen Attachments: HIVE-6579.02.patch, HIVE-6579.1.patch.txt We have a huge SQL query which full outer joins 10+ partitioned tables, each with at least 1k partitions. The query is 300KB long (it is constructed automatically, of course). When we run this query, over 10k HiveLockObjectData instances are created. Because the HiveLockObjectData constructor trims the queryStr, there will be 10k individual String instances, each 300KB long! The Hive client then gets an OOM exception. The fix trims the queryStr in Driver.compile instead of in the HiveLockObjectData constructor, to reduce memory waste. -- This message was sent by Atlassian JIRA (v6.2#6252)
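The memory problem and fix reduce to where the trim happens (illustrative sketch, not the actual Hive classes): trim once before constructing the lock objects so they all share a single String, rather than letting each constructor produce its own trimmed copy of a ~300KB query:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for HiveLockObjectData.
class LockData {
    final String queryStr;
    LockData(String queryStr) {
        // Fixed version: store the reference as given. The problematic
        // version called queryStr.trim() here, creating a fresh copy of a
        // ~300KB string for every one of 10k+ instances.
        this.queryStr = queryStr;
    }
}

class LockDemo {
    static List<LockData> buildLocks(String rawQuery, int n) {
        String trimmed = rawQuery.trim(); // trim once, e.g. in Driver.compile
        List<LockData> locks = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            locks.add(new LockData(trimmed)); // all instances share one String
        }
        return locks;
    }
}
```

With 10k locks and a 300KB query, the per-instance trim costs roughly 3GB of char data, while the shared version keeps a single copy.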
[jira] [Updated] (HIVE-6579) HiveLockObjectData constructor makes too many queryStr instance causing oom
[ https://issues.apache.org/jira/browse/HIVE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xieyuchen updated HIVE-6579: Status: Open (was: Patch Available) HiveLockObjectData constructor makes too many queryStr instance causing oom --- Key: HIVE-6579 URL: https://issues.apache.org/jira/browse/HIVE-6579 Project: Hive Issue Type: Improvement Reporter: xieyuchen Attachments: HIVE-6579.02.patch, HIVE-6579.1.patch.txt We have a huge SQL query which full outer joins 10+ partitioned tables, each with at least 1k partitions. The query is 300KB long (it is constructed automatically, of course). When we run this query, over 10k HiveLockObjectData instances are created. Because the HiveLockObjectData constructor trims the queryStr, there will be 10k individual String instances, each 300KB long! The Hive client then gets an OOM exception. The fix trims the queryStr in Driver.compile instead of in the HiveLockObjectData constructor, to reduce memory waste. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6579) HiveLockObjectData constructor makes too many queryStr instance causing oom
[ https://issues.apache.org/jira/browse/HIVE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xieyuchen updated HIVE-6579: Status: Patch Available (was: Open) HiveLockObjectData constructor makes too many queryStr instance causing oom --- Key: HIVE-6579 URL: https://issues.apache.org/jira/browse/HIVE-6579 Project: Hive Issue Type: Improvement Reporter: xieyuchen Attachments: HIVE-6579.02.patch, HIVE-6579.1.patch.txt We have a huge SQL query which full outer joins 10+ partitioned tables, each with at least 1k partitions. The query is 300KB long (it is constructed automatically, of course). When we run this query, over 10k HiveLockObjectData instances are created. Because the HiveLockObjectData constructor trims the queryStr, there will be 10k individual String instances, each 300KB long! The Hive client then gets an OOM exception. The fix trims the queryStr in Driver.compile instead of in the HiveLockObjectData constructor, to reduce memory waste. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-3969) Session state for hive server should be cleaned-up
[ https://issues.apache.org/jira/browse/HIVE-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940208#comment-13940208 ] Navis commented on HIVE-3969: - Didn't verify fd leakage; just verified that newly created loaders are released when the session is closed. Session state for hive server should be cleaned-up -- Key: HIVE-3969 URL: https://issues.apache.org/jira/browse/HIVE-3969 Project: Hive Issue Type: Bug Components: Server Infrastructure Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3969.1.patch.txt, HIVE-3969.D8325.1.patch Currently, 'add jar' commands issued by clients add child ClassLoaders to the worker thread cumulatively, causing various problems. -- This message was sent by Atlassian JIRA (v6.2#6252)