[jira] [Commented] (HIVE-15778) DROP INDEX (non-existent) throws NPE when using DbNotificationListener

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848090#comment-15848090
 ] 

Hive QA commented on HIVE-15778:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850369/HIVE-15778.v0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11017 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple]
 (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3297/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3297/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3297/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850369 - PreCommit-HIVE-Build

> DROP INDEX (non-existent) throws NPE when using DbNotificationListener 
> ---
>
> Key: HIVE-15778
> URL: https://issues.apache.org/jira/browse/HIVE-15778
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Vamsee Yarlagadda
>Assignee: Vamsee Yarlagadda
> Attachments: HIVE-15778.v0.patch
>
>
> Trying to execute a DROP INDEX operation on a non-existent index throws an NPE.
> {code}
> 0: jdbc:hive2://nightly-unsecure-1.gce.cloude> DROP INDEX IF EXISTS vamsee1 
> ON sample_07;
> INFO  : Compiling 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4): 
> DROP INDEX IF EXISTS vamsee1 ON sample_07
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4); 
> Time taken: 0.238 seconds
> INFO  : Executing 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4): 
> DROP INDEX IF EXISTS vamsee1 ON sample_07
> INFO  : Starting task [Stage-0:DDL] in serial mode
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException
> INFO  : Completed executing 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4); 
> Time taken: 0.061 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException 
> (state=08S01,code=1)
> {code}
> HMS log:
> {code}
> 2017-01-31 16:27:29,421 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-3]: 
> MetaException(message:java.lang.NullPointerException)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5823)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.rethrowException(HiveMetaStore.java:4892)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_index_by_name(HiveMetaStore.java:4403)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy16.drop_index_by_name(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_index_by_name.getResult(ThriftHiveMetastore.java:10803)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_index_by_name.getResult(ThriftHiveMetastore.java:10787)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> 

[jira] [Commented] (HIVE-15672) LLAP text cache: improve first query perf II

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848065#comment-15848065
 ] 

Hive QA commented on HIVE-15672:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850366/HIVE-15672.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11017 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3296/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3296/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3296/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850366 - PreCommit-HIVE-Build

> LLAP text cache: improve first query perf II
> 
>
> Key: HIVE-15672
> URL: https://issues.apache.org/jira/browse/HIVE-15672
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15672.01.patch, HIVE-15672.02.patch, 
> HIVE-15672.03.patch, HIVE-15672.04.patch
>
>
> 4) Send VRB to the pipeline and write ORC in parallel (in background).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848040#comment-15848040
 ] 

Hive QA commented on HIVE-15649:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850365/HIVE-15649.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11017 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple]
 (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3295/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3295/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3295/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850365 - PreCommit-HIVE-Build

> LLAP IO may NPE on all-column read
> --
>
> Key: HIVE-15649
> URL: https://issues.apache.org/jira/browse/HIVE-15649
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15649.01.patch, HIVE-15649.02.patch, 
> HIVE-15649.03.patch, HIVE-15649.04.patch, HIVE-15649.patch
>
>
> It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP 
> IO doesn't account for that.





[jira] [Updated] (HIVE-15509) Add back the script + transform tests to minitez

2017-01-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15509:
-
Attachment: HIVE-15509.1.patch

Rebased patch. [~sseth] Can you please review this patch?

> Add back the script + transform tests to minitez
> 
>
> Key: HIVE-15509
> URL: https://issues.apache.org/jira/browse/HIVE-15509
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15509.1.patch, HIVE-15509.1.patch
>
>
> Script operator cannot run in minillap and so was removed from the minillap 
> test suite. But tez supports script + transform. Add the removed tests back 
> to minitez test suite. 





[jira] [Updated] (HIVE-15763) Subquery in both LHS and RHS of IN/NOT IN throws misleading error

2017-01-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15763:
---
Attachment: HIVE-15763.1.patch

> Subquery in both LHS and RHS of IN/NOT IN throws misleading error
> -
>
> Key: HIVE-15763
> URL: https://issues.apache.org/jira/browse/HIVE-15763
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15763.1.patch
>
>
> The following query throws an error:
> {code}select * from part where (select max(p_size) from part) IN (select 
> p_size from part);{code}
> Error
> {noformat}
> SemanticException [Error 10249]: Line 1:79 Unsupported SubQuery Expression 
> 'p_size': Only 1 SubQuery expression is supported.
> {noformat}
> Such queries should either be supported, or be detected and rejected with an 
> appropriate error message.





[jira] [Updated] (HIVE-15763) Subquery in both LHS and RHS of IN/NOT IN throws misleading error

2017-01-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15763:
---
Status: Patch Available  (was: Open)

> Subquery in both LHS and RHS of IN/NOT IN throws misleading error
> -
>
> Key: HIVE-15763
> URL: https://issues.apache.org/jira/browse/HIVE-15763
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
> Attachments: HIVE-15763.1.patch
>
>
> The following query throws an error:
> {code}select * from part where (select max(p_size) from part) IN (select 
> p_size from part);{code}
> Error
> {noformat}
> SemanticException [Error 10249]: Line 1:79 Unsupported SubQuery Expression 
> 'p_size': Only 1 SubQuery expression is supported.
> {noformat}
> Such queries should either be supported, or be detected and rejected with an 
> appropriate error message.





[jira] [Updated] (HIVE-15778) DROP INDEX (non-existent) throws NPE when using DbNotificationListener

2017-01-31 Thread Vamsee Yarlagadda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsee Yarlagadda updated HIVE-15778:
-
Attachment: HIVE-15778.v0.patch

> DROP INDEX (non-existent) throws NPE when using DbNotificationListener 
> ---
>
> Key: HIVE-15778
> URL: https://issues.apache.org/jira/browse/HIVE-15778
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Vamsee Yarlagadda
>Assignee: Vamsee Yarlagadda
> Attachments: HIVE-15778.v0.patch
>
>
> Trying to execute a DROP INDEX operation on a non-existent index throws an NPE.
> {code}
> 0: jdbc:hive2://nightly-unsecure-1.gce.cloude> DROP INDEX IF EXISTS vamsee1 
> ON sample_07;
> INFO  : Compiling 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4): 
> DROP INDEX IF EXISTS vamsee1 ON sample_07
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4); 
> Time taken: 0.238 seconds
> INFO  : Executing 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4): 
> DROP INDEX IF EXISTS vamsee1 ON sample_07
> INFO  : Starting task [Stage-0:DDL] in serial mode
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException
> INFO  : Completed executing 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4); 
> Time taken: 0.061 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException 
> (state=08S01,code=1)
> {code}
> HMS log:
> {code}
> 2017-01-31 16:27:29,421 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-3]: 
> MetaException(message:java.lang.NullPointerException)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5823)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.rethrowException(HiveMetaStore.java:4892)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_index_by_name(HiveMetaStore.java:4403)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy16.drop_index_by_name(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_index_by_name.getResult(ThriftHiveMetastore.java:10803)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_index_by_name.getResult(ThriftHiveMetastore.java:10787)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hive.hcatalog.messaging.json.JSONDropIndexMessage.<init>(JSONDropIndexMessage.java:46)
>   at 
> org.apache.hive.hcatalog.messaging.json.JSONMessageFactory.buildDropIndexMessage(JSONMessageFactory.java:159)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onDropIndex(DbNotificationListener.java:280)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_index_by_name_core(HiveMetaStore.java:4469)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_index_by_name(HiveMetaStore.java:4396)
>   ... 20 more
> 

[jira] [Updated] (HIVE-15778) DROP INDEX (non-existent) throws NPE when using DbNotificationListener

2017-01-31 Thread Vamsee Yarlagadda (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamsee Yarlagadda updated HIVE-15778:
-
Assignee: Vamsee Yarlagadda
  Status: Patch Available  (was: Open)

> DROP INDEX (non-existent) throws NPE when using DbNotificationListener 
> ---
>
> Key: HIVE-15778
> URL: https://issues.apache.org/jira/browse/HIVE-15778
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Vamsee Yarlagadda
>Assignee: Vamsee Yarlagadda
>
> Trying to execute a DROP INDEX operation on a non-existent index throws an NPE.
> {code}
> 0: jdbc:hive2://nightly-unsecure-1.gce.cloude> DROP INDEX IF EXISTS vamsee1 
> ON sample_07;
> INFO  : Compiling 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4): 
> DROP INDEX IF EXISTS vamsee1 ON sample_07
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4); 
> Time taken: 0.238 seconds
> INFO  : Executing 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4): 
> DROP INDEX IF EXISTS vamsee1 ON sample_07
> INFO  : Starting task [Stage-0:DDL] in serial mode
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException
> INFO  : Completed executing 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4); 
> Time taken: 0.061 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException 
> (state=08S01,code=1)
> {code}
> HMS log:
> {code}
> 2017-01-31 16:27:29,421 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-3]: 
> MetaException(message:java.lang.NullPointerException)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5823)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.rethrowException(HiveMetaStore.java:4892)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_index_by_name(HiveMetaStore.java:4403)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy16.drop_index_by_name(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_index_by_name.getResult(ThriftHiveMetastore.java:10803)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_index_by_name.getResult(ThriftHiveMetastore.java:10787)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hive.hcatalog.messaging.json.JSONDropIndexMessage.<init>(JSONDropIndexMessage.java:46)
>   at 
> org.apache.hive.hcatalog.messaging.json.JSONMessageFactory.buildDropIndexMessage(JSONMessageFactory.java:159)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onDropIndex(DbNotificationListener.java:280)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_index_by_name_core(HiveMetaStore.java:4469)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_index_by_name(HiveMetaStore.java:4396)
>   ... 20 more
> {code}
> 

[jira] [Commented] (HIVE-15777) propagate LLAP app ID to ATS and log it

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848007#comment-15848007
 ] 

Hive QA commented on HIVE-15777:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850363/HIVE-15777.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 10453 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=17)

[list_bucket_dml_1.q,ppd_join3.q,auto_join23.q,list_bucket_dml_11.q,join10.q,avro_type_evolution.q,create_struct_table.q,skewjoin_mapjoin9.q,hook_context_cs.q,subquery_unqualcolumnrefs.q,exim_22_import_exist_authsuccess.q,groupby1.q,cbo_rp_udf_udaf.q,udf_regexp_replace.q,vector_decimal_aggregate.q,authorization_grant_public_role.q,create_skewed_table1.q,partition_wise_fileformat.q,sort_merge_join_desc_5.q,union_ppr.q,spark_combine_equivalent_work.q,stats_partial_size.q,partition_date2.q,join32.q,list_bucket_dml_14.q,input34.q,insert_values_acid_not_bucketed.q,udf_parse_url.q,schema_evol_text_nonvec_part.q,ctas_char.q]
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=134)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=137)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
 (batchId=138)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver
 (batchId=155)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testEscapedStrings (batchId=217)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd 
(batchId=217)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testNonAsciiStrings (batchId=217)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3294/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3294/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3294/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed

[jira] [Commented] (HIVE-15778) DROP INDEX (non-existent) throws NPE when using DbNotificationListener

2017-01-31 Thread Vamsee Yarlagadda (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848003#comment-15848003
 ] 

Vamsee Yarlagadda commented on HIVE-15778:
--

Here is the [code block in 
HiveMetaStore.java|https://github.com/apache/hive/blob/4becd689d59ee3f75a36119fbb950c44e16c65df/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4617-L4620]
 which is calling the listeners.
{code}
for (MetaStoreEventListener listener : listeners) {
  DropIndexEvent dropIndexEvent = new DropIndexEvent(index, success, this);
  listener.onDropIndex(dropIndexEvent);
}
{code} 

Special-casing DbNotificationListener so that it gets skipped would not be 
enough: any listener added in the future could raise the same NPE. So I think 
that if the *index* variable is null in the finally block, we could skip the 
entire processing of MetaStoreEventListeners?

Suggestion:
{code}
if (index != null) {
  for (MetaStoreEventListener listener : listeners) {
    DropIndexEvent dropIndexEvent = new DropIndexEvent(index, success, this);
    listener.onDropIndex(dropIndexEvent);
  }
}
{code} 
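
The effect of that guard can be sketched in isolation. This is only an illustration under stated assumptions: {{Index}}, {{DropIndexEvent}} and {{Listener}} below are hypothetical stand-ins rather than the real Hive classes, and {{notifyDropIndex}} is a made-up helper name. It also assumes, as the stack trace suggests, that a listener dereferences the index while building its message.

```java
import java.util.ArrayList;
import java.util.List;

class DropIndexGuardSketch {
    /** Hypothetical stand-in for the metastore Index object. */
    static final class Index {
        final String name;
        Index(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for DropIndexEvent. */
    static final class DropIndexEvent {
        final Index index;
        final boolean success;
        DropIndexEvent(Index index, boolean success) {
            this.index = index;
            this.success = success;
        }
    }

    /** Hypothetical stand-in for MetaStoreEventListener. */
    interface Listener {
        void onDropIndex(DropIndexEvent event);
    }

    /**
     * Notifies listeners only when an index was actually looked up.
     * Returns the number of listeners that were notified.
     */
    static int notifyDropIndex(Index index, boolean success, List<Listener> listeners) {
        if (index == null) {
            // Nothing was found to drop, so there is no event to publish.
            return 0;
        }
        int notified = 0;
        for (Listener listener : listeners) {
            listener.onDropIndex(new DropIndexEvent(index, success));
            notified++;
        }
        return notified;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        List<Listener> listeners = new ArrayList<>();
        // This listener dereferences the index, so a null index would NPE here.
        listeners.add(event -> seen.add(event.index.name));

        // DROP INDEX IF EXISTS on a missing index: listeners are skipped.
        System.out.println("notified=" + notifyDropIndex(null, false, listeners));

        // Dropping an index that exists: listeners are called as before.
        System.out.println("notified=" + notifyDropIndex(new Index("vamsee1"), true, listeners)
            + " seen=" + seen);
    }
}
```

With the guard outside the loop, no listener ever sees a null index, so the check does not have to be repeated in each listener implementation.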

> DROP INDEX (non-existent) throws NPE when using DbNotificationListener 
> ---
>
> Key: HIVE-15778
> URL: https://issues.apache.org/jira/browse/HIVE-15778
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Vamsee Yarlagadda
>
> Trying to execute a DROP INDEX operation on a non-existent index throws an NPE.
> {code}
> 0: jdbc:hive2://nightly-unsecure-1.gce.cloude> DROP INDEX IF EXISTS vamsee1 
> ON sample_07;
> INFO  : Compiling 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4): 
> DROP INDEX IF EXISTS vamsee1 ON sample_07
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4); 
> Time taken: 0.238 seconds
> INFO  : Executing 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4): 
> DROP INDEX IF EXISTS vamsee1 ON sample_07
> INFO  : Starting task [Stage-0:DDL] in serial mode
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException
> INFO  : Completed executing 
> command(queryId=hive_20170131162727_663a0909-2a82-44f9-a800-f4a35abaeaa4); 
> Time taken: 0.061 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException 
> (state=08S01,code=1)
> {code}
> HMS log:
> {code}
> 2017-01-31 16:27:29,421 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-3]: 
> MetaException(message:java.lang.NullPointerException)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5823)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.rethrowException(HiveMetaStore.java:4892)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_index_by_name(HiveMetaStore.java:4403)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
>   at com.sun.proxy.$Proxy16.drop_index_by_name(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_index_by_name.getResult(ThriftHiveMetastore.java:10803)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_index_by_name.getResult(ThriftHiveMetastore.java:10787)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1796)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> 

[jira] [Updated] (HIVE-15763) Subquery in both LHS and RHS of IN/NOT IN throws misleading error

2017-01-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15763:
---
Summary: Subquery in both LHS and RHS of IN/NOT IN throws misleading error  
(was: Subquery in both lhs and rsh of IN/NOT IN throws misleading error)

> Subquery in both LHS and RHS of IN/NOT IN throws misleading error
> -
>
> Key: HIVE-15763
> URL: https://issues.apache.org/jira/browse/HIVE-15763
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>  Labels: sub-query
>
> The following query throws an error
> {code}select * from part where (select max(p_size) from part) IN (select 
> p_size from part);{code}
> Error
> {noformat}
> SemanticException [Error 10249]: Line 1:79 Unsupported SubQuery Expression 
> 'p_size': Only 1 SubQuery expression is supported.
> {noformat}
> Such queries should either be supported or should be detected and an 
> appropriate error message should be thrown.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15736) Add unit tests to Utilities.getInputSummary() method for multi-threading cases

2017-01-31 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847941#comment-15847941
 ] 

Chaoyu Tang commented on HIVE-15736:


LGTM, +1

> Add unit tests to Utilities.getInputSummary() method for multi-threading cases
> --
>
> Key: HIVE-15736
> URL: https://issues.apache.org/jira/browse/HIVE-15736
> Project: Hive
>  Issue Type: Test
>  Components: Query Planning
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-15736.1.patch, HIVE-15736.2.patch, 
> HIVE-15736.3.patch
>
>
> The {{Utilities.getInputSummary}} method can be configured to use multiple 
> threads to get the content summary of tables and partitions. The 
> configuration variable, {{mapred.dfsclient.parallelism.max}}, is disabled by 
> default, and there are no tests that validate the multi-threaded path.
> This JIRA adds tests that exercise the method with multiple threads and fixes 
> any issues found.
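
As an illustration only (this is not the content of the attached patches), the kind of check the description asks for can be sketched as comparing a serial summary with a multi-threaded one. The `summarize` helper, the paths, and the sizes below are hypothetical; the only detail taken from the description is a thread-count setting that is disabled by default.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for per-path content summaries (path -> size in bytes).
SIZES = {"/t/p=1": 10, "/t/p=2": 20, "/t/p=3": 30, "/t/p=4": 40}


def summarize(paths, max_threads=0):
    """Return the total size of the given paths.

    max_threads <= 0 models the default (disabled) setting and runs serially;
    a positive value fans the lookups out over a thread pool.
    """
    if max_threads <= 0:
        return sum(SIZES[p] for p in paths)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return sum(pool.map(SIZES.__getitem__, paths))


paths = sorted(SIZES)
serial_total = summarize(paths)
parallel_total = summarize(paths, max_threads=3)
```

A test of the multi-threaded path would then assert that `parallel_total` matches `serial_total` for the same inputs.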





[jira] [Commented] (HIVE-14420) Fix orc_llap_counters.q test failure in master

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847939#comment-15847939
 ] 

Hive QA commented on HIVE-14420:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850352/HIVE-14420.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10990 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=1)

[udf_upper.q,ctas_date.q,schema_evol_orc_acidvec_table_update.q,groupby_grouping_sets3.q,vector_decimal_5.q,bucket_map_join_spark4.q,timestamp_2.q,date_join1.q,constprog_type.q,timestamp_ints_casts.q,udf_negative.q,orc_merge_diff_fs.q,udf_substring_index.q,newline.q,diff_part_input_formats.q,auto_join_without_localtask.q,join46.q,ctas_uses_table_location.q,tez_bmj_schema_evolution.q,bucketmapjoin4.q,udf_context_aware.q,groupby2_noskew.q,authorization_non_id.q,sample_islocalmode_hook_hadoop20.q,auto_sortmerge_join_3.q,mapjoin_test_outer.q,vectorization_9.q,input15.q,groupby6_noskew.q,udf_PI.q]
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple]
 (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3293/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3293/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3293/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850352 - PreCommit-HIVE-Build

> Fix orc_llap_counters.q test failure in master
> --
>
> Key: HIVE-14420
> URL: https://issues.apache.org/jira/browse/HIVE-14420
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14420.1.patch
>
>






[jira] [Updated] (HIVE-15672) LLAP text cache: improve first query perf II

2017-01-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15672:

Attachment: HIVE-15672.04.patch

A probable fix for the inconsistent results that are sometimes returned when 
eviction happens, found by [~gopalv]

> LLAP text cache: improve first query perf II
> 
>
> Key: HIVE-15672
> URL: https://issues.apache.org/jira/browse/HIVE-15672
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15672.01.patch, HIVE-15672.02.patch, 
> HIVE-15672.03.patch, HIVE-15672.04.patch
>
>
> 4) Send VRB to the pipeline and write ORC in parallel (in background).





[jira] [Commented] (HIVE-15517) NOT (x <=> y) returns NULL if x or y is NULL

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847908#comment-15847908
 ] 

Hive QA commented on HIVE-15517:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850345/HIVE-15517.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10993 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=13)

[avro_joins.q,serde_reported_schema.q,annotate_stats_join_pkfk.q,udf_unix_timestamp.q,union22.q,describe_comment_nonascii.q,orc_analyze.q,schema_evol_orc_acidvec_part_update.q,partition_date.q,stats15.q,tez_join_result_complex.q,input36.q,alter_numbuckets_partitioned_table2_h23.q,transform_ppr1.q,spark_vectorized_dynamic_partition_pruning.q,unionDistinct_2.q,udaf_histogram_numeric.q,authorization_index.q,auto_join26.q,list_bucket_dml_3.q,alter_table_partition_drop.q,cbo_stats.q,vector_count.q,decimal_trailing.q,parquet_types_vectorization.q,smb_mapjoin_22.q,vector_decimal_6.q,autoColumnStats_8.q,input5.q,sample1.q]
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3292/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3292/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3292/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850345 - PreCommit-HIVE-Build

> NOT (x <=> y) returns NULL if x or y is NULL
> 
>
> Key: HIVE-15517
> URL: https://issues.apache.org/jira/browse/HIVE-15517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor, SQL
>Affects Versions: 1.2.1
>Reporter: Alexey Bedrintsev
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15517.01.patch
>
>
> I created a table as follows:
> create table test(x string, y string);
> insert into test values ('q', 'q'), ('q', 'w'), (NULL, 'q'), ('q', NULL), 
> (NULL, NULL);
> Then I try to compare values taking NULLs into account:
> select *, x<=>y, not (x<=> y), (x <=> y) = false from test;
> OK
> q     q     true   false  false
> q     w     false  true   true
> q     NULL  false  NULL   true
> NULL  q     false  NULL   true
> NULL  NULL  true   NULL   false
> I expected the 4th column to be the same as the 5th, but I actually got NULL 
> as the result of the "not false" and "not true" expressions.
> Hive 1.2.1000.2.5.0.0-1245
> Subversion 
> git://c66-slave-20176e25-3/grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive
>  -r da6c690d384d1666f5a5f450be5cbc54e2fe4bd6
> Compiled by jenkins on Fri Aug 26 01:39:52 UTC 2016
> From source with checksum c30648316a632f7a753f4359e5c8f4d6
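
The expectation in the report can be checked with a small model of the two operators. This is a sketch of the intended semantics under the usual SQL three-valued logic, not Hive's implementation; `null_safe_eq` and `sql_not` are illustrative names, and Python's `None` stands in for NULL.

```python
from typing import Optional


def null_safe_eq(x: Optional[str], y: Optional[str]) -> bool:
    """Model of `x <=> y`: the result is never NULL, and NULL <=> NULL is true."""
    if x is None or y is None:
        return x is None and y is None
    return x == y


def sql_not(b: Optional[bool]) -> Optional[bool]:
    """Three-valued NOT: NOT NULL is NULL, otherwise ordinary negation."""
    return None if b is None else (not b)


# Same five rows as the report, in the order of the query output.
rows = [("q", "q"), ("q", "w"), ("q", None), (None, "q"), (None, None)]
not_eq = [sql_not(null_safe_eq(x, y)) for x, y in rows]
eq_is_false = [null_safe_eq(x, y) is False for x, y in rows]
```

Because `null_safe_eq` never yields `None`, `not_eq` contains no NULLs and agrees row by row with `eq_is_false`, which is the behavior the reporter expected from the 4th column.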





[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15649:

Status: Patch Available  (was: Reopened)

> LLAP IO may NPE on all-column read
> --
>
> Key: HIVE-15649
> URL: https://issues.apache.org/jira/browse/HIVE-15649
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15649.01.patch, HIVE-15649.02.patch, 
> HIVE-15649.03.patch, HIVE-15649.04.patch, HIVE-15649.patch
>
>
> It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP 
> IO doesn't account for that.





[jira] [Commented] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-31 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847893#comment-15847893
 ] 

Prasanth Jayachandran commented on HIVE-15649:
--

+1

> LLAP IO may NPE on all-column read
> --
>
> Key: HIVE-15649
> URL: https://issues.apache.org/jira/browse/HIVE-15649
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15649.01.patch, HIVE-15649.02.patch, 
> HIVE-15649.03.patch, HIVE-15649.04.patch, HIVE-15649.patch
>
>
> It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP 
> IO doesn't account for that.





[jira] [Comment Edited] (HIVE-15680) Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847891#comment-15847891
 ] 

Sergey Shelukhin edited comment on HIVE-15680 at 2/1/17 2:26 AM:
-

Submitted an addendum patch .04 there. You can try your patch with that patch 
to see if that makes tests pass here. Not sure why it triggers that path 
though... hopefully it doesn't somehow break projection. Although in this case 
it's a text table.


was (Author: sershe):
Submitted a patch there. You can try your patch with that patch to see if that 
makes tests pass here. Not sure why it triggers that path though... hopefully 
it doesn't somehow break projection. Although in this case it's a text table.

> Incorrect results when hive.optimize.index.filter=true and same ORC table is 
> referenced twice in query
> --
>
> Key: HIVE-15680
> URL: https://issues.apache.org/jira/browse/HIVE-15680
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.2.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-15680.1.patch, HIVE-15680.2.patch, 
> HIVE-15680.3.patch, HIVE-15680.4.patch, HIVE-15680.5.patch, HIVE-15680.6.patch
>
>
> To repro:
> {noformat}
> set hive.optimize.index.filter=true;
> create table test_table(number int) stored as ORC;
> -- Two insertions will create two files, with one stripe each
> insert into table test_table VALUES (1);
> insert into table test_table VALUES (2);
> -- This should and does return 2 records
> select * from test_table;
> -- These should and do each return 1 record
> select * from test_table where number = 1;
> select * from test_table where number = 2;
> -- This should return 2 records but only returns 1 record
> select * from test_table where number = 1
> union all
> select * from test_table where number = 2;
> {noformat}
> What's happening is only the last predicate is being pushed down.
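
One way to see why pushing down only the last predicate yields a single record is the toy model below. It is a sketch under assumptions, not Hive's code path: the single shared predicate slot per table, and the `scan` and `union_all` helpers, are hypothetical. Each insert is modeled as one file, and a file is skipped when the pushed-down value falls outside its min/max range.

```python
# Two inserts created two files; each file is modeled as a list of values.
FILES = [[1], [2]]


def scan(pushed_value, filter_value):
    """Skip files whose min/max range excludes pushed_value, then filter rows."""
    kept = [f for f in FILES if min(f) <= pushed_value <= max(f)]
    return [v for f in kept for v in f if v == filter_value]


def union_all(predicates, shared_slot):
    """Run one scan per predicate and concatenate the results.

    With shared_slot=True every scan reads its pushed-down predicate from one
    per-table slot (hypothetical), so the last predicate written wins.
    """
    pushed = [predicates[-1]] * len(predicates) if shared_slot else list(predicates)
    out = []
    for pushed_value, filter_value in zip(pushed, predicates):
        out.extend(scan(pushed_value, filter_value))
    return out


buggy = union_all([1, 2], shared_slot=True)
fixed = union_all([1, 2], shared_slot=False)
```

In this model the first branch skips the only file containing `1` because it was handed the predicate for `2`, so `buggy` holds one record while `fixed` holds both.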





[jira] [Comment Edited] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847890#comment-15847890
 ] 

Sergey Shelukhin edited comment on HIVE-15649 at 2/1/17 2:26 AM:
-

An addendum patch that should take care of the problem at the root (.04).
[~prasanth_j] can you take a look?


was (Author: sershe):
An addendum patch that should take care of the problem at the root.
[~prasanth_j] can you take a look?

> LLAP IO may NPE on all-column read
> --
>
> Key: HIVE-15649
> URL: https://issues.apache.org/jira/browse/HIVE-15649
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15649.01.patch, HIVE-15649.02.patch, 
> HIVE-15649.03.patch, HIVE-15649.04.patch, HIVE-15649.patch
>
>
> It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP 
> IO doesn't account for that.





[jira] [Commented] (HIVE-15680) Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847891#comment-15847891
 ] 

Sergey Shelukhin commented on HIVE-15680:
-

Submitted a patch there. You can try your patch with that patch to see if that 
makes tests pass here. Not sure why it triggers that path though... hopefully 
it doesn't somehow break projection. Although in this case it's a text table.

> Incorrect results when hive.optimize.index.filter=true and same ORC table is 
> referenced twice in query
> --
>
> Key: HIVE-15680
> URL: https://issues.apache.org/jira/browse/HIVE-15680
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.2.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-15680.1.patch, HIVE-15680.2.patch, 
> HIVE-15680.3.patch, HIVE-15680.4.patch, HIVE-15680.5.patch, HIVE-15680.6.patch
>
>
> To repro:
> {noformat}
> set hive.optimize.index.filter=true;
> create table test_table(number int) stored as ORC;
> -- Two insertions will create two files, with one stripe each
> insert into table test_table VALUES (1);
> insert into table test_table VALUES (2);
> -- This should and does return 2 records
> select * from test_table;
> -- These should and do each return 1 record
> select * from test_table where number = 1;
> select * from test_table where number = 2;
> -- This should return 2 records but only returns 1 record
> select * from test_table where number = 1
> union all
> select * from test_table where number = 2;
> {noformat}
> What's happening is only the last predicate is being pushed down.





[jira] [Updated] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15649:

Attachment: HIVE-15649.04.patch

An addendum patch that should take care of the problem at the root.
[~prasanth_j] can you take a look?

> LLAP IO may NPE on all-column read
> --
>
> Key: HIVE-15649
> URL: https://issues.apache.org/jira/browse/HIVE-15649
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15649.01.patch, HIVE-15649.02.patch, 
> HIVE-15649.03.patch, HIVE-15649.04.patch, HIVE-15649.patch
>
>
> It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP 
> IO doesn't account for that.





[jira] [Reopened] (HIVE-15649) LLAP IO may NPE on all-column read

2017-01-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reopened HIVE-15649:
-

Still happens because columnIds can still be null.

> LLAP IO may NPE on all-column read
> --
>
> Key: HIVE-15649
> URL: https://issues.apache.org/jira/browse/HIVE-15649
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15649.01.patch, HIVE-15649.02.patch, 
> HIVE-15649.03.patch, HIVE-15649.patch
>
>
> It seems like very few paths use READ_ALL_COLUMNS config, but some do. LLAP 
> IO doesn't account for that.





[jira] [Updated] (HIVE-15777) propagate LLAP app ID to ATS and log it

2017-01-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15777:

Attachment: HIVE-15777.patch

A patch; I still need to test it on the cluster.
[~jdere] can you take a look?

> propagate LLAP app ID to ATS and log it 
> 
>
> Key: HIVE-15777
> URL: https://issues.apache.org/jira/browse/HIVE-15777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15777.patch
>
>






[jira] [Updated] (HIVE-15777) propagate LLAP app ID to ATS and log it

2017-01-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15777:

Status: Patch Available  (was: Open)

> propagate LLAP app ID to ATS and log it 
> 
>
> Key: HIVE-15777
> URL: https://issues.apache.org/jira/browse/HIVE-15777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15777.patch
>
>






[jira] [Commented] (HIVE-15772) set the exception into SparkJobStatus if exception happened in RemoteSparkJobMonitor and LocalSparkJobMonitor

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847881#comment-15847881
 ] 

Hive QA commented on HIVE-15772:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850341/HIVE-15772.000.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11017 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple]
 (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3291/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3291/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3291/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850341 - PreCommit-HIVE-Build

> set the exception into SparkJobStatus if exception happened in 
> RemoteSparkJobMonitor and LocalSparkJobMonitor
> -
>
> Key: HIVE-15772
> URL: https://issues.apache.org/jira/browse/HIVE-15772
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: HIVE-15772.000.patch
>
>
> Set the exception into SparkJobStatus if an exception happens in 
> RemoteSparkJobMonitor or LocalSparkJobMonitor. Add a setError function to 
> SparkJobStatus so that SparkTask can retrieve the exception.





[jira] [Assigned] (HIVE-15777) propagate LLAP app ID to ATS and log it

2017-01-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-15777:
---


> propagate LLAP app ID to ATS and log it 
> 
>
> Key: HIVE-15777
> URL: https://issues.apache.org/jira/browse/HIVE-15777
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>






[jira] [Commented] (HIVE-15680) Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847877#comment-15847877
 ] 

Sergey Shelukhin commented on HIVE-15680:
-

The issue at least in some tests seems to be the continuation of HIVE-15649 
that I've fixed recently. I will fix this shortly, probably today.

> Incorrect results when hive.optimize.index.filter=true and same ORC table is 
> referenced twice in query
> --
>
> Key: HIVE-15680
> URL: https://issues.apache.org/jira/browse/HIVE-15680
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.2.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-15680.1.patch, HIVE-15680.2.patch, 
> HIVE-15680.3.patch, HIVE-15680.4.patch, HIVE-15680.5.patch, HIVE-15680.6.patch
>
>
> To repro:
> {noformat}
> set hive.optimize.index.filter=true;
> create table test_table(number int) stored as ORC;
> -- Two insertions will create two files, with one stripe each
> insert into table test_table VALUES (1);
> insert into table test_table VALUES (2);
> -- This should and does return 2 records
> select * from test_table;
> -- These should and do each return 1 record
> select * from test_table where number = 1;
> select * from test_table where number = 2;
> -- This should return 2 records but only returns 1 record
> select * from test_table where number = 1
> union all
> select * from test_table where number = 2;
> {noformat}
> What's happening is only the last predicate is being pushed down.





[jira] [Updated] (HIVE-15703) HiveSubQRemoveRelBuilder should use Hive's own factories

2017-01-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15703:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Vineet!

> HiveSubQRemoveRelBuilder should use Hive's own factories
> 
>
> Key: HIVE-15703
> URL: https://issues.apache.org/jira/browse/HIVE-15703
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Vineet Garg
> Fix For: 2.2.0
>
> Attachments: HIVE-15703.01.patch, HIVE-15703.2.patch, 
> HIVE-15703.3.patch
>
>






[jira] [Commented] (HIVE-15517) NOT (x <=> y) returns NULL if x or y is NULL

2017-01-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847870#comment-15847870
 ] 

Ashutosh Chauhan commented on HIVE-15517:
-

+1

> NOT (x <=> y) returns NULL if x or y is NULL
> 
>
> Key: HIVE-15517
> URL: https://issues.apache.org/jira/browse/HIVE-15517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor, SQL
>Affects Versions: 1.2.1
>Reporter: Alexey Bedrintsev
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15517.01.patch
>
>
> I created a table as follows:
> create table test(x string, y string);
> insert into test values ('q', 'q'), ('q', 'w'), (NULL, 'q'), ('q', NULL), 
> (NULL, NULL);
> Then I try to compare values taking NULLs into account:
> select *, x<=>y, not (x<=> y), (x <=> y) = false from test;
> OK
> q     q     true   false  false
> q     w     false  true   true
> q     NULL  false  NULL   true
> NULL  q     false  NULL   true
> NULL  NULL  true   NULL   false
> I expected the 4th column to be the same as the 5th, but I actually got NULL 
> as the result of the "not false" and "not true" expressions.
> Hive 1.2.1000.2.5.0.0-1245
> Subversion 
> git://c66-slave-20176e25-3/grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive
>  -r da6c690d384d1666f5a5f450be5cbc54e2fe4bd6
> Compiled by jenkins on Fri Aug 26 01:39:52 UTC 2016
> From source with checksum c30648316a632f7a753f4359e5c8f4d6





[jira] [Commented] (HIVE-15680) Incorrect results when hive.optimize.index.filter=true and same ORC table is referenced twice in query

2017-01-31 Thread Anthony Hsu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847866#comment-15847866
 ] 

Anthony Hsu commented on HIVE-15680:


[~gopalv], [~sershe], [~xuefuz]: Is it possible to run the LLAP tests all in 
one process, so you can step through the code easily? If so, could you provide 
some pointers?

> Incorrect results when hive.optimize.index.filter=true and same ORC table is 
> referenced twice in query
> --
>
> Key: HIVE-15680
> URL: https://issues.apache.org/jira/browse/HIVE-15680
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.2.0
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Attachments: HIVE-15680.1.patch, HIVE-15680.2.patch, 
> HIVE-15680.3.patch, HIVE-15680.4.patch, HIVE-15680.5.patch, HIVE-15680.6.patch
>
>
> To repro:
> {noformat}
> set hive.optimize.index.filter=true;
> create table test_table(number int) stored as ORC;
> -- Two insertions will create two files, with one stripe each
> insert into table test_table VALUES (1);
> insert into table test_table VALUES (2);
> -- This should and does return 2 records
> select * from test_table;
> -- These should and do each return 1 record
> select * from test_table where number = 1;
> select * from test_table where number = 2;
> -- This should return 2 records but only returns 1 record
> select * from test_table where number = 1
> union all
> select * from test_table where number = 2;
> {noformat}
> What's happening is only the last predicate is being pushed down.





[jira] [Commented] (HIVE-15703) HiveSubQRemoveRelBuilder should use Hive's own factories

2017-01-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847862#comment-15847862
 ] 

Ashutosh Chauhan commented on HIVE-15703:
-

+1

> HiveSubQRemoveRelBuilder should use Hive's own factories
> 
>
> Key: HIVE-15703
> URL: https://issues.apache.org/jira/browse/HIVE-15703
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Vineet Garg
> Attachments: HIVE-15703.01.patch, HIVE-15703.2.patch, 
> HIVE-15703.3.patch
>
>






[jira] [Commented] (HIVE-14420) Fix orc_llap_counters.q test failure in master

2017-01-31 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847860#comment-15847860
 ] 

Prasanth Jayachandran commented on HIVE-14420:
--

[~sseth] Can you please take a look at this one? These are essentially the same 
changes as HIVE-14936.

> Fix orc_llap_counters.q test failure in master
> --
>
> Key: HIVE-14420
> URL: https://issues.apache.org/jira/browse/HIVE-14420
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14420.1.patch
>
>






[jira] [Updated] (HIVE-14420) Fix orc_llap_counters.q test failure in master

2017-01-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14420:
-
Attachment: HIVE-14420.1.patch

This is similar to the HIVE-14936 change to remove some flakiness in the counters.

> Fix orc_llap_counters.q test failure in master
> --
>
> Key: HIVE-14420
> URL: https://issues.apache.org/jira/browse/HIVE-14420
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.2.0, 2.1.1
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14420.1.patch
>
>






[jira] [Updated] (HIVE-14420) Fix orc_llap_counters.q test failure in master

2017-01-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14420:
-
Status: Patch Available  (was: Reopened)

> Fix orc_llap_counters.q test failure in master
> --
>
> Key: HIVE-14420
> URL: https://issues.apache.org/jira/browse/HIVE-14420
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 2.1.1, 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14420.1.patch
>
>






[jira] [Commented] (HIVE-15748) Remove cycles created due to semi join branch and map join Op on same operator pipeline

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847853#comment-15847853
 ] 

Hive QA commented on HIVE-15748:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850337/HIVE-15748.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11015 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple] (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3290/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3290/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3290/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850337 - PreCommit-HIVE-Build

> Remove cycles created due to semi join branch and map join Op on same 
> operator pipeline
> ---
>
> Key: HIVE-15748
> URL: https://issues.apache.org/jira/browse/HIVE-15748
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15748.1.patch, HIVE-15748.2.patch, 
> HIVE-15748.3.patch
>
>
> If a semi join branch and a map join operator are on the same operator 
> pipeline, a cycle can be created: the other map feeding into the mapjoin 
> operator waits for the semi join branch to finish, which causes the cycle.
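
The cycle described above can be pictured as a loop in a "waits-for" graph. The following is a minimal sketch in Java, not taken from the patch: it models the plan as a directed graph and uses a depth-first search to check whether any vertex can reach itself. The class name `PlanCycleSketch` and the vertex names used with it are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not Hive's planner code.
class PlanCycleSketch {
    // Adjacency list: an edge a -> b means "a waits for b to finish".
    private final Map<String, List<String>> waitsFor = new HashMap<>();

    void addEdge(String from, String to) {
        waitsFor.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        // Make sure the target vertex is present even if it has no outgoing edges.
        waitsFor.computeIfAbsent(to, k -> new ArrayList<>());
    }

    // Returns true if any vertex can reach itself by following edges.
    boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String v : waitsFor.keySet()) {
            if (visit(v, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; onPath holds the vertices on the current search path.
    private boolean visit(String v, Set<String> done, Set<String> onPath) {
        if (onPath.contains(v)) {
            return true;
        }
        if (done.contains(v)) {
            return false;
        }
        onPath.add(v);
        for (String next : waitsFor.get(v)) {
            if (visit(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(v);
        done.add(v);
        return false;
    }
}
```

Under one reading of the description, the edges would be `Map1 -> SemiJoinBranch`, `SemiJoinBranch -> MapJoin`, and `MapJoin -> Map1`, for which `hasCycle()` returns true. Removing the semi join branch edge breaks the loop.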





[jira] [Updated] (HIVE-15752) MSCK should add output WriteEntity for table in semantic analysis

2017-01-31 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15752:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Patch committed to master.
Thanks for the review, [~sushanth].

> MSCK should add output WriteEntity for table in semantic analysis
> -
>
> Key: HIVE-15752
> URL: https://issues.apache.org/jira/browse/HIVE-15752
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.2.1, 2.0.0, 2.1.0, 2.1.1
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 2.2.0
>
> Attachments: HIVE-15752.1.patch, HIVE-15752.2.patch
>
>
> MSCK should add the table's WriteEntity to the list of output WriteEntity 
> objects of the query.





[jira] [Updated] (HIVE-15753) subquery failing with org.apache.hadoop.hive.ql.parse.SemanticException

2017-01-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-15753:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Vineet!

> subquery failing with  org.apache.hadoop.hive.ql.parse.SemanticException
> 
>
> Key: HIVE-15753
> URL: https://issues.apache.org/jira/browse/HIVE-15753
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Vineet Garg
>  Labels: sub-query
> Fix For: 2.2.0
>
> Attachments: HIVE-15753.1.patch, HIVE-15753.2.patch
>
>
> Simple reproducer
> -
> * Create table {{part}} using {{q_test_init.sql}}
> * Run the following query
> {code}
> explain SELECT p1.p_name FROM part p1 LEFT JOIN (select p_type as p_col from 
> part ) p2 WHERE NOT EXISTS
> (select pp1.p_type as p_col from part pp1 where 
> pp1.p_partkey = p2.p_col);
> {code}
> -
> Following query is failing with SemanticException
> Query:
> SELECT DISTINCT t1.smallint_col_11 FROM table_21 t1 LEFT JOIN (   
>   SELECT smallint_col_45, (-224) - (COALESCE(MIN(665) OVER (ORDER BY 
> smallint_col_45 DESC, varchar0170_col_23 DESC), NULL, -631)) AS int_col, 
> AVG((GREATEST(CAST(806 AS int), CAST(-606 AS int))) * (39)) OVER (PARTITION 
> BY smallint_col_45 ORDER BY smallint_col_45 DESC, varchar0170_col_23 ASC ROWS 
> BETWEEN 24 FOLLOWING AND UNBOUNDED FOLLOWING) AS float_col, COALESCE(338, 
> (965) + (-335), MAX(544) OVER (PARTITION BY varchar0170_col_23)) AS 
> int_col_1, varchar0170_col_23 FROM table_20 ) t2 ON 
> (((t2.int_col_1) = (t1.smallint_col_3)) AND ((t2.smallint_col_45) = 
> (t1.smallint_col_11))) AND ((t2.smallint_col_45) = (t1.smallint_col_11)) 
> WHERE NOT EXISTS (SELECT COALESCE(tt1.smallint_col_11, 
> tt2.smallint_col_3, tt1.smallint_col_11) AS int_col FROM table_21 tt1 
> INNER JOIN table_21 tt2 ON (tt2.smallint_col_11) = (tt1.smallint_col_3) 
> WHERE ((tt2.smallint_col_11) >= (tt1.smallint_col_3)) AND 
> ((t2.int_col) = (tt2.smallint_col_3)))





[jira] [Commented] (HIVE-15775) LlapRegistryService client cache is not ideal

2017-01-31 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847801#comment-15847801
 ] 

Gopal V commented on HIVE-15775:


[~sershe]:  for #1

{code}
this.userPathPrefix = USER_SCOPE_PATH_PREFIX + getZkPathUser(this.conf);
this.workersPath =  "/" + userPathPrefix + "/" + instanceName + "/workers";
{code}

> LlapRegistryService client cache is not ideal
> -
>
> Key: HIVE-15775
> URL: https://issues.apache.org/jira/browse/HIVE-15775
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> 1) Needs to account for user name, in case there are multiple clusters from 
> multiple users with the same name.
> 2) Probably needs some expiration policy.
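
The two points in the description can be sketched together as a cache keyed by both user and instance name, with a simple time-to-live. This is a minimal sketch in Java under stated assumptions, not the LlapRegistryService code: the class `RegistryClientCacheSketch`, its `Key` type, the `ttlMillis` parameter, and the explicit `nowMillis` argument are all hypothetical names introduced for illustration.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch; not the LlapRegistryService implementation.
class RegistryClientCacheSketch<V> {
    // Cache key made of both the user and the cluster (instance) name,
    // so two users with identically named clusters get separate entries.
    static final class Key {
        final String user;
        final String instanceName;

        Key(String user, String instanceName) {
            this.user = user;
            this.instanceName = instanceName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key k = (Key) o;
            return user.equals(k.user) && instanceName.equals(k.instanceName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(user, instanceName);
        }
    }

    // A cached value together with the time at which it stops being valid.
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<Key, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    RegistryClientCacheSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // nowMillis is passed in so the expiration logic is easy to exercise.
    V get(String user, String instanceName, long nowMillis, Supplier<V> loader) {
        Key key = new Key(user, instanceName);
        Entry<V> e = entries.compute(key, (k, old) -> {
            if (old == null || old.expiresAtMillis <= nowMillis) {
                // Missing or expired: load a fresh value and restart the clock.
                return new Entry<V>(loader.get(), nowMillis + ttlMillis);
            }
            return old;
        });
        return e.value;
    }
}
```

With a TTL of 1000 ms, `get("alice", "llap0", ...)` and `get("bob", "llap0", ...)` return separate clients, and an entry created at time 0 is reloaded on the first lookup at or after time 1000. A real cache would also need to close expired clients, which this sketch leaves out.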





[jira] [Commented] (HIVE-15776) Flaky test: TestMiniLlapLocalCliDriver vector_if_expr

2017-01-31 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847790#comment-15847790
 ] 

Thejas M Nair commented on HIVE-15776:
--

hive.log had -

{code}
2017-01-30T23:40:38,754 ERROR [TezTaskRunner] tez.ReduceRecordSource: java.lang.AssertionError
at org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setBatchSize(VectorizedBatchUtil.java:125)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:448)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:388)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:239)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:319)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

> Flaky test: TestMiniLlapLocalCliDriver vector_if_expr
> -
>
> Key: HIVE-15776
> URL: https://issues.apache.org/jira/browse/HIVE-15776
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Thejas M Nair
>Priority: Critical
>
> Failed in https://builds.apache.org/job/PreCommit-HIVE-Build/3274/ with 
> following error in test log -
> java.lang.AssertionError: 
> Unexpected exception java.lang.AssertionError: Client execution failed with 
> error code = 2 running 





[jira] [Commented] (HIVE-15703) HiveSubQRemoveRelBuilder should use Hive's own factories

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847783#comment-15847783
 ] 

Hive QA commented on HIVE-15703:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850333/HIVE-15703.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11015 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables] (batchId=78)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_varchar_simple] (batchId=153)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3289/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3289/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3289/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850333 - PreCommit-HIVE-Build

> HiveSubQRemoveRelBuilder should use Hive's own factories
> 
>
> Key: HIVE-15703
> URL: https://issues.apache.org/jira/browse/HIVE-15703
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Vineet Garg
> Attachments: HIVE-15703.01.patch, HIVE-15703.2.patch, 
> HIVE-15703.3.patch
>
>






[jira] [Commented] (HIVE-15517) NOT (x <=> y) returns NULL if x or y is NULL

2017-01-31 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847775#comment-15847775
 ] 

Pengcheng Xiong commented on HIVE-15517:


[~ashutoshc], could you please review it? Thanks.

> NOT (x <=> y) returns NULL if x or y is NULL
> 
>
> Key: HIVE-15517
> URL: https://issues.apache.org/jira/browse/HIVE-15517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor, SQL
>Affects Versions: 1.2.1
>Reporter: Alexey Bedrintsev
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15517.01.patch
>
>
> I created a table as follows:
> create table test(x string, y string);
> insert into test values ('q', 'q'), ('q', 'w'), (NULL, 'q'), ('q', NULL), 
> (NULL, NULL);
> Then I tried to compare values, taking NULLs into account:
> select *, x<=>y, not (x<=> y), (x <=> y) = false from test;
> OK
> q     q     true   false  false
> q     w     false  true   true
> q     NULL  false  NULL   true
> NULL  q     false  NULL   true
> NULL  NULL  true   NULL   false
> I expected the 4th column to be the same as the 5th one, but actually got NULL 
> as the result of the "not false" and "not true" expressions.
> Hive 1.2.1000.2.5.0.0-1245
> Subversion 
> git://c66-slave-20176e25-3/grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive
>  -r da6c690d384d1666f5a5f450be5cbc54e2fe4bd6
> Compiled by jenkins on Fri Aug 26 01:39:52 UTC 2016
> From source with checksum c30648316a632f7a753f4359e5c8f4d6
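
The expectation in the description can be written down as a small model of the semantics. This is a minimal sketch in Java, not Hive's UDF code: `NullSafeEqualsSketch` and its methods are hypothetical names, and a Java `null` stands in for SQL NULL. The point it shows is that `x <=> y` never returns NULL, so `NOT (x <=> y)` should never be NULL either, whereas ordinary equality does propagate NULL.

```java
// Hypothetical sketch of the expected semantics; not Hive's UDF code.
class NullSafeEqualsSketch {
    // x <=> y: never returns NULL; two NULLs compare as equal.
    static Boolean equalNs(String x, String y) {
        if (x == null || y == null) {
            return Boolean.valueOf(x == null && y == null);
        }
        return Boolean.valueOf(x.equals(y));
    }

    // Ordinary x = y under SQL three-valued logic: NULL if either side is NULL.
    static Boolean equal(String x, String y) {
        if (x == null || y == null) {
            return null;
        }
        return Boolean.valueOf(x.equals(y));
    }

    // SQL NOT: NOT NULL is NULL, otherwise the boolean is flipped.
    static Boolean not(Boolean b) {
        return b == null ? null : Boolean.valueOf(!b);
    }
}
```

Applying `not(equalNs(x, y))` to the five rows of the test table gives false, true, true, true, false, which matches the 5th column of the reported output. By contrast, `not(equal(null, "q"))` is NULL, which is the value the 4th column shows for the rows containing NULL.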





[jira] [Commented] (HIVE-15775) LlapRegistryService client cache is not ideal

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847774#comment-15847774
 ] 

Sergey Shelukhin commented on HIVE-15775:
-

cc [~prasanth_j] 

> LlapRegistryService client cache is not ideal
> -
>
> Key: HIVE-15775
> URL: https://issues.apache.org/jira/browse/HIVE-15775
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> 1) Needs to account for user name, in case there are multiple clusters from 
> multiple users with the same name.
> 2) Probably needs some expiration policy.





[jira] [Updated] (HIVE-15517) NOT (x <=> y) returns NULL if x or y is NULL

2017-01-31 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15517:
---
Status: Patch Available  (was: Open)

> NOT (x <=> y) returns NULL if x or y is NULL
> 
>
> Key: HIVE-15517
> URL: https://issues.apache.org/jira/browse/HIVE-15517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor, SQL
>Affects Versions: 1.2.1
>Reporter: Alexey Bedrintsev
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15517.01.patch
>
>
> I created a table as follows:
> create table test(x string, y string);
> insert into test values ('q', 'q'), ('q', 'w'), (NULL, 'q'), ('q', NULL), 
> (NULL, NULL);
> Then I tried to compare values, taking NULLs into account:
> select *, x<=>y, not (x<=> y), (x <=> y) = false from test;
> OK
> q     q     true   false  false
> q     w     false  true   true
> q     NULL  false  NULL   true
> NULL  q     false  NULL   true
> NULL  NULL  true   NULL   false
> I expected the 4th column to be the same as the 5th one, but actually got NULL 
> as the result of the "not false" and "not true" expressions.
> Hive 1.2.1000.2.5.0.0-1245
> Subversion 
> git://c66-slave-20176e25-3/grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive
>  -r da6c690d384d1666f5a5f450be5cbc54e2fe4bd6
> Compiled by jenkins on Fri Aug 26 01:39:52 UTC 2016
> From source with checksum c30648316a632f7a753f4359e5c8f4d6





[jira] [Updated] (HIVE-15517) NOT (x <=> y) returns NULL if x or y is NULL

2017-01-31 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15517:
---
Attachment: HIVE-15517.01.patch

> NOT (x <=> y) returns NULL if x or y is NULL
> 
>
> Key: HIVE-15517
> URL: https://issues.apache.org/jira/browse/HIVE-15517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor, SQL
>Affects Versions: 1.2.1
>Reporter: Alexey Bedrintsev
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15517.01.patch
>
>
> I created a table as follows:
> create table test(x string, y string);
> insert into test values ('q', 'q'), ('q', 'w'), (NULL, 'q'), ('q', NULL), 
> (NULL, NULL);
> Then I tried to compare values, taking NULLs into account:
> select *, x<=>y, not (x<=> y), (x <=> y) = false from test;
> OK
> q     q     true   false  false
> q     w     false  true   true
> q     NULL  false  NULL   true
> NULL  q     false  NULL   true
> NULL  NULL  true   NULL   false
> I expected the 4th column to be the same as the 5th one, but actually got NULL 
> as the result of the "not false" and "not true" expressions.
> Hive 1.2.1000.2.5.0.0-1245
> Subversion 
> git://c66-slave-20176e25-3/grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive
>  -r da6c690d384d1666f5a5f450be5cbc54e2fe4bd6
> Compiled by jenkins on Fri Aug 26 01:39:52 UTC 2016
> From source with checksum c30648316a632f7a753f4359e5c8f4d6





[jira] [Commented] (HIVE-15748) Remove cycles created due to semi join branch and map join Op on same operator pipeline

2017-01-31 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847769#comment-15847769
 ] 

Jason Dere commented on HIVE-15748:
---

+1

> Remove cycles created due to semi join branch and map join Op on same 
> operator pipeline
> ---
>
> Key: HIVE-15748
> URL: https://issues.apache.org/jira/browse/HIVE-15748
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15748.1.patch, HIVE-15748.2.patch, 
> HIVE-15748.3.patch
>
>
> If a semi join branch and a map join operator are on the same operator 
> pipeline, a cycle can be created: the other map feeding into the mapjoin 
> operator waits for the semi join branch to finish, which causes the cycle.





[jira] [Updated] (HIVE-15774) Ensure DbLockManager backward compatibility for non-ACID resources

2017-01-31 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15774:
-
Issue Type: Improvement  (was: Bug)

> Ensure DbLockManager backward compatibility for non-ACID resources
> --
>
> Key: HIVE-15774
> URL: https://issues.apache.org/jira/browse/HIVE-15774
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> In pre-ACID days, users perform operations such as INSERT with either 
> ZooKeeperHiveLockManager or no lock manager at all. If their workflow is 
> designed to take advantage of no locking and they take care of the control of 
> concurrency, this works well with good performance.
> With ACID, if users enable transactions (i.e. using DbTxnManager & 
> DbLockManager), then for all the operations, different types of locks will be 
> acquired accordingly by DbLockManager, even for non-ACID resources. This may 
> impact the performance of some workflows designed for pre-ACID use cases.
> A viable solution would be to differentiate the locking mode for ACID and 
> non-ACID resources, so that DbLockManager will continue its current behavior 
> for ACID tables, but will be able to acquire a less strict lock type for 
> non-ACID resources, thus avoiding the performance loss for those workflows.





[jira] [Updated] (HIVE-15774) Ensure DbLockManager backward compatibility for non-ACID resources

2017-01-31 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-15774:
-
Description: 
In pre-ACID days, users perform operations such as INSERT with either 
ZooKeeperHiveLockManager or no lock manager at all. If their workflow is 
designed to take advantage of no locking and they take care of the control of 
concurrency, this works well with good performance.

With ACID, if users enable transactions (i.e. using DbTxnManager & 
DbLockManager), then for all the operations, different types of locks will be 
acquired accordingly by DbLockManager, even for non-ACID resources. This may 
impact the performance of some workflows designed for pre-ACID use cases.

A viable solution would be to differentiate the locking mode for ACID and 
non-ACID resources, so that DbLockManager will continue its current behavior 
for ACID tables, but will be able to acquire a less strict lock type for 
non-ACID resources, thus avoiding the performance loss for those workflows.

  was:
In pre-ACID days, users perform operations such as INSERT with either 
ZooKeeperHiveLockManager or no lock manager at all. If their workflow is 
designed to take advantage of no locking and they take care of the control of 
concurrency, this works well with good performance.
With ACID, if users enable transactions (i.e. using DbTxnManager & 
DbLockManager), then for all the operations, different types of locks will be 
acquired accordingly by DbLockManager, even for non-ACID resources. This may 
impact the performance of some workflows designed for pre-ACID use cases.
A viable solution would be to differentiate the locking mode for ACID and 
non-ACID resources, so that DbLockManager will continue its current behavior 
for ACID tables, but will be able to acquire a less strict lock type for 
non-ACID resources, thus avoiding the performance loss for those workflows.


> Ensure DbLockManager backward compatibility for non-ACID resources
> --
>
> Key: HIVE-15774
> URL: https://issues.apache.org/jira/browse/HIVE-15774
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> In pre-ACID days, users perform operations such as INSERT with either 
> ZooKeeperHiveLockManager or no lock manager at all. If their workflow is 
> designed to take advantage of no locking and they take care of the control of 
> concurrency, this works well with good performance.
> With ACID, if users enable transactions (i.e. using DbTxnManager & 
> DbLockManager), then for all the operations, different types of locks will be 
> acquired accordingly by DbLockManager, even for non-ACID resources. This may 
> impact the performance of some workflows designed for pre-ACID use cases.
> A viable solution would be to differentiate the locking mode for ACID and 
> non-ACID resources, so that DbLockManager will continue its current behavior 
> for ACID tables, but will be able to acquire a less strict lock type for 
> non-ACID resources, thus avoiding the performance loss for those workflows.





[jira] [Assigned] (HIVE-15774) Ensure DbLockManager backward compatibility for non-ACID resources

2017-01-31 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng reassigned HIVE-15774:



> Ensure DbLockManager backward compatibility for non-ACID resources
> --
>
> Key: HIVE-15774
> URL: https://issues.apache.org/jira/browse/HIVE-15774
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> In pre-ACID days, users perform operations such as INSERT with either 
> ZooKeeperHiveLockManager or no lock manager at all. If their workflow is 
> designed to take advantage of no locking and they take care of the control of 
> concurrency, this works well with good performance.
> With ACID, if users enable transactions (i.e. using DbTxnManager & 
> DbLockManager), then for all the operations, different types of locks will be 
> acquired accordingly by DbLockManager, even for non-ACID resources. This may 
> impact the performance of some workflows designed for pre-ACID use cases.
> A viable solution would be to differentiate the locking mode for ACID and 
> non-ACID resources, so that DbLockManager will continue its current behavior 
> for ACID tables, but will be able to acquire a less strict lock type for 
> non-ACID resources, thus avoiding the performance loss for those workflows.





[jira] [Updated] (HIVE-15772) set the exception into SparkJobStatus if exception happened in RemoteSparkJobMonitor and LocalSparkJobMonitor

2017-01-31 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-15772:
-
Description: set the exception into SparkJobStatus if exception happened in 
RemoteSparkJobMonitor and LocalSparkJobMonitor. Add function setError in 
SparkJobStatus. So the exception can be got by SparkTask.  (was: set the 
exception into SparkJobStatus if exception happened in RemoteSparkJobMonitor 
and LocalSparkJobMonitor. Add function setError in SparkJobStatus.)

> set the exception into SparkJobStatus if exception happened in 
> RemoteSparkJobMonitor and LocalSparkJobMonitor
> -
>
> Key: HIVE-15772
> URL: https://issues.apache.org/jira/browse/HIVE-15772
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: HIVE-15772.000.patch
>
>
> Set the exception in SparkJobStatus if an exception occurs in 
> RemoteSparkJobMonitor or LocalSparkJobMonitor. Add a setError function to 
> SparkJobStatus so that SparkTask can retrieve the exception.
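
The idea in the description can be sketched as follows. This is a minimal sketch in Java using stand-in classes, not the code in the attached patch: `JobStatusSketch` and `JobMonitorSketch` are hypothetical names, and only `setError` is taken from the description. The `getError` accessor and the return codes are assumptions made for illustration.

```java
// Hypothetical stand-in for SparkJobStatus.
class JobStatusSketch {
    // volatile so a value set by the monitor is visible to the thread reading it.
    private volatile Throwable error;

    void setError(Throwable t) {
        this.error = t;
    }

    Throwable getError() {
        return error;
    }
}

// Hypothetical stand-in for RemoteSparkJobMonitor / LocalSparkJobMonitor.
class JobMonitorSketch {
    private final JobStatusSketch status;

    JobMonitorSketch(JobStatusSketch status) {
        this.status = status;
    }

    // Runs the monitoring work; returns 0 on success and 1 on failure.
    int startMonitor(Runnable monitoringWork) {
        try {
            monitoringWork.run();
            return 0;
        } catch (RuntimeException e) {
            // Record the failure instead of only logging it, so the caller
            // (SparkTask in the description) can inspect it afterwards.
            status.setError(e);
            return 1;
        }
    }
}
```

A caller would create the status object, hand it to the monitor, and after a non-zero return code read `getError()` to find out what went wrong.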





[jira] [Commented] (HIVE-14086) org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro schema file

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847746#comment-15847746
 ] 

Sergey Shelukhin commented on HIVE-14086:
-

Note that since 2.0 (HIVE-11985) the column names for serdes with external 
schemas are generally not stored in metastore anymore.

> org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro 
> schema file
> 
>
> Key: HIVE-14086
> URL: https://issues.apache.org/jira/browse/HIVE-14086
> Project: Hive
>  Issue Type: Bug
>  Components: API
>Reporter: Lars Volker
> Attachments: avro.json, avroremoved.json, avro.sql
>
>
> Consider this table, using an external Avro schema file:
> {noformat}
> CREATE TABLE avro_table
>   PARTITIONED BY (str_part STRING)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   TBLPROPERTIES (
> 'avro.schema.url'='hdfs://localhost:20500/tmp/avro.json'
>   );
> {noformat}
> This will populate the "COLUMNS_V2" metastore table with the correct column 
> information (as per HIVE-6308). The columns of this table can then be queried 
> via the Hive API, for example by calling {{.getSd().getCols()}} on a 
> {{org.apache.hadoop.hive.metastore.api.Table}} object.
> Changes to the avro.schema.url file - either changing where it points to or 
> changing its contents - will be reflected in the output of {{describe 
> formatted avro_table}} *but not* in the result of the {{.getSd().getCols()}} 
> API call. Instead it looks like Hive only reads the Avro schema file 
> internally, but does not expose the information therein via its API.
> Is there a way to obtain the effective Table information via Hive? Would it 
> make sense to fix table retrieval so calls to {{get_table}} return the 
> correct set of columns?





[jira] [Updated] (HIVE-15772) set the exception into SparkJobStatus if exception happened in RemoteSparkJobMonitor and LocalSparkJobMonitor

2017-01-31 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-15772:
-
Environment: (was: set the exception into SparkJobStatus if exception 
happened in RemoteSparkJobMonitor and LocalSparkJobMonitor. Add function 
setError in SparkJobStatus.)
Description: set the exception into SparkJobStatus if exception happened in 
RemoteSparkJobMonitor and LocalSparkJobMonitor. Add function setError in 
SparkJobStatus.

> set the exception into SparkJobStatus if exception happened in 
> RemoteSparkJobMonitor and LocalSparkJobMonitor
> -
>
> Key: HIVE-15772
> URL: https://issues.apache.org/jira/browse/HIVE-15772
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: HIVE-15772.000.patch
>
>
> Set the exception in SparkJobStatus if an exception occurs in 
> RemoteSparkJobMonitor or LocalSparkJobMonitor. Add a setError function to 
> SparkJobStatus.





[jira] [Updated] (HIVE-15772) set the exception into SparkJobStatus if exception happened in RemoteSparkJobMonitor and LocalSparkJobMonitor

2017-01-31 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-15772:
-
Status: Patch Available  (was: Open)

> set the exception into SparkJobStatus if exception happened in 
> RemoteSparkJobMonitor and LocalSparkJobMonitor
> -
>
> Key: HIVE-15772
> URL: https://issues.apache.org/jira/browse/HIVE-15772
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
> Environment: set the exception into SparkJobStatus if exception 
> happened in RemoteSparkJobMonitor and LocalSparkJobMonitor. Add function 
> setError in SparkJobStatus.
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: HIVE-15772.000.patch
>
>






[jira] [Updated] (HIVE-15772) set the exception into SparkJobStatus if exception happened in RemoteSparkJobMonitor and LocalSparkJobMonitor

2017-01-31 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-15772:
-
Attachment: HIVE-15772.000.patch

> set the exception into SparkJobStatus if exception happened in 
> RemoteSparkJobMonitor and LocalSparkJobMonitor
> -
>
> Key: HIVE-15772
> URL: https://issues.apache.org/jira/browse/HIVE-15772
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
> Environment: set the exception into SparkJobStatus if exception 
> happened in RemoteSparkJobMonitor and LocalSparkJobMonitor. Add function 
> setError in SparkJobStatus.
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: HIVE-15772.000.patch
>
>






[jira] [Commented] (HIVE-14086) org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro schema file

2017-01-31 Thread Lars Volker (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847742#comment-15847742
 ] 

Lars Volker commented on HIVE-14086:


[~spena] - Thanks for the update. Has this been added to Hive recently? Can you 
point me to a commit that adds the feature?

> org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro 
> schema file
> 
>
> Key: HIVE-14086
> URL: https://issues.apache.org/jira/browse/HIVE-14086
> Project: Hive
>  Issue Type: Bug
>  Components: API
>Reporter: Lars Volker
> Attachments: avro.json, avroremoved.json, avro.sql
>
>
> Consider this table, using an external Avro schema file:
> {noformat}
> CREATE TABLE avro_table
>   PARTITIONED BY (str_part STRING)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   TBLPROPERTIES (
> 'avro.schema.url'='hdfs://localhost:20500/tmp/avro.json'
>   );
> {noformat}
> This will populate the "COLUMNS_V2" metastore table with the correct column 
> information (as per HIVE-6308). The columns of this table can then be queried 
> via the Hive API, for example by calling {{.getSd().getCols()}} on a 
> {{org.apache.hadoop.hive.metastore.api.Table}} object.
> Changes to the avro.schema.url file - either changing where it points to or 
> changing its contents - will be reflected in the output of {{describe 
> formatted avro_table}} *but not* in the result of the {{.getSd().getCols()}} 
> API call. Instead it looks like Hive only reads the Avro schema file 
> internally, but does not expose the information therein via its API.
> Is there a way to obtain the effective Table information via Hive? Would it 
> make sense to fix table retrieval so calls to {{get_table}} return the 
> correct set of columns?





[jira] [Assigned] (HIVE-15772) set the exception into SparkJobStatus if exception happened in RemoteSparkJobMonitor and LocalSparkJobMonitor

2017-01-31 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu reassigned HIVE-15772:



> set the exception into SparkJobStatus if exception happened in 
> RemoteSparkJobMonitor and LocalSparkJobMonitor
> -
>
> Key: HIVE-15772
> URL: https://issues.apache.org/jira/browse/HIVE-15772
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Affects Versions: 2.2.0
> Environment: set the exception into SparkJobStatus if exception 
> happened in RemoteSparkJobMonitor and LocalSparkJobMonitor. Add function 
> setError in SparkJobStatus.
>Reporter: zhihai xu
>Assignee: zhihai xu
>






[jira] [Commented] (HIVE-15517) NOT (x <=> y) returns NULL if x or y is NULL

2017-01-31 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847739#comment-15847739
 ] 

Pengcheng Xiong commented on HIVE-15517:


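The bug comes from constant folding. equalns extends equal, and equal declares 
notequal as its negative function. As a result, equalns also inherits notequal 
as its negative function, which is wrong for NULL operands: notequal returns 
NULL there, whereas the negation of equalns should not.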
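
The difference described in the comment above can be sketched outside Hive with nullable Boolean values standing in for SQL three-valued logic. This is only an illustration under assumptions: the class and method names are hypothetical and are not Hive's, and the "wrong rewrite" shown is simply what substituting notequal for NOT (x <=> y) would compute.

```java
import java.util.Objects;

// Hypothetical illustration only; these names are not part of Hive.
final class NullSafeEqualsSketch {

    // SQL "=": NULL (null here) if either operand is NULL.
    static Boolean sqlEquals(String x, String y) {
        if (x == null || y == null) {
            return null;
        }
        return x.equals(y);
    }

    // SQL "<>": the negative function of "=", also NULL on any NULL operand.
    static Boolean sqlNotEquals(String x, String y) {
        Boolean eq = sqlEquals(x, y);
        if (eq == null) {
            return null;
        }
        return !eq;
    }

    // SQL "<=>": never NULL; two NULLs compare as equal.
    static boolean nullSafeEquals(String x, String y) {
        return Objects.equals(x, y);
    }

    // Negation of "<=>": also never NULL.
    static boolean notNullSafeEquals(String x, String y) {
        return !nullSafeEquals(x, y);
    }

    public static void main(String[] args) {
        // Same rows as the table in the issue description.
        String[][] rows = {{"q", "q"}, {"q", "w"}, {null, "q"}, {"q", null}, {null, null}};
        for (String[] r : rows) {
            System.out.println(r[0] + "\t" + r[1]
                + "\t<=>: " + nullSafeEquals(r[0], r[1])
                + "\tNOT(<=>): " + notNullSafeEquals(r[0], r[1])
                + "\t<> (wrong rewrite): " + sqlNotEquals(r[0], r[1]));
        }
    }
}
```

For rows with a NULL operand, the last column is null while the negated null-safe comparison is a definite true or false, which is the mismatch the reporter observed.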

> NOT (x <=> y) returns NULL if x or y is NULL
> 
>
> Key: HIVE-15517
> URL: https://issues.apache.org/jira/browse/HIVE-15517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor, SQL
>Affects Versions: 1.2.1
>Reporter: Alexey Bedrintsev
>Assignee: Pengcheng Xiong
>
> I created a table as follows:
> create table test(x string, y string);
> insert into test values ('q', 'q'), ('q', 'w'), (NULL, 'q'), ('q', NULL), 
> (NULL, NULL);
> Then I tried to compare values, taking NULLs into account:
> select *, x<=>y, not (x<=> y), (x <=> y) = false from test;
> OK
> q     q     true   false  false
> q     w     false  true   true
> q     NULL  false  NULL   true
> NULL  q     false  NULL   true
> NULL  NULL  true   NULL   false
> I expected the 4th column to be the same as the 5th, but actually got NULL 
> as the result of the "not false" and "not true" expressions.
> Hive 1.2.1000.2.5.0.0-1245
> Subversion 
> git://c66-slave-20176e25-3/grid/0/jenkins/workspace/HDP-parallel-centos6/SOURCES/hive
>  -r da6c690d384d1666f5a5f450be5cbc54e2fe4bd6
> Compiled by jenkins on Fri Aug 26 01:39:52 UTC 2016
> From source with checksum c30648316a632f7a753f4359e5c8f4d6





[jira] [Assigned] (HIVE-15517) NOT (x <=> y) returns NULL if x or y is NULL

2017-01-31 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-15517:
--

Assignee: Pengcheng Xiong

> NOT (x <=> y) returns NULL if x or y is NULL
> 
>
> Key: HIVE-15517
> URL: https://issues.apache.org/jira/browse/HIVE-15517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Operators, Query Processor, SQL
>Affects Versions: 1.2.1
>Reporter: Alexey Bedrintsev
>Assignee: Pengcheng Xiong
>





[jira] [Updated] (HIVE-15771) CBO chooses wrong join order for TPC-DS query72

2017-01-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15771:
-
Description: 
Query72 of TPC-DS at 1TB scale generates the wrong join order, resulting in 
increased query execution time. It chooses the fact-to-fact table join followed 
by joins with dimension tables, as opposed to doing map-joins with dimension 
tables first and the fact-to-fact table join last.

Please find attached the join order selected by CBO vs the rewritten query 
with the expected join order. 

  was:
Query72 of TPC-DS at 1TB scale generates the wrong join order, resulting in 
increased query execution time. It chooses the fact-to-fact table join followed 
by joins with dimension tables, as opposed to doing map-joins with dimension 
tables first and the fact-to-fact table join last.

Please find attached the join order selected by CBO vs the rewritten query 
with the correct join order. 


> CBO chooses wrong join order for TPC-DS query72
> ---
>
> Key: HIVE-15771
> URL: https://issues.apache.org/jira/browse/HIVE-15771
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Priority: Critical
> Attachments: q72-explain.txt, q72-mod-explain.txt, q72-mod.svg, 
> q72.svg, query72-mod.sql, query72.sql
>
>
> Query72 of TPC-DS at 1TB scale generates the wrong join order, resulting in 
> increased query execution time. It chooses the fact-to-fact table join 
> followed by joins with dimension tables, as opposed to doing map-joins with 
> dimension tables first and the fact-to-fact table join last.
> Please find attached the join order selected by CBO vs the rewritten query 
> with the expected join order. 





[jira] [Commented] (HIVE-15760) TezCompiler throws ConcurrentModificationException during cycle detection

2017-01-31 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847717#comment-15847717
 ] 

Jason Dere commented on HIVE-15760:
---

+1

> TezCompiler throws ConcurrentModificationException during cycle detection
> -
>
> Key: HIVE-15760
> URL: https://issues.apache.org/jira/browse/HIVE-15760
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Deepak Jaiswal
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE.15760.1.patch
>
>
> HIVE-15269 is causing the following exception when we run explain on query72 
> (TPC-DS).
> {code}
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
>   at java.util.ArrayList$Itr.next(ArrayList.java:851)
>   at org.apache.hadoop.hive.ql.parse.TezCompiler$SemiJoinCycleRemovalDueToMapsideJoins.process(TezCompiler.java:689)
>   at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>   at org.apache.hadoop.hive.ql.parse.TezCompiler.removeSemiJoinCyclesDueToMapsideJoins(TezCompiler.java:759)
>   at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:112)
>   at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11122)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:275)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
>   at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:129)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:513)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1305)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1445)
> {code}
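
The top frames of the trace above (ArrayList$Itr.checkForComodification) are what the JDK reports when a list is structurally modified while an iterator over it is still in use. The following is a minimal sketch of that general failure mode and of one common way around it (iterating over a snapshot). It is not the TezCompiler code and not the attached patch; the class, method, and element names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical illustration only; not the TezCompiler code or the attached patch.
final class CmeSketch {

    // Removes matching elements from the list while a for-each loop iterates
    // over that same list. Returns true if the iterator detects the change.
    static boolean failsWhenMutatingDuringIteration(List<String> ops) {
        try {
            for (String op : ops) {
                if (op.startsWith("semijoin")) {
                    ops.remove(op); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a copy, so the live list can be changed safely.
    static List<String> removeViaSnapshot(List<String> ops) {
        for (String op : new ArrayList<>(ops)) {
            if (op.startsWith("semijoin")) {
                ops.remove(op);
            }
        }
        return ops;
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("semijoin-1", "mapjoin", "semijoin-2"));
        System.out.println("fails during iteration: " + failsWhenMutatingDuringIteration(a));

        List<String> b = new ArrayList<>(Arrays.asList("semijoin-1", "mapjoin", "semijoin-2"));
        System.out.println("after snapshot removal: " + removeViaSnapshot(b));
    }
}
```

Collecting the elements to remove and applying the removals after the loop, or using Iterator.remove(), are other standard options; which one fits depends on the code being changed.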





[jira] [Commented] (HIVE-15754) exchange partition is not generating notifications

2017-01-31 Thread Nachiket Vaidya (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847714#comment-15847714
 ] 

Nachiket Vaidya commented on HIVE-15754:


[~mohitsabharwal], [~vgumashta], [~sushanth]: can you please review the change?


> exchange partition is not generating notifications
> --
>
> Key: HIVE-15754
> URL: https://issues.apache.org/jira/browse/HIVE-15754
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 2.2.0
>Reporter: Nachiket Vaidya
>Assignee: Nachiket Vaidya
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15754.0.patch
>
>
> The exchange partition operation is not generating notifications in notification_log.
> Multiple events should be generated: one add_partition event and several 
> drop_partition events.
> For example:
> {noformat}
> ALTER TABLE tab1 EXCHANGE PARTITION (part=1) WITH TABLE tab2;
> {noformat}
> There should be the following events:
> ADD_PARTITION on tab2 on partition (part=1)
> DROP_PARTITION on tab1 on partition (part=1)





[jira] [Updated] (HIVE-15748) Remove cycles created due to semi join branch and map join Op on same operator pipeline

2017-01-31 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-15748:
--
Attachment: HIVE-15748.3.patch

Implemented review comments

> Remove cycles created due to semi join branch and map join Op on same 
> operator pipeline
> ---
>
> Key: HIVE-15748
> URL: https://issues.apache.org/jira/browse/HIVE-15748
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-15748.1.patch, HIVE-15748.2.patch, 
> HIVE-15748.3.patch
>
>
> If a semi join branch and a map join operator are on the same operator 
> pipeline, a cycle can be created: the other map feeding into the mapjoin 
> operator waits for the semi join branch to finish, which closes the cycle.





[jira] [Updated] (HIVE-15703) HiveSubQRemoveRelBuilder should use Hive's own factories

2017-01-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15703:
---
Status: Open  (was: Patch Available)

> HiveSubQRemoveRelBuilder should use Hive's own factories
> 
>
> Key: HIVE-15703
> URL: https://issues.apache.org/jira/browse/HIVE-15703
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Vineet Garg
> Attachments: HIVE-15703.01.patch, HIVE-15703.2.patch, 
> HIVE-15703.3.patch
>
>






[jira] [Updated] (HIVE-15703) HiveSubQRemoveRelBuilder should use Hive's own factories

2017-01-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15703:
---
Status: Patch Available  (was: Open)

> HiveSubQRemoveRelBuilder should use Hive's own factories
> 
>
> Key: HIVE-15703
> URL: https://issues.apache.org/jira/browse/HIVE-15703
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Vineet Garg
> Attachments: HIVE-15703.01.patch, HIVE-15703.2.patch, 
> HIVE-15703.3.patch
>
>






[jira] [Updated] (HIVE-15703) HiveSubQRemoveRelBuilder should use Hive's own factories

2017-01-31 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-15703:
---
Attachment: HIVE-15703.3.patch

> HiveSubQRemoveRelBuilder should use Hive's own factories
> 
>
> Key: HIVE-15703
> URL: https://issues.apache.org/jira/browse/HIVE-15703
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Vineet Garg
> Attachments: HIVE-15703.01.patch, HIVE-15703.2.patch, 
> HIVE-15703.3.patch
>
>






[jira] [Commented] (HIVE-15771) CBO chooses wrong join order for TPC-DS query72

2017-01-31 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847698#comment-15847698
 ] 

Prasanth Jayachandran commented on HIVE-15771:
--

cc/ [~ashutoshc] [~jcamachorodriguez] [~pxiong]

> CBO chooses wrong join order for TPC-DS query72
> ---
>
> Key: HIVE-15771
> URL: https://issues.apache.org/jira/browse/HIVE-15771
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Priority: Critical
> Attachments: q72-explain.txt, q72-mod-explain.txt, q72-mod.svg, 
> q72.svg, query72-mod.sql, query72.sql
>
>
> Query72 of TPC-DS at 1TB scale generates the wrong join order, resulting in 
> increased query execution time. It chooses the fact-to-fact table join 
> followed by joins with dimension tables, as opposed to doing map-joins with 
> dimension tables first and the fact-to-fact table join last.
> Please find attached the join order selected by CBO vs the rewritten query 
> with the correct join order. 





[jira] [Updated] (HIVE-15771) CBO chooses wrong join order for TPC-DS query72

2017-01-31 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15771:
-
Attachment: query72.sql
query72-mod.sql
q72.svg
q72-mod.svg
q72-mod-explain.txt
q72-explain.txt

> CBO chooses wrong join order for TPC-DS query72
> ---
>
> Key: HIVE-15771
> URL: https://issues.apache.org/jira/browse/HIVE-15771
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Priority: Critical
> Attachments: q72-explain.txt, q72-mod-explain.txt, q72-mod.svg, 
> q72.svg, query72-mod.sql, query72.sql
>
>
> Query72 of TPC-DS at 1TB scale generates the wrong join order, resulting in 
> increased query execution time. It chooses the fact-to-fact table join 
> followed by joins with dimension tables, as opposed to doing map-joins with 
> dimension tables first and the fact-to-fact table join last.
> Please find attached the join order selected by CBO vs the rewritten query 
> with the correct join order. 





[jira] [Resolved] (HIVE-15770) Test jira

2017-01-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-15770.
-
Resolution: Information Provided

> Test jira
> -
>
> Key: HIVE-15770
> URL: https://issues.apache.org/jira/browse/HIVE-15770
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
>






[jira] [Assigned] (HIVE-15770) Test jira

2017-01-31 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned HIVE-15770:
-


> Test jira
> -
>
> Key: HIVE-15770
> URL: https://issues.apache.org/jira/browse/HIVE-15770
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Sergey Shelukhin
>






[jira] [Commented] (HIVE-15764) Tez session doesn't get released TezSessionPool on query cancellation

2017-01-31 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847692#comment-15847692
 ] 

Siddharth Seth commented on HIVE-15764:
---

[~thejas] - HIVE-15731 was similar? Is this being seen after that as well?

> Tez session doesn't get released TezSessionPool on query cancellation
> -
>
> Key: HIVE-15764
> URL: https://issues.apache.org/jira/browse/HIVE-15764
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.1.1
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-15764.1.patch
>
>
> With HiveServer2, tez execution engine, and 
> hive.server2.tez.initialize.default.sessions=true, if a query is cancelled 
> via jdbc, the tez session doesn't get released back to the session pool.
> This can cause the query processing of new queries to get stuck, waiting for 
> a tez session to be available.





[jira] [Commented] (HIVE-15160) Can't order by an unselected column

2017-01-31 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847675#comment-15847675
 ] 

Pengcheng Xiong commented on HIVE-15160:


[~vgarg], the RB is now available.

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch, HIVE-15160.02.patch
>
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>        ,i_category
>        ,i_class
>        ,i_current_price
>        ,sum(cs_ext_sales_price) as itemrevenue
>        ,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>            (partition by i_class) as revenueratio
>  from catalog_sales
>      ,item
>      ,date_dim
>  where cs_item_sk = i_item_sk
>    and i_category in ('Jewelry', 'Sports', 'Books')
>    and cs_sold_date_sk = d_date_sk
>    and d_date between cast('2001-01-12' as date)
>                   and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>          ,i_item_desc
>          ,i_category
>          ,i_class
>          ,i_current_price
>  order by i_category
>          ,i_class
>          ,i_item_id
>          ,i_item_desc
>          ,revenueratio
> limit 100;
> {code}





[jira] [Commented] (HIVE-15160) Can't order by an unselected column

2017-01-31 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847671#comment-15847671
 ] 

Pengcheng Xiong commented on HIVE-15160:


I have updated all the golden files. It seems that it also corrected one error 
in cbo_limit.q

> Can't order by an unselected column
> ---
>
> Key: HIVE-15160
> URL: https://issues.apache.org/jira/browse/HIVE-15160
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15160.01.patch, HIVE-15160.02.patch
>
>





[jira] [Commented] (HIVE-15700) BytesColumnVector can get stuck trying to resize byte buffer

2017-01-31 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847641#comment-15847641
 ] 

Jason Dere commented on HIVE-15700:
---

failures not related

> BytesColumnVector can get stuck trying to resize byte buffer
> 
>
> Key: HIVE-15700
> URL: https://issues.apache.org/jira/browse/HIVE-15700
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-15700.1.patch, HIVE-15700.2.patch, 
> HIVE-15700.3.patch, HIVE-15700.4.patch
>
>
> While looking at HIVE-15698, hit an issue where one of the reducers was stuck 
> in the following stack trace:
> {noformat}
> Thread 12735: (state = IN_JAVA)
>  - org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.increaseBufferSpace(int) @bci=22, line=245 (Compiled frame; information may be imprecise)
>  - org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(int, byte[], int, int) @bci=18, line=150 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch, int, int, boolean) @bci=536, line=442 (Compiled frame)
>  - org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch, int) @bci=110, line=761 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(org.apache.hadoop.io.BytesWritable, java.lang.Iterable, byte) @bci=184, line=444 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector() @bci=119, line=388 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord() @bci=8, line=239 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run() @bci=124, line=319 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(java.util.Map, java.util.Map) @bci=30, line=185 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(java.util.Map, java.util.Map) @bci=159, line=168 (Interpreted frame)
>  - org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run() @bci=65, line=370 (Interpreted frame)
>  - org.apache.tez.runtime.task.TaskRunner2Callable$1.run() @bci=133, line=73 (Interpreted frame)
>  - org.apache.tez.runtime.task.TaskRunner2Callable$1.run() @bci=1, line=61 (Interpreted frame)
>  - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Compiled frame)
>  - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=422 (Interpreted frame)
>  - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1724 (Interpreted frame)
>  - org.apache.tez.runtime.task.TaskRunner2Callable.callInternal() @bci=38, line=61 (Interpreted frame)
>  - org.apache.tez.runtime.task.TaskRunner2Callable.callInternal() @bci=1, line=37 (Interpreted frame)
>  - org.apache.tez.common.CallableWithNdc.call() @bci=8, line=36 (Interpreted frame)
>  - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Interpreted frame)
>  - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Interpreted frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
>  - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
> {noformat}
> The reducer's input was 167 binary values of 9MB each, coming from the 
> previous map job. Per [~gopalv], the BytesColumnVector is stuck trying to 
> reallocate/copy all of these values into the same memory buffer.
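
The description does not say why the resize fails to finish. As a rough illustration of the scale involved, and of one way a doubling strategy can stop making progress at that scale, here is a small sketch. It is an assumption for illustration, not a diagnosis of BytesColumnVector: the class, the method names, and the starting capacity of 16384 bytes are all hypothetical. The arithmetic itself is standard Java: 167 values of 9MB come to about 1.5GB, and an int capacity that is doubled past 2^30 wraps around instead of reaching 2^31.

```java
// Hypothetical illustration only; this is not the BytesColumnVector code.
final class BufferGrowthSketch {

    static final long MB = 1024L * 1024L;

    // Total bytes needed to hold `count` values of `sizeMb` megabytes each.
    static long totalBytes(int count, int sizeMb) {
        return count * sizeMb * MB;
    }

    // Doubles an int capacity until it covers `required`, giving up after
    // `maxSteps` doublings. Returns -1 if the target was never reached.
    static int growByDoubling(int capacity, long required, int maxSteps) {
        int newCapacity = capacity;
        for (int step = 0; step < maxSteps; step++) {
            if (newCapacity >= required) {
                return newCapacity;
            }
            newCapacity *= 2; // int arithmetic: wraps around past 2^31 - 1
        }
        return -1;
    }

    public static void main(String[] args) {
        long required = totalBytes(167, 9);
        System.out.println("required bytes: " + required);
        System.out.println("doubling result: " + growByDoubling(16384, required, 64));
    }
}
```

In this sketch the capacity goes from 2^30 to a negative value and then to 0, where further doubling changes nothing, so an unbounded version of the loop would never terminate. Computing the new size in long and checking it against the maximum array length is one way to avoid that.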





[jira] (HIVE-15700) BytesColumnVector can get stuck trying to resize byte buffer

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847633#comment-15847633
 ] 

Hive QA commented on HIVE-15700:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850309/HIVE-15700.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11016 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple] (batchId=147)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3288/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3288/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3288/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850309 - PreCommit-HIVE-Build

> BytesColumnVector can get stuck trying to resize byte buffer
> 
>
> Key: HIVE-15700
> URL: https://issues.apache.org/jira/browse/HIVE-15700
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-15700.1.patch, HIVE-15700.2.patch, 
> HIVE-15700.3.patch, HIVE-15700.4.patch
>
>

[jira] (HIVE-15653) Some ALTER TABLE commands drop table stats

2017-01-31 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847600#comment-15847600
 ] 

Chaoyu Tang commented on HIVE-15653:


The STATS_GENERATED_VIA_STATS_TASK constant used in your workaround was removed 
in HIVE-12730, so you may run into issues when you upgrade to Hive 2.1.

> Some ALTER TABLE commands drop table stats
> --
>
> Key: HIVE-15653
> URL: https://issues.apache.org/jira/browse/HIVE-15653
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Alexander Behm
>Assignee: Chaoyu Tang
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15653.1.patch, HIVE-15653.2.patch, 
> HIVE-15653.3.patch, HIVE-15653.4.patch, HIVE-15653.5.patch, 
> HIVE-15653.6.patch, HIVE-15653.patch
>
>
> Some ALTER TABLE commands drop the table stats. That may make sense for some 
> ALTER TABLE operations, but certainly not for others. Personally, I think 
> ALTER TABLE should only change what was requested by the user, without any 
> side effects that may be unclear to users. In particular, collecting stats 
> can be an expensive operation, so it's rather inconvenient for users if they 
> get wiped accidentally.
> Repro:
> {code}
> create table t (i int);
> insert into t values(1);
> analyze table t compute statistics;
> alter table t set tblproperties('test'='test');
> hive> describe formatted t;
> OK
> # col_name              data_type               comment
>
> i                       int
>
> # Detailed Table Information
> Database:               default
> Owner:                  abehm
> CreateTime:             Tue Jan 17 18:13:34 PST 2017
> LastAccessTime:         UNKNOWN
> Protect Mode:           None
> Retention:              0
> Location:               hdfs://localhost:20500/test-warehouse/t
> Table Type:             MANAGED_TABLE
> Table Parameters:
>   COLUMN_STATS_ACCURATE   false
>   last_modified_by        abehm
>   last_modified_time      1484705748
>   numFiles                1
>   numRows                 -1
>   rawDataSize             -1
>   test                    test
>   totalSize               2
>   transient_lastDdlTime   1484705748
>
> # Storage Information
> SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:            org.apache.hadoop.mapred.TextInputFormat
> OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed:             No
> Num Buckets:            -1
> Bucket Columns:         []
> Sort Columns:           []
> Storage Desc Params:
>   serialization.format    1
> Time taken: 0.169 seconds, Fetched: 34 row(s)
> {code}
> The same behavior can be observed with several other ALTER TABLE commands.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] (HIVE-15755) Beeline throws NPE on invalid tablename

2017-01-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-15755:
--
Component/s: Transactions

> Beeline throws NPE on invalid tablename
> ---
>
> Key: HIVE-15755
> URL: https://issues.apache.org/jira/browse/HIVE-15755
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Reporter: Kavan Suresh
>Assignee: Eugene Koifman
>Priority: Critical
>
> Ran into this error message - "Error while compiling statement: FAILED: 
> NullPointerException null " when I specified an incorrect tablename in the 
> merge statement.
>  
> {code:java}
> > create table src (col1 int,col2 int);
> No rows affected (0.231 seconds)
> > create table trgt (tcol1 int,tcol2 int);
> No rows affected (0.182 seconds)
> > insert into src values (1,232);
> {code}
> {code:java}
> > merge into trgt using (select * from src) sub on sub.col1 = 
> > *invalidtablename.tcol1* when not matched then insert values 
> > (sub.col1,sub.col2);
> Error: Error while compiling statement: FAILED: NullPointerException null 
> (state=42000,code=4)
> > merge into trgt using (select * from src) sub on sub.col1 = *trgt.tcol1* 
> > when not matched then insert values (sub.col1,sub.col2);
> INFO  : Session is already open
> INFO  : Dag name: merge into trgt using ...(sub.col1,sub.col2)(Stage-1)
> INFO  : Setting tez.task.scale.memory.reserve-fraction to 0.3001192092896
> INFO  : 
> INFO  : Status: Running (Executing on YARN cluster with App id 
> application_1485398058799_0129)
> INFO  : Map 1: 0/1Map 2: -/-  
> INFO  : Map 1: 0(+1)/1Map 2: -/-  
> INFO  : Map 1: 0(+1)/1Map 2: -/-  
> INFO  : Map 1: 1/1Map 2: -/-  
> INFO  : Loading data to table tpch.trgt from 
> hdfs://tesths2-merge-ks-5.openstacklocal:8020/apps/hive/warehouse/tpch.db/trgt/.hive-staging_hive_2017-01-30_06-54-50_743_6276941178188398287-1/-ext-1
> INFO  : Table tpch.trgt stats: [numFiles=1, numRows=1, totalSize=4, 
> rawDataSize=3]
> No rows affected (7.709 seconds)
> {code}
> Hiveserver2 logs:
> {code:java}
> 2017-01-30 19:34:09,972 INFO  [HiveServer2-Handler-Pool: Thread-70]: 
> parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: merge into 
> trgt using (select * from src) sub on sub.col1 = target.tcol1 when not 
> matched then insert values (sub.col1,sub.col2)
> 2017-01-30 19:34:09,975 INFO  [HiveServer2-Handler-Pool: Thread-70]: 
> parse.ParseDriver (ParseDriver.java:parse(209)) - Parse Completed
> 2017-01-30 19:34:09,976 INFO  [HiveServer2-Handler-Pool: Thread-70]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(177)) -  start=1485804849971 end=1485804849976 duration=5 
> from=org.apache.hadoop.hive.ql.Driver>
> 2017-01-30 19:34:09,976 INFO  [HiveServer2-Handler-Pool: Thread-70]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(149)) -  method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
> 2017-01-30 19:34:09,977 INFO  [HiveServer2-Handler-Pool: Thread-70]: 
> metastore.HiveMetaStore (HiveMetaStore.java:logInfo(824)) - 13: get_table : 
> db=tpch tbl=trgt
> 2017-01-30 19:34:09,977 INFO  [HiveServer2-Handler-Pool: Thread-70]: 
> HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(393)) - ugi=hive 
> ip=unknown-ip-addr  cmd=get_table : db=tpch tbl=trgt
> 2017-01-30 19:34:10,031 ERROR [HiveServer2-Handler-Pool: Thread-70]: 
> ql.Driver (SessionState.java:printError(980)) - FAILED: NullPointerException 
> null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.UpdateDeleteSemanticAnalyzer$OnClauseAnalyzer.getPredicate(UpdateDeleteSemanticAnalyzer.java:1143)
> at 
> org.apache.hadoop.hive.ql.parse.UpdateDeleteSemanticAnalyzer$OnClauseAnalyzer.access$400(UpdateDeleteSemanticAnalyzer.java:1049)
> at 
> org.apache.hadoop.hive.ql.parse.UpdateDeleteSemanticAnalyzer.handleInsert(UpdateDeleteSemanticAnalyzer.java:1025)
> at 
> org.apache.hadoop.hive.ql.parse.UpdateDeleteSemanticAnalyzer.analyzeMerge(UpdateDeleteSemanticAnalyzer.java:660)
> at 
> org.apache.hadoop.hive.ql.parse.UpdateDeleteSemanticAnalyzer.analyzeInternal(UpdateDeleteSemanticAnalyzer.java:80)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:230)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1221)
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1215)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:226)
> at 
> 

[jira] (HIVE-15755) Beeline throws NPE on invalid tablename

2017-01-31 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-15755:
-

Assignee: Eugene Koifman


[jira] (HIVE-15723) Hive should report a warning about missing table/column statistics to user.

2017-01-31 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reopened HIVE-15723:
-

> Hive should report a warning about missing table/column statistics to user.
> ---
>
> Key: HIVE-15723
> URL: https://issues.apache.org/jira/browse/HIVE-15723
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15723.01.patch, HIVE-15723.02.patch, 
> HIVE-15723.03.patch, HIVE-15723.04.patch
>
>
> Many Hive performance issues are due to missing statistics. Table statistics, 
> column statistics, or both may be missing. For example, a new partition may 
> have been added and the customer forgot to gather stats for that partition.
> A simple warning about a table or column missing statistics can be very 
> helpful and makes Hive more user-friendly. Hive already has this information; 
> it's a matter of printing it out.
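
As a rough illustration of the kind of check the description asks for, the sketch below emits one warning line per table or column that lacks statistics. It is not Hive's implementation: the class and method names are hypothetical, and treating an absent or negative numRows as "no basic statistics" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hive.
public final class MissingStatsWarningSketch {

    /**
     * Returns one warning per missing statistic.
     * Assumption: a table lacks basic stats when numRows is absent,
     * unparsable, or negative; a column lacks stats when it is not
     * listed in columnsWithStats.
     */
    public static List<String> missingStatsWarnings(String table,
                                                    Map<String, String> tableParams,
                                                    List<String> columns,
                                                    List<String> columnsWithStats) {
        List<String> warnings = new ArrayList<>();
        if (parseRowCount(tableParams.get("numRows")) < 0) {
            warnings.add("WARNING: table " + table + " has no basic statistics");
        }
        for (String column : columns) {
            if (!columnsWithStats.contains(column)) {
                warnings.add("WARNING: column " + table + "." + column
                        + " has no statistics");
            }
        }
        return warnings;
    }

    // Maps a missing or malformed row count to -1 so the caller has one case to test.
    private static long parseRowCount(String raw) {
        if (raw == null) {
            return -1;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("numRows", "-1");
        List<String> warnings = missingStatsWarnings(
                "t", params, Arrays.asList("i"), Collections.<String>emptyList());
        for (String warning : warnings) {
            System.out.println(warning);
        }
    }
}
```

Returning the warnings as a list, rather than printing them directly, leaves it to the caller to decide whether to show them, which is the question raised by a switch such as hive.cbo.show.warnings.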





[jira] (HIVE-15723) Hive should report a warning about missing table/column statistics to user.

2017-01-31 Thread Nita Dembla (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847570#comment-15847570
 ] 

Nita Dembla commented on HIVE-15723:


hive.cbo.show.warnings should be set to 'true' by default, since users may not 
know that statistics are missing and may not even look for this setting. I 
couldn't reopen the bug. 



[jira] (HIVE-15653) Some ALTER TABLE commands drop table stats

2017-01-31 Thread Alexander Behm (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847540#comment-15847540
 ] 

Alexander Behm commented on HIVE-15653:
---

Thanks, [~ctang.ma]. I've already added a workaround to Impala, but not using 
StatsSetupConst.DO_NOT_UPDATE_STATS because sometimes we do want to update 
stats (e.g. compute stats).
See: http://gerrit.cloudera.org:8080/5731



[jira] (HIVE-15653) Some ALTER TABLE commands drop table stats

2017-01-31 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847536#comment-15847536
 ] 

Chaoyu Tang commented on HIVE-15653:


Yes, unfortunately, [~alex.behm]. But if you pass 
StatsSetupConst.DO_NOT_UPDATE_STATS set to TRUE as a table parameter, I think 
alter_table should not drop the stats either.



[jira] (HIVE-15717) JDBC: Implement rowDeleted, rowInserted and rowUpdated to return false

2017-01-31 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15717:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Test failures are accounted for under the umbrella jira for test failures.
Committed patch to master.


> JDBC: Implement rowDeleted, rowInserted and rowUpdated to return false
> --
>
> Key: HIVE-15717
> URL: https://issues.apache.org/jira/browse/HIVE-15717
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 2.2.0
>
> Attachments: HIVE-15717.1.patch, HIVE-15717.2.patch, Screen Shot 
> 2017-01-24 at 3.11.09 PM.png, Screen Shot 2017-01-24 at 3.15.14 PM.png
>
>
> In a performance profile of Beeline with the Hive JDBC driver, a lot of time 
> is spent in the "org.apache.hive.beeline.Rows.Row" constructor, due to 
> exception handling.
> The exception handling comes from 3 method calls (rowDeleted, rowInserted and 
> rowUpdated). The implementation of these methods in the 
> org.apache.hive.jdbc.HiveBaseResultSet class just throws 
> SQLException("Method not supported"), i.e. there are no real implementations.
> Implementing these methods to return false instead of throwing an exception 
> will help improve performance.
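
To make the cost difference concrete, here is a minimal, self-contained sketch that contrasts a method that always throws SQLException with one that simply returns false. The class and method names are hypothetical stand-ins, not the HiveBaseResultSet or Beeline code, and the assumption that the caller catches the exception and falls back to false is mine, not something the issue states.

```java
import java.sql.SQLException;

// Hypothetical stand-ins for the two behaviours discussed in the issue.
public final class RowFlagSketch {

    /** Old behaviour as described: the method has no real implementation and throws. */
    public static boolean rowDeletedThrowing() throws SQLException {
        throw new SQLException("Method not supported");
    }

    /** Proposed behaviour: report "not deleted" without throwing. */
    public static boolean rowDeletedReturningFalse() {
        return false;
    }

    /**
     * Assumed caller pattern: swallow the exception and treat the row as
     * not deleted. Every call pays for creating an exception and its stack trace.
     */
    public static boolean callerWithFallback() {
        try {
            return rowDeletedThrowing();
        } catch (SQLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int rows = 100_000;
        int deleted = 0; // consumed below so the loops are not optimized away

        long start = System.nanoTime();
        for (int i = 0; i < rows; i++) {
            if (callerWithFallback()) {
                deleted++;
            }
        }
        long throwingNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < rows; i++) {
            if (rowDeletedReturningFalse()) {
                deleted++;
            }
        }
        long returningNanos = System.nanoTime() - start;

        System.out.println("throw and catch: " + throwingNanos / 1_000_000 + " ms");
        System.out.println("return false:    " + returningNanos / 1_000_000 + " ms");
        System.out.println("rows flagged deleted: " + deleted);
    }
}
```

Both paths give the caller the same answer. Timings from a loop like this are only indicative; a profiler run, as in the issue, is the better evidence.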





[jira] (HIVE-15717) Class "org.apache.hive.beeline.Rows.Row" constructor is CPU consuming due to exception handling

2017-01-31 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15717:
-
Description: 
In a performance profile of Beeline with the Hive JDBC driver, a lot of time is 
spent in the "org.apache.hive.beeline.Rows.Row" constructor, due to exception 
handling.
The exception handling comes from 3 method calls (rowDeleted, rowInserted and 
rowUpdated). The implementation of these methods in the 
org.apache.hive.jdbc.HiveBaseResultSet class just throws 
SQLException("Method not supported"), i.e. there are no real implementations.

Implementing these methods to return false instead of throwing an exception 
will help improve performance.

  was:The exception handling from the 3 methods calls (rowDeleted, rowInserted 
and rowUpdated). The implementation of these methods in 
org.apache.hive.jdbc.HiveBaseResultSet class is just throwing 
SQLException("Method not supported"), i.e. no real implementations.


> Class "org.apache.hive.beeline.Rows.Row" constructor is CPU consuming due to 
> exception handling
> ---
>
> Key: HIVE-15717
> URL: https://issues.apache.org/jira/browse/HIVE-15717
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-15717.1.patch, HIVE-15717.2.patch, Screen Shot 
> 2017-01-24 at 3.11.09 PM.png, Screen Shot 2017-01-24 at 3.15.14 PM.png
>
>


[jira] (HIVE-15717) JDBC: Implement rowDeleted, rowInserted and rowUpdated to return false

2017-01-31 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15717:
-
Summary: JDBC: Implement rowDeleted, rowInserted and rowUpdated to return 
false  (was: Class "org.apache.hive.beeline.Rows.Row" constructor is CPU 
consuming due to exception handling)

> JDBC: Implement rowDeleted, rowInserted and rowUpdated to return false
> --
>
> Key: HIVE-15717
> URL: https://issues.apache.org/jira/browse/HIVE-15717
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-15717.1.patch, HIVE-15717.2.patch, Screen Shot 
> 2017-01-24 at 3.11.09 PM.png, Screen Shot 2017-01-24 at 3.15.14 PM.png
>
>


[jira] (HIVE-15700) BytesColumnVector can get stuck trying to resize byte buffer

2017-01-31 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-15700:
--
Attachment: HIVE-15700.4.patch

Moving reset of bufferAllocationCount to initBuffer() per [~mmccline]'s 
suggestion

> BytesColumnVector can get stuck trying to resize byte buffer
> 
>
> Key: HIVE-15700
> URL: https://issues.apache.org/jira/browse/HIVE-15700
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-15700.1.patch, HIVE-15700.2.patch, 
> HIVE-15700.3.patch, HIVE-15700.4.patch
>
>
> While looking at HIVE-15698, hit an issue where one of the reducers was stuck 
> in the following stack trace:
> {noformat}
> Thread 12735: (state = IN_JAVA)
>  - 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.increaseBufferSpace(int)
>  @bci=22, line=245 (Compiled frame; information may be imprecise)
>  - org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(int, 
> byte[], int, int) @bci=18, line=150 (Interpreted frame)
>  - 
> org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch,
>  int, int, boolean) @bci=536, line=442 (Compiled frame)
>  - 
> org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch,
>  int) @bci=110, line=761 (Interpreted frame)
>  - 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(org.apache.hadoop.io.BytesWritable,
>  java.lang.Iterable, byte) @bci=184, line=444 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector() 
> @bci=119, line=388 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord() @bci=8, 
> line=239 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run() @bci=124, 
> line=319 (Interpreted frame)
>  - 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(java.util.Map,
>  java.util.Map) @bci=30, line=185 (Interpreted frame)
>  - org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(java.util.Map, 
> java.util.Map) @bci=159, line=168 (Interpreted frame)
>  - org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run() @bci=65, 
> line=370 (Interpreted frame)
>  - org.apache.tez.runtime.task.TaskRunner2Callable$1.run() @bci=133, line=73 
> (Interpreted frame)
>  - org.apache.tez.runtime.task.TaskRunner2Callable$1.run() @bci=1, line=61 
> (Interpreted frame)
>  - 
> java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction,
>  java.security.AccessControlContext) @bci=0 (Compiled frame)
>  - javax.security.auth.Subject.doAs(javax.security.auth.Subject, 
> java.security.PrivilegedExceptionAction) @bci=42, line=422 (Interpreted frame)
>  - 
> org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction)
>  @bci=14, line=1724 (Interpreted frame)
>  - org.apache.tez.runtime.task.TaskRunner2Callable.callInternal() @bci=38, 
> line=61 (Interpreted frame)
>  - org.apache.tez.runtime.task.TaskRunner2Callable.callInternal() @bci=1, 
> line=37 (Interpreted frame)
>  - org.apache.tez.common.CallableWithNdc.call() @bci=8, line=36 (Interpreted 
> frame)
>  - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Interpreted frame)
>  - 
> java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
>  @bci=95, line=1142 (Interpreted frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 
> (Interpreted frame)
>  - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
> {noformat}
> The reducer's input was 167 9MB binary values coming from the previous map 
> job. Per [~gopalv] the BytesColumnVector is stuck trying to reallocate/copy 
> all of these values into the same memory buffer.
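
The numbers in the description are worth working through: 167 values of 9 MB each come to roughly 1.5 GB, which is close to the largest array a JVM can allocate. The sketch below is not the BytesColumnVector code; it only illustrates, under my own assumptions, one way a doubling loop over an int capacity can fail to terminate at this size, next to a variant that uses long arithmetic and a cap. The names, the starting capacity of 1 GiB, and the cap are all hypothetical, and the issue itself does not say this is the mechanism involved.

```java
// Hypothetical sketch of buffer-capacity growth; not Hive's implementation.
public final class BufferGrowthSketch {

    // Assumed upper bound for a single byte[] in this sketch.
    public static final int MAX_CAPACITY = Integer.MAX_VALUE - 8;

    /**
     * Doubles an int capacity until it covers 'required', giving up after
     * maxSteps so the demo always terminates. Once the doubling overflows,
     * capacity turns negative and then 0, so without the guard the loop
     * would never exit.
     */
    public static int naiveDoublingSteps(int capacity, long required, int maxSteps) {
        int steps = 0;
        while (required > capacity && steps < maxSteps) {
            capacity *= 2;
            steps++;
        }
        return steps;
    }

    /**
     * Doubles in long arithmetic and clamps to MAX_CAPACITY; rejects requests
     * that cannot fit in a single array.
     */
    public static int safeNextCapacity(int capacity, long required) {
        if (required > MAX_CAPACITY) {
            throw new IllegalArgumentException("required size exceeds one array: " + required);
        }
        long next = Math.max(capacity, 1);
        while (next < required) {
            next *= 2;
        }
        return (int) Math.min(next, MAX_CAPACITY);
    }

    public static void main(String[] args) {
        long required = 167L * 9 * 1024 * 1024; // 167 values of 9 MB each
        int start = 1 << 30;                    // assumed current capacity: 1 GiB

        System.out.println("bytes required:        " + required);
        System.out.println("naive steps (cap 100): " + naiveDoublingSteps(start, required, 100));
        System.out.println("safe next capacity:    " + safeNextCapacity(start, required));
    }
}
```

Even with a growth loop that terminates, copying every existing value into one ever-larger buffer stays expensive at this scale, which is the cost the description refers to.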





[jira] (HIVE-15653) Some ALTER TABLE commands drop table stats

2017-01-31 Thread Alexander Behm (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847514#comment-15847514
 ] 

Alexander Behm commented on HIVE-15653:
---

Thanks for the info, [~ctang.ma]! I agree the 
alter_table_with_environmentContext() seems useful. 

Is my understanding correct that even after your patch the HMS alter_table() 
will drop stats in some cases?



[jira] (HIVE-15653) Some ALTER TABLE commands drop table stats

2017-01-31 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847450#comment-15847450
 ] 

Chaoyu Tang commented on HIVE-15653:


[~alex.behm] The patch provided here mainly fixes some Hive ALTER TABLE 
DDLs so that they do not drop table stats accidentally. There was not much 
change in HMS. However, HIVE-12730 introduced some new HMS APIs (e.g. void 
alter_table_with_environmentContext) which allow the user to pass the flag 
StatsSetupConst.DO_NOT_UPDATE_STATS set to true to HMS, telling it not to 
update the stats. These new APIs could probably be used in Impala. BTW, 
StatsSetupConst.STATS_GENERATED_VIA_STATS_TASK was dropped in HIVE-12730.

> Some ALTER TABLE commands drop table stats
> --
>
> Key: HIVE-15653
> URL: https://issues.apache.org/jira/browse/HIVE-15653
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Alexander Behm
>Assignee: Chaoyu Tang
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15653.1.patch, HIVE-15653.2.patch, 
> HIVE-15653.3.patch, HIVE-15653.4.patch, HIVE-15653.5.patch, 
> HIVE-15653.6.patch, HIVE-15653.patch
>
>
> Some ALTER TABLE commands drop the table stats. That may make sense for some 
> ALTER TABLE operations, but certainly not for others. Personally, I think 
> ALTER TABLE should only change what was requested by the user without any 
> side effects that may be unclear to users. In particular, collecting stats 
> can be an expensive operation so it's rather inconvenient for users if they 
> get wiped accidentally.
> Repro:
> {code}
> create table t (i int);
> insert into t values(1);
> analyze table t compute statistics;
> alter table t set tblproperties('test'='test');
> hive> describe formatted t;
> OK
> # col_name              data_type               comment
>
> i                       int
>
> # Detailed Table Information
> Database:               default
> Owner:                  abehm
> CreateTime:             Tue Jan 17 18:13:34 PST 2017
> LastAccessTime:         UNKNOWN
> Protect Mode:           None
> Retention:              0
> Location:               hdfs://localhost:20500/test-warehouse/t
> Table Type:             MANAGED_TABLE
> Table Parameters:
>         COLUMN_STATS_ACCURATE   false
>         last_modified_by        abehm
>         last_modified_time      1484705748
>         numFiles                1
>         numRows                 -1
>         rawDataSize             -1
>         test                    test
>         totalSize               2
>         transient_lastDdlTime   1484705748
>
> # Storage Information
> SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:            org.apache.hadoop.mapred.TextInputFormat
> OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed:             No
> Num Buckets:            -1
> Bucket Columns:         []
> Sort Columns:           []
> Storage Desc Params:
>         serialization.format    1
> Time taken: 0.169 seconds, Fetched: 34 row(s)
> {code}
> The same behavior can be observed with several other ALTER TABLE commands.





[jira] (HIVE-14086) org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro schema file

2017-01-31 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847433#comment-15847433
 ] 

Sergio Peña commented on HIVE-14086:


[~lv] I think you can get the actual schema from the following method: 
{{get_schema(String db, String tableName)}}

The {{get_schema}} method will return the partition columns (read from 
COLUMNS_V2) and the regular columns (read from the table serialization 
library). The serialization library builds the columns from what Hive uses 
as the schema of the table. This schema should be based on the 
{{avro.schema.url}}.

> org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro 
> schema file
> 
>
> Key: HIVE-14086
> URL: https://issues.apache.org/jira/browse/HIVE-14086
> Project: Hive
>  Issue Type: Bug
>  Components: API
>Reporter: Lars Volker
> Attachments: avro.json, avroremoved.json, avro.sql
>
>
> Consider this table, using an external Avro schema file:
> {noformat}
> CREATE TABLE avro_table
>   PARTITIONED BY (str_part STRING)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>   TBLPROPERTIES (
>     'avro.schema.url'='hdfs://localhost:20500/tmp/avro.json'
>   );
> {noformat}
> This will populate the "COLUMNS_V2" metastore table with the correct column 
> information (as per HIVE-6308). The columns of this table can then be queried 
> via the Hive API, for example by calling {{.getSd().getCols()}} on a 
> {{org.apache.hadoop.hive.metastore.api.Table}} object.
> Changes to the avro.schema.url file - either changing where it points to or 
> changing its contents - will be reflected in the output of {{describe 
> formatted avro_table}} *but not* in the result of the {{.getSd().getCols()}} 
> API call. Instead it looks like Hive only reads the Avro schema file 
> internally, but does not expose the information therein via its API.
> Is there a way to obtain the effective Table information via Hive? Would it 
> make sense to fix table retrieval so calls to {{get_table}} return the 
> correct set of columns?





[jira] (HIVE-15743) vectorized text parsing: speed up double parse

2017-01-31 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847429#comment-15847429
 ] 

Gopal V commented on HIVE-15743:



> String creation is actually more than half of the cost of going from byte[] 
> to double (see picture).

Specifically, it is a TLAB alloc miss. The code which is there is already ~10x 
faster, but we can do better by giving up String & assuming utf8 bytes always. 

Also a MutableDouble::parse() would let the system return a (Success, Value) 
tuple, which should allow for the fall back re-execution pathway to kick in for 
any failures.

> if it's rare and easy to detect, is to handle the 99% cases fast, and fall 
> back to Double.parse(new String) in exotic/rare cases.

I ran through all the data examples I have from various cases. The largest 
number of digits in raw data was 18 digits (15,2), with the most common Decimal 
source pattern hovering around (9,2).

None of them would be hit by a 2 ULP error, but we can always fall back to 
the original parser for inputs with more than 18 digits.
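
A minimal sketch of the fast-path/fallback idea discussed above, assuming plain ASCII input with an optional sign and at most one decimal point. The class name, the {{parse}} and {{slowParse}} helpers and the 15-digit cutoff are illustrative assumptions, not the code in the attached patches. The cutoff is stricter than the 18 digits mentioned in the comment: with at most 15 digits the mantissa converts to a double exactly, so a single division by an exact power of ten should give the same result as the String-based parser. Anything else (exponents, whitespace, NaN, longer inputs) is handed to the original parser.

```java
import java.nio.charset.StandardCharsets;

final class FastDoubleParseSketch {
    // Assumed cutoff: with at most 15 digits the mantissa is below 2^53
    // and therefore converts to a double without rounding.
    private static final int MAX_FAST_DIGITS = 15;

    // Powers of ten that are exactly representable as doubles.
    private static final double[] POW10 = {
        1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7,
        1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15
    };

    /** Parses bytes[start, start + length) as a double without creating a String on the fast path. */
    static double parse(byte[] bytes, int start, int length) {
        int i = start;
        final int end = start + length;
        boolean negative = false;
        if (i < end && (bytes[i] == '-' || bytes[i] == '+')) {
            negative = bytes[i] == '-';
            i++;
        }
        long mantissa = 0;   // all digits read so far, decimal point ignored
        int digits = 0;      // number of digits read (leading zeros included)
        int scale = 0;       // number of digits after the decimal point
        boolean seenDot = false;
        for (; i < end; i++) {
            byte b = bytes[i];
            if (b >= '0' && b <= '9') {
                if (++digits > MAX_FAST_DIGITS) {
                    return slowParse(bytes, start, length);
                }
                mantissa = mantissa * 10 + (b - '0');
                if (seenDot) {
                    scale++;
                }
            } else if (b == '.' && !seenDot) {
                seenDot = true;
            } else {
                // Exponent, whitespace, NaN/Infinity, ...: rare, let the original parser decide.
                return slowParse(bytes, start, length);
            }
        }
        if (digits == 0) {
            // Empty or sign-only input: the original parser reports the error.
            return slowParse(bytes, start, length);
        }
        double value = mantissa / POW10[scale];
        return negative ? -value : value;
    }

    /** The original String-based path, kept as the fallback. */
    private static double slowParse(byte[] bytes, int start, int length) {
        return Double.parseDouble(new String(bytes, start, length, StandardCharsets.UTF_8));
    }
}
```

Raising the cutoff to 18 digits would keep more inputs on the fast path at the price of the small ULP error mentioned above, since an 18-digit mantissa no longer fits a double exactly.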

> vectorized text parsing: speed up double parse
> --
>
> Key: HIVE-15743
> URL: https://issues.apache.org/jira/browse/HIVE-15743
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-15743.1.patch, HIVE-15743.2.patch, tpch-without.png
>
>
> {noformat}
> Double.parseDouble(
>     new String(bytes, fieldStart, fieldLength, StandardCharsets.UTF_8));
> {noformat}
> This takes ~25% of the query time in some cases.





[jira] (HIVE-15709) Vectorization: Fix performance issue with using LazyBinaryUtils.writeVInt and locking / thread local storage

2017-01-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15709:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Vectorization: Fix performance issue with using LazyBinaryUtils.writeVInt and 
> locking / thread local storage
> 
>
> Key: HIVE-15709
> URL: https://issues.apache.org/jira/browse/HIVE-15709
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
> Fix For: 2.2.0
>
> Attachments: HIVE-15709.01.patch, HIVE-15709.02.patch
>
>
> Showed up in performance analysis.  Easy solution: allocate temp VInt and use 
> it each time.
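
One way to read the "allocate temp VInt and use it each time" suggestion is sketched below: the writer owns a small scratch buffer that is allocated once and reused for every value, so the hot path needs neither locking nor a thread-local lookup. The class and method names are hypothetical and this is not the committed patch; the encoding follows the zero-compressed variable-length format used by Hadoop's {{WritableUtils.writeVLong}}, which is an assumption about what the writer must produce.

```java
final class VIntWriterSketch {
    // Allocated once per writer instance: 1 length byte plus up to 8 value bytes.
    private final byte[] scratch = new byte[9];

    /** Encodes v into the reusable scratch buffer and returns the number of bytes used. */
    int encode(long v) {
        if (v >= -112 && v <= 127) {
            scratch[0] = (byte) v;           // small values fit in a single byte
            return 1;
        }
        int len = -112;
        if (v < 0) {
            v ^= -1L;                        // one's complement, now non-negative
            len = -120;
        }
        long tmp = v;
        while (tmp != 0) {                   // count the significant bytes
            tmp >>= 8;
            len--;
        }
        scratch[0] = (byte) len;             // first byte carries sign and length
        int n = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = n; idx != 0; idx--) { // value bytes, most significant first
            int shift = (idx - 1) * 8;
            scratch[n - idx + 1] = (byte) ((v >> shift) & 0xFF);
        }
        return n + 1;
    }

    /** The reusable buffer; only the first encode(...) bytes are valid. */
    byte[] scratch() {
        return scratch;
    }
}
```

A caller would copy the first {{encode(v)}} bytes of {{scratch()}} into its output stream. The instance is not thread-safe, which is the point of the sketch: each writer keeps its own buffer.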





[jira] (HIVE-15709) Vectorization: Fix performance issue with using LazyBinaryUtils.writeVInt and locking / thread local storage

2017-01-31 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847392#comment-15847392
 ] 

Matt McCline commented on HIVE-15709:
-

Committed to master.  Thanks Gopal!

> Vectorization: Fix performance issue with using LazyBinaryUtils.writeVInt and 
> locking / thread local storage
> 
>
> Key: HIVE-15709
> URL: https://issues.apache.org/jira/browse/HIVE-15709
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
> Fix For: 2.2.0
>
> Attachments: HIVE-15709.01.patch, HIVE-15709.02.patch
>
>
> Showed up in performance analysis.  Easy solution: allocate temp VInt and use 
> it each time.





[jira] (HIVE-15709) Vectorization: Fix performance issue with using LazyBinaryUtils.writeVInt and locking / thread local storage

2017-01-31 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15709:

Fix Version/s: 2.2.0

> Vectorization: Fix performance issue with using LazyBinaryUtils.writeVInt and 
> locking / thread local storage
> 
>
> Key: HIVE-15709
> URL: https://issues.apache.org/jira/browse/HIVE-15709
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
> Fix For: 2.2.0
>
> Attachments: HIVE-15709.01.patch, HIVE-15709.02.patch
>
>
> Showed up in performance analysis.  Easy solution: allocate temp VInt and use 
> it each time.





[jira] (HIVE-15768) Vectorized JSON UDF

2017-01-31 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847378#comment-15847378
 ] 

Teddy Choi commented on HIVE-15768:
---

HIVE-13562 is related and it solves some of this issue. However, UDF subclasses 
are still not vectorized under non-LLAP.

> Vectorized JSON UDF
> ---
>
> Key: HIVE-15768
> URL: https://issues.apache.org/jira/browse/HIVE-15768
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Minor
>
> A large enterprise uses JSON heavily at PB scale, but its Hive queries won't 
> be vectorized because of the JSON UDF. A vectorized JSON UDF will make them 
> faster.





[jira] (HIVE-15765) Support bracketed comments

2017-01-31 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847343#comment-15847343
 ] 

Hive QA commented on HIVE-15765:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850266/HIVE-15765.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 11015 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_11] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_1] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_2] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_3] 
(batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_4] 
(batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_5] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_6] 
(batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_7] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketcontext_8] 
(batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_on_varchar] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin9] 
(batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=29)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_11]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketmapjoin7]
 (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_mapjoin_15]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_char_simple]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketmapjoin7]
 (batchId=160)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin7] 
(batchId=108)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3]
 (batchId=107)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=109)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_15] 
(batchId=127)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3287/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3287/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3287/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850266 - PreCommit-HIVE-Build

> Support bracketed comments
> --
>
> Key: HIVE-15765
> URL: https://issues.apache.org/jira/browse/HIVE-15765
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-15765.1.patch, HIVE-15765.1.patch
>
>
> C-style comments are in the SQL spec and are supported by all major DBs. 
> They are useful for inline annotation of the SQL. We should have them too.
> Example:
> {noformat}
> select
> /*+ MAPJOIN(a) */ /* mapjoin hint */
> a /* column */
> from foo join bar;
> {noformat}





[jira] (HIVE-15743) vectorized text parsing: speed up double parse

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847333#comment-15847333
 ] 

Sergey Shelukhin edited comment on HIVE-15743 at 1/31/17 7:09 PM:
--

does there have to be a string, as well as toCharArray? The original code has 
bytes as input. String creation is actually more than half of the cost of going 
from byte[] to double (see picture).

Also, I am not sure how important the precision issue is; however, one thing to 
suggest, if it's rare and easy to detect, is to handle the 99% cases fast, and 
fall back to Double.parse(new String) in exotic/rare cases.



was (Author: sershe):
does there have to be a string, as well as toCharArray? The original code has 
bytes as input. String creation is actually more than half of the cost of going 
from byte[] to double (see picture).

> vectorized text parsing: speed up double parse
> --
>
> Key: HIVE-15743
> URL: https://issues.apache.org/jira/browse/HIVE-15743
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-15743.1.patch, HIVE-15743.2.patch, tpch-without.png
>
>
> {noformat}
> Double.parseDouble(
>     new String(bytes, fieldStart, fieldLength, StandardCharsets.UTF_8));
> {noformat}
> This takes ~25% of the query time in some cases.





[jira] (HIVE-13816) Infer constants directly when we create semijoin

2017-01-31 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu reassigned HIVE-13816:
---

Assignee: Remus Rusanu  (was: Jesus Camacho Rodriguez)

> Infer constants directly when we create semijoin
> 
>
> Key: HIVE-13816
> URL: https://issues.apache.org/jira/browse/HIVE-13816
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Remus Rusanu
>
> Follow-up on HIVE-13068.
> When we create a left semijoin, we could infer the constants from the SEL 
> below when we create the GB to remove duplicates on the right hand side.
> Ex. ql/src/test/results/clientpositive/constprog_semijoin.q.out
> {noformat}
> explain select table1.id, table1.val, table1.val1 from table1 left semi join 
> table3 on table1.dimid = table3.id and table3.id = 100 where table1.dimid  = 
> 100;
> {noformat}
> Plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: table1
> Statistics: Num rows: 10 Data size: 200 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (((dimid = 100) = true) and (dimid = 100)) (type: 
> boolean)
>   Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), val (type: string), val1 (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: 100 (type: int), true (type: boolean)
>   sort order: ++
>   Map-reduce partition columns: 100 (type: int), true (type: 
> boolean)
>   Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
>   value expressions: _col0 (type: int), _col1 (type: string), 
> _col2 (type: string)
>   TableScan
> alias: table3
> Statistics: Num rows: 5 Data size: 15 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (((id = 100) = true) and (id = 100)) (type: boolean)
>   Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: 100 (type: int), true (type: boolean)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator
>   keys: _col0 (type: int), _col1 (type: boolean)
>   mode: hash
>   outputColumnNames: _col0, _col1
>   Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator
> key expressions: _col0 (type: int), _col1 (type: boolean)
> sort order: ++
> Map-reduce partition columns: _col0 (type: int), _col1 
> (type: boolean)
> Statistics: Num rows: 1 Data size: 3 Basic stats: 
> COMPLETE Column stats: NONE
>   Reduce Operator Tree:
> Join Operator
>   condition map:
>Left Semi Join 0 to 1
>   keys:
> 0 100 (type: int), true (type: boolean)
> 1 _col0 (type: int), _col1 (type: boolean)
>   outputColumnNames: _col0, _col1, _col2
>   Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column 
> stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE 
> Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> ListSink
> {noformat}





[jira] (HIVE-15717) Class "org.apache.hive.beeline.Rows.Row" constructor is CPU consuming due to exception handling

2017-01-31 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847334#comment-15847334
 ] 

Thejas M Nair commented on HIVE-15717:
--

+1

> Class "org.apache.hive.beeline.Rows.Row" constructor is CPU consuming due to 
> exception handling
> ---
>
> Key: HIVE-15717
> URL: https://issues.apache.org/jira/browse/HIVE-15717
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-15717.1.patch, HIVE-15717.2.patch, Screen Shot 
> 2017-01-24 at 3.11.09 PM.png, Screen Shot 2017-01-24 at 3.15.14 PM.png
>
>
> The CPU cost comes from the exception handling around 3 method calls 
> (rowDeleted, rowInserted and rowUpdated). The implementation of these 
> methods in the org.apache.hive.jdbc.HiveBaseResultSet class just throws 
> SQLException("Method not supported"), i.e. there is no real implementation.
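
To illustrate why this pattern is expensive, and one possible way around it, here is a small self-contained sketch. The names ({{StatusSource}}, {{deletedPerRow}}, {{CachedProbe}}) are hypothetical and do not come from Beeline or from the attached patches; the stub only mimics a result set whose status call always throws. The first helper pays for constructing and catching an exception on every row, the second remembers after the first failure that the call is unsupported and skips it afterwards.

```java
import java.sql.SQLException;

final class RowStatusSketch {
    /** Hypothetical stand-in for one of the three status calls named in the issue. */
    interface StatusSource {
        boolean rowDeleted() throws SQLException;
    }

    /** Mimics a driver that does not implement the call. */
    static final StatusSource UNSUPPORTED = () -> {
        throw new SQLException("Method not supported");
    };

    /** Per-row pattern: every row creates and catches an exception. */
    static boolean deletedPerRow(StatusSource rs) {
        try {
            return rs.rowDeleted();
        } catch (SQLException e) {
            return false;
        }
    }

    /** Probe-once pattern: after the first failure the call is no longer attempted. */
    static final class CachedProbe {
        private boolean supported = true;
        private int attempts = 0;

        boolean deleted(StatusSource rs) {
            if (!supported) {
                return false;
            }
            attempts++;
            try {
                return rs.rowDeleted();
            } catch (SQLException e) {
                supported = false;   // remember the failure, skip future calls
                return false;
            }
        }

        /** Number of times the underlying call was actually made. */
        int attempts() {
            return attempts;
        }
    }
}
```

With the probe-once variant, a result of any size triggers a single exception instead of one per row and per method.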





[jira] (HIVE-15743) vectorized text parsing: speed up double parse

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847333#comment-15847333
 ] 

Sergey Shelukhin commented on HIVE-15743:
-

does there have to be a string, as well as toCharArray? The original code has 
bytes as input. String creation is actually more than half of the cost of going 
from byte[] to double (see picture).

> vectorized text parsing: speed up double parse
> --
>
> Key: HIVE-15743
> URL: https://issues.apache.org/jira/browse/HIVE-15743
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-15743.1.patch, HIVE-15743.2.patch, tpch-without.png
>
>
> {noformat}
> Double.parseDouble(
>     new String(bytes, fieldStart, fieldLength, StandardCharsets.UTF_8));
> {noformat}
> This takes ~25% of the query time in some cases.





[jira] (HIVE-15764) Tez session doesn't get released TezSessionPool on query cancellation

2017-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847327#comment-15847327
 ] 

Sergey Shelukhin commented on HIVE-15764:
-

Dunno. Is there still a scenario where session from TezTask would not be 
released otherwise?

> Tez session doesn't get released TezSessionPool on query cancellation
> -
>
> Key: HIVE-15764
> URL: https://issues.apache.org/jira/browse/HIVE-15764
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.1.1
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-15764.1.patch
>
>
> With HiveServer2, tez execution engine, and 
> hive.server2.tez.initialize.default.sessions=true, if a query is cancelled 
> via jdbc, the tez session doesn't get released back to the session pool.
> This can cause the query processing of new queries to get stuck, waiting for 
> a tez session to be available.





[jira] (HIVE-13816) Infer constants directly when we create semijoin

2017-01-31 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847313#comment-15847313
 ] 

Ashutosh Chauhan commented on HIVE-13816:
-

[~rusanu] would you like to take this one?

> Infer constants directly when we create semijoin
> 
>
> Key: HIVE-13816
> URL: https://issues.apache.org/jira/browse/HIVE-13816
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13068.
> When we create a left semijoin, we could infer the constants from the SEL 
> below when we create the GB to remove duplicates on the right hand side.
> Ex. ql/src/test/results/clientpositive/constprog_semijoin.q.out
> {noformat}
> explain select table1.id, table1.val, table1.val1 from table1 left semi join 
> table3 on table1.dimid = table3.id and table3.id = 100 where table1.dimid  = 
> 100;
> {noformat}
> Plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: table1
> Statistics: Num rows: 10 Data size: 200 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (((dimid = 100) = true) and (dimid = 100)) (type: 
> boolean)
>   Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), val (type: string), val1 (type: 
> string)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: 100 (type: int), true (type: boolean)
>   sort order: ++
>   Map-reduce partition columns: 100 (type: int), true (type: 
> boolean)
>   Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE 
> Column stats: NONE
>   value expressions: _col0 (type: int), _col1 (type: string), 
> _col2 (type: string)
>   TableScan
> alias: table3
> Statistics: Num rows: 5 Data size: 15 Basic stats: COMPLETE 
> Column stats: NONE
> Filter Operator
>   predicate: (((id = 100) = true) and (id = 100)) (type: boolean)
>   Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: 100 (type: int), true (type: boolean)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
> Group By Operator
>   keys: _col0 (type: int), _col1 (type: boolean)
>   mode: hash
>   outputColumnNames: _col0, _col1
>   Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator
> key expressions: _col0 (type: int), _col1 (type: boolean)
> sort order: ++
> Map-reduce partition columns: _col0 (type: int), _col1 
> (type: boolean)
> Statistics: Num rows: 1 Data size: 3 Basic stats: 
> COMPLETE Column stats: NONE
>   Reduce Operator Tree:
> Join Operator
>   condition map:
>Left Semi Join 0 to 1
>   keys:
> 0 100 (type: int), true (type: boolean)
> 1 _col0 (type: int), _col1 (type: boolean)
>   outputColumnNames: _col0, _col1, _col2
>   Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column 
> stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE 
> Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> ListSink
> {noformat}





[jira] (HIVE-15764) Tez session doesn't get released TezSessionPool on query cancellation

2017-01-31 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847306#comment-15847306
 ] 

Thejas M Nair commented on HIVE-15764:
--

Verified that the changes in HIVE-14111 already fix this case in 2.1.1.
It has a new mechanism for reclaiming the Tez session. 

[~sershe] Do you think this change still makes sense?



> Tez session doesn't get released TezSessionPool on query cancellation
> -
>
> Key: HIVE-15764
> URL: https://issues.apache.org/jira/browse/HIVE-15764
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.2.1, 2.1.1
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-15764.1.patch
>
>
> With HiveServer2, tez execution engine, and 
> hive.server2.tez.initialize.default.sessions=true, if a query is cancelled 
> via jdbc, the tez session doesn't get released back to the session pool.
> This can cause the query processing of new queries to get stuck, waiting for 
> a tez session to be available.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

