[jira] [Commented] (HIVE-20647) HadoopVer was ignored in QTestUtil

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631391#comment-16631391
 ] 

Hive QA commented on HIVE-20647:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941567/HIVE-20647.1.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15006 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation2]
 (batchId=168)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14096/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14096/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14096/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941567 - PreCommit-HIVE-Build

> HadoopVer was ignored in QTestUtil
> --
>
> Key: HIVE-20647
> URL: https://issues.apache.org/jira/browse/HIVE-20647
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20647.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631363#comment-16631363
 ] 

ASF GitHub Bot commented on HIVE-20627:
---

Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/435


> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20627.01.patch
>
>
> When multiple async queries are executed from the same session, the async 
> query execution DAGs share the same Hive object, which is set by the caller 
> for all threads. When loading dynamic partitions, a MoveTask is created that 
> re-creates the Hive object and closes the shared one, which causes metastore 
> connection issues for the other async execution threads that still access 
> it. This is also seen when ReplDumpTask and ReplLoadTask are part of the DAG.
> *Call Stack:*
> {code:java}
> 2018-09-16T04:38:04,280 ERROR [load-dynamic-partitions-7]: metadata.Hive 
> (Hive.java:call(2436)) - Exception when loading partition with parameters 
> partPath=hdfs://mycluster/warehouse/tablespace/managed/hive/tbl_3bcvvdubni/.hive-staging_hive_2018-09-16_04-35-50_708_7776079613819042057-1147/-ext-1/age=55,
>  table=tbl_3bcvvdubni, partSpec={age=55}, loadFileType=KEEP_EXISTING, 
> listBucketingLevel=0, isAcid=true, hasFollowingStatsTask=true
> org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the 
> metastore
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:714)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableValidWriteIdListWithTxnList(AcidUtils.java:1791)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableSnapshot(AcidUtils.java:1756) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableSnapshot(AcidUtils.java:1714) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1976) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive$5.call(Hive.java:2415) 
> [hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive$5.call(Hive.java:2406) 
> [hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_171]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_171]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_171]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
> Caused by: org.apache.thrift.protocol.TProtocolException: Required field 
> 'validTxnList' is unset! 
> Struct:GetValidWriteIdsRequest(fullTableNames:[default.tbl_3bcvvdubni], 
> validTxnList:null)
> at 
> org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest.validate(GetValidWriteIdsRequest.java:396)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.validate(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.write(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:71) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_valid_write_ids(ThriftHiveMetastore.java:5443)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> 
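The failure mode quoted above is a shared-resource race: one execution thread closes a 
handle that sibling threads still use. Below is a minimal, self-contained Java sketch of 
the same pattern; SharedClient is a hypothetical stand-in for the shared Hive object, and 
none of these names are Hive APIs:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedHandleRace {

    // Stands in for the Hive object that the caller sets for all async threads.
    static final class SharedClient implements AutoCloseable {
        private volatile boolean open = true;

        void call() {
            // Mirrors the LockException above: the connection is already gone.
            if (!open) {
                throw new IllegalStateException("connection closed by another thread");
            }
        }

        @Override
        public void close() {
            open = false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SharedClient shared = new SharedClient();
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // Async query 1: keeps using the shared handle.
        pool.submit(() -> {
            try {
                for (int i = 0; i < 1_000_000; i++) {
                    shared.call();
                }
            } catch (IllegalStateException e) {
                System.err.println("query 1 failed: " + e.getMessage());
            }
        });

        // Async query 2: like the MoveTask described above, closes the
        // shared handle while re-creating its own.
        pool.submit(shared::close);

        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
{code}

The fix direction the report implies is ownership discipline: a task should only close a 
Hive object it owns, never one still shared with other running threads.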

[jira] [Commented] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-27 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631361#comment-16631361
 ] 

Sankar Hariappan commented on HIVE-20627:
-

01.patch committed to master.
Thanks [~daijy] for the review!

> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20627.01.patch
>
>
> (Issue description and call stack elided; identical to the first HIVE-20627 
> message above.)

[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-27 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20627:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20627.01.patch
>
>
> (Issue description and call stack elided; identical to the first HIVE-20627 
> message above.)

[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-27 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20627:

Affects Version/s: (was: 3.2.0)

> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20627.01.patch
>
>
> (Issue description and call stack elided; identical to the first HIVE-20627 
> message above.)

[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-27 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20627:

Target Version/s: 4.0.0  (was: 4.0.0, 3.2.0)

> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20627.01.patch
>
>
> (Issue description and call stack elided; identical to the first HIVE-20627 
> message above.)

[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-27 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20627:

Fix Version/s: 4.0.0

> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20627.01.patch
>
>
> (Issue description and call stack elided; identical to the first HIVE-20627 
> message above.)

[jira] [Commented] (HIVE-20647) HadoopVer was ignored in QTestUtil

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631357#comment-16631357
 ] 

Hive QA commented on HIVE-20647:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
44s{color} | {color:blue} itests/util in master has 52 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} itests/util: The patch generated 0 new + 56 
unchanged - 4 fixed = 56 total (was 60) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} itests/util generated 0 new + 50 unchanged - 2 fixed 
= 50 total (was 52) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} hive-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14096/dev-support/hive-personality.sh
 |
| git revision | master / 778c47c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14096/yetus/patch-asflicense-problems.txt
 |
| modules | C: itests/util itests/hive-unit U: itests |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14096/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> HadoopVer was ignored in QTestUtil
> --
>
> Key: HIVE-20647
> URL: https://issues.apache.org/jira/browse/HIVE-20647
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20647.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631341#comment-16631341
 ] 

Hive QA commented on HIVE-20627:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941527/HIVE-20627.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15006 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14095/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14095/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14095/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941527 - PreCommit-HIVE-Build

> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20627.01.patch
>
>
> (Issue description and call stack elided; identical to the first HIVE-20627 
> message above.)

[jira] [Commented] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631327#comment-16631327
 ] 

Hive QA commented on HIVE-20627:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
6s{color} | {color:blue} ql in master has 2322 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} service in master has 48 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14095/dev-support/hive-personality.sh
 |
| git revision | master / 778c47c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14095/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql service U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14095/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20627.01.patch
>
>
> (Issue description and call stack elided; identical to the first HIVE-20627 
> message above.)

[jira] [Comment Edited] (HIVE-20649) LLAP aware memory manager for Orc writers

2018-09-27 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631320#comment-16631320
 ] 

Prasanth Jayachandran edited comment on HIVE-20649 at 9/28/18 4:02 AM:
---

[~sershe] can you please review? small patch. This patch needs ORC-409 (also 
needs review :)) for it to work correctly (scaling the stripe size). 


was (Author: prasanth_j):
[~sershe] can you please review? small patch. This patch needs ORC-409 for it 
to work correctly (scaling the stripe size). 

> LLAP aware memory manager for Orc writers
> -
>
> Key: HIVE-20649
> URL: https://issues.apache.org/jira/browse/HIVE-20649
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20649.1.patch
>
>
> The ORC writer has its own memory manager that estimates memory usage and 
> available memory from the JVM heap (MemoryMXBean). This works in the Tez 
> container execution model, but not in LLAP, where container sizes (and Xmx) 
> are typically large and there are multiple executors per LLAP daemon. This 
> custom memory manager should be aware of the memory bounds per executor.
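
To make the sizing mismatch concrete, here is a back-of-the-envelope Java sketch; the 
pool fraction and executor count are illustrative assumptions, not values taken from 
the patch:

{code:java}
public class OrcMemoryBoundSketch {
    public static void main(String[] args) {
        long heapBytes = Runtime.getRuntime().maxMemory(); // all a heap-based manager sees
        double poolFraction = 0.5;   // assumed fraction of heap reserved for ORC writers
        int executorsPerDaemon = 12; // LLAP daemons typically run many executors

        long heapBasedBound = (long) (heapBytes * poolFraction);
        long perExecutorBound = heapBasedBound / executorsPerDaemon;

        // With a large LLAP Xmx, the heap-based bound vastly overstates what a
        // single executor's writers should buffer before flushing a stripe.
        System.out.printf("heap-based bound: %d MB, per-executor bound: %d MB%n",
                heapBasedBound >> 20, perExecutorBound >> 20);
    }
}
{code}

An LLAP-aware manager would budget each writer from the per-executor bound and, per the 
linked ORC-409, scale stripe sizes down to fit it.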



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20649) LLAP aware memory manager for Orc writers

2018-09-27 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631320#comment-16631320
 ] 

Prasanth Jayachandran commented on HIVE-20649:
--

[~sershe] can you please review? small patch. This patch needs ORC-409 for it 
to work correctly (scaling the stripe size). 

> LLAP aware memory manager for Orc writers
> -
>
> Key: HIVE-20649
> URL: https://issues.apache.org/jira/browse/HIVE-20649
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20649.1.patch
>
>
> (Issue description elided; identical to the first HIVE-20649 message above.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20649) LLAP aware memory manager for Orc writers

2018-09-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20649:
-
Status: Patch Available  (was: Open)

> LLAP aware memory manager for Orc writers
> -
>
> Key: HIVE-20649
> URL: https://issues.apache.org/jira/browse/HIVE-20649
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20649.1.patch
>
>
> (Issue description elided; identical to the first HIVE-20649 message above.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20649) LLAP aware memory manager for Orc writers

2018-09-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-20649:
-
Attachment: HIVE-20649.1.patch

> LLAP aware memory manager for Orc writers
> -
>
> Key: HIVE-20649
> URL: https://issues.apache.org/jira/browse/HIVE-20649
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20649.1.patch
>
>
> (Issue description elided; identical to the first HIVE-20649 message above.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20544) TOpenSessionReq logs password and username

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631304#comment-16631304
 ] 

Hive QA commented on HIVE-20544:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941504/HIVE-20544.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15006 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConcurrentLineage (batchId=255)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14094/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14094/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14094/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941504 - PreCommit-HIVE-Build

> TOpenSessionReq logs password and username
> --
>
> Key: HIVE-20544
> URL: https://issues.apache.org/jira/browse/HIVE-20544
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: beginner, patch, security
> Attachments: HIVE-20544.1.patch, HIVE-20544.2.patch, 
> HIVE-20544.3.patch, HIVE-20544.3.patch, HIVE-20544.4.patch, 
> HIVE-20544.4.patch, HIVE-20544.patch, non-solution.patch, 
> working-solution.patch
>
>
> In 
> service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq,
>  if client protocol is unset, validate() and toString() print both the 
> username and the password to the logs.
> Logging a password is a security risk. We should mask it with "***".
> =Edit= (no longer relevant, see comments)
> This issue is tricky since it is caused in a fully generated class. I've been 
> playing around and have found one working solution, but I'd truly appreciate 
> ideas for a more elegant solution or other input.
> The problem:
>  TCLIService.thrift is the template for generating all classes in 
> service-rpc. Struct TOpenSessionReq is OpenSession()'s one parameter and is 
> defined thus:
> {noformat}
> struct TOpenSessionReq {
>   1: required TProtocolVersion client_protocol = 
> TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10
>   2: optional string username
>   3: optional string password
> 4: optional map<string, string> configuration
> }
> {noformat}
> In the generated class TOpenSessionReq.java, client_protocol is checked by a 
> validate() method, which is called quite a few times; if client_protocol is 
> not set, it throws a TProtocolException, passing along a toString(). This 
> toString() gets the names and values of all fields, including username and 
> password.
> Working solution:
>  * Create a separate struct containing only the username and password, and 
> pass it to OpenSession() as a second parameter. Since all fields in the new 
> struct are "optional", the generated validate() is empty – toString() is 
> never used. This involves changing core classes and breaks the "Each function 
> should take exactly one parameter" coding convention (detailed at 
> service-rpc/if/TCLIService.thrift:27).
>  See working-solution.patch.
> What doesn't work:
>  * Making client_protocol optional instead of required. Apparently this will 
> break everything.
>  * Overriding toString() – TOpenSessionReq is a generated struct.
>  * Creating two Thrift structs, one struct for required (TRequiredReq) and 
> one for optional (TOptionalReq) fields, and nesting them in struct 
> TOpenSessionReq. This doesn't work because validate() in TOpenSessionReq can 
> call TOptionalReq.toString(), which prints the password to logs. This will 
> happen if TRequiredReq.client_protocol isn't set.
>  See non-solution.patch
>  * Asking Thrift devs to change their code. I wrote them an email but have no 
> expectations.
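
As a sketch of the working-solution idea only (the struct name and field ids below are 
assumptions, not the contents of working-solution.patch):

{noformat}
// All-optional credentials struct: its generated validate() is empty,
// so its toString() is never triggered by a validation failure.
struct TOpenSessionCredentials {
  1: optional string username
  2: optional string password
}

struct TOpenSessionReq {
  1: required TProtocolVersion client_protocol =
      TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10
  2: optional map<string, string> configuration
}
{noformat}

OpenSession() would then take both structs; an unset client_protocol still throws 
TProtocolException, but the toString() it carries no longer contains the password.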



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20150) TopNKey pushdown

2018-09-27 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20150:
--
Attachment: HIVE-20150.11.patch

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.10.patch, 
> HIVE-20150.11.patch, HIVE-20150.11.patch, HIVE-20150.2.patch, 
> HIVE-20150.4.patch, HIVE-20150.5.patch, HIVE-20150.6.patch, 
> HIVE-20150.7.patch, HIVE-20150.8.patch, HIVE-20150.9.patch
>
>
> The TopNKey operator was implemented in HIVE-17896, but the pushdown 
> implementation needs more work. This issue covers the TopNKey pushdown 
> implementation with proper tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20544) TOpenSessionReq logs password and username

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631282#comment-16631282
 ] 

Hive QA commented on HIVE-20544:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14094/dev-support/hive-personality.sh
 |
| git revision | master / 727e4b2 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14094/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14094/yetus/patch-asflicense-problems.txt
 |
| modules | C: service-rpc itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14094/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> TOpenSessionReq logs password and username
> --
>
> Key: HIVE-20544
> URL: https://issues.apache.org/jira/browse/HIVE-20544
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: beginner, patch, security
> Attachments: HIVE-20544.1.patch, HIVE-20544.2.patch, 
> HIVE-20544.3.patch, HIVE-20544.3.patch, HIVE-20544.4.patch, 
> HIVE-20544.4.patch, HIVE-20544.patch, non-solution.patch, 
> working-solution.patch
>
>
> In 
> service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq,
>  if client protocol is unset, validate() and toString() print both the 
> username and password to logs.
> Logging a password is a security risk. We should mask the password with ***.
> =Edit= 

[jira] [Updated] (HIVE-20552) Get Schema from LogicalPlan faster

2018-09-27 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20552:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Get Schema from LogicalPlan faster
> --
>
> Key: HIVE-20552
> URL: https://issues.apache.org/jira/browse/HIVE-20552
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20552.1.patch, HIVE-20552.2.patch, 
> HIVE-20552.3.patch
>
>
> Getting the schema of a query currently requires compiling, optimizing, and 
> generating a TezPlan, which creates extra overhead when only the LogicalPlan 
> is needed.
> 1. Copy the method {{HiveMaterializedViewsRegistry.parseQuery}}, making it 
> {{public static}} and putting it in a utility class.
> 2. Change the return statement of the method to {{return 
> analyzer.getResultSchema();}}
> 3. Change the return type of the method to {{List<FieldSchema>}}.
> 4. Call the new method from {{GenericUDTFGetSplits.createPlanFragment}}, 
> replacing the current code which does this:
> {code}
>  if (num == 0) {
>    // Schema only
>    return new PlanFragment(null, schema, null);
>  }
> {code}
> moving the call earlier in {{getPlanFragment}}, right after the HiveConf is 
> created, bypassing the code that uses {{HiveTxnManager}} and {{Driver}}.
> 5. Convert the {{List<FieldSchema>}} to 
> {{org.apache.hadoop.hive.llap.Schema}}.
> 6. Return from {{getPlanFragment}} by returning {{new PlanFragment(null, 
> schema, null)}}.
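
A minimal sketch of the utility described above, assuming Hive's standard 
parse/analyze entry points ({{ParseUtils}}, {{SemanticAnalyzerFactory}}); the 
class name {{SchemaUtils}} and the exact wiring are illustrative, not the 
committed code:

{code:java}
import java.util.List;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.api.FieldSchema;
import org.apache.hadoop.hive.ql.Context;
import org.apache.hadoop.hive.ql.QueryState;
import org.apache.hadoop.hive.ql.parse.ASTNode;
import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer;
import org.apache.hadoop.hive.ql.parse.ParseUtils;
import org.apache.hadoop.hive.ql.parse.SemanticAnalyzerFactory;

public final class SchemaUtils {
  private SchemaUtils() {}

  // Parse and analyze the query, then return the result schema directly,
  // skipping optimization and TezPlan generation.
  public static List<FieldSchema> getResultSchema(HiveConf conf, String query)
      throws Exception {
    Context ctx = new Context(conf);
    ASTNode ast = ParseUtils.parse(query, ctx);
    QueryState queryState = new QueryState.Builder().withHiveConf(conf).build();
    BaseSemanticAnalyzer analyzer = SemanticAnalyzerFactory.get(queryState, ast);
    analyzer.analyze(ast, ctx);
    return analyzer.getResultSchema();
  }
}
{code}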



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20552) Get Schema from LogicalPlan faster

2018-09-27 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631280#comment-16631280
 ] 

Teddy Choi commented on HIVE-20552:
---

[~jcamachorodriguez], thanks! Pushed to master.

> Get Schema from LogicalPlan faster
> --
>
> Key: HIVE-20552
> URL: https://issues.apache.org/jira/browse/HIVE-20552
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20552.1.patch, HIVE-20552.2.patch, 
> HIVE-20552.3.patch
>
>
> Getting the schema of a query currently requires compiling, optimizing, and 
> generating a TezPlan, which creates extra overhead when only the LogicalPlan 
> is needed.
> 1. Copy the method {{HiveMaterializedViewsRegistry.parseQuery}}, making it 
> {{public static}} and putting it in a utility class.
> 2. Change the return statement of the method to {{return 
> analyzer.getResultSchema();}}
> 3. Change the return type of the method to {{List<FieldSchema>}}.
> 4. Call the new method from {{GenericUDTFGetSplits.createPlanFragment}}, 
> replacing the current code which does this:
> {code}
>  if (num == 0) {
>    // Schema only
>    return new PlanFragment(null, schema, null);
>  }
> {code}
> moving the call earlier in {{getPlanFragment}}, right after the HiveConf is 
> created, bypassing the code that uses {{HiveTxnManager}} and {{Driver}}.
> 5. Convert the {{List<FieldSchema>}} to 
> {{org.apache.hadoop.hive.llap.Schema}}.
> 6. Return from {{getPlanFragment}} by returning {{new PlanFragment(null, 
> schema, null)}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20052) Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale

2018-09-27 Thread Teddy Choi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20052:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Arrow serde should fill ArrowColumnVector(Decimal) with the given schema 
> precision/scale
> 
>
> Key: HIVE-20052
> URL: https://issues.apache.org/jira/browse/HIVE-20052
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20052.1.patch, HIVE-20052.patch
>
>
> Arrow serde should fill ArrowColumnVector with the precision and scale given 
> by the schema. When it serializes negative values into Arrow, it throws 
> exceptions because the precision of the value does not match the precision 
> of the Arrow decimal vector.
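
For context, a sketch against the Arrow Java API (not Hive's serde code) of 
the premise behind the fix: the vector is allocated with the schema's 
precision/scale, and every value, negative ones included, is written at that 
scale rather than at its own precision:

{code:java}
import java.math.BigDecimal;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.DecimalVector;

public class DecimalVectorExample {
  public static void main(String[] args) {
    try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
         // Allocate with the schema precision/scale, e.g. decimal(10,2),
         // not with the precision of any individual value.
         DecimalVector vector = new DecimalVector("d", allocator, 10, 2)) {
      vector.allocateNew(1);
      // -1.50 has precision 3; it is still a valid decimal(10,2) value as
      // long as its scale matches the vector's scale.
      vector.setSafe(0, new BigDecimal("-1.50"));
      vector.setValueCount(1);
    }
  }
}
{code}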



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20052) Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale

2018-09-27 Thread Teddy Choi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631277#comment-16631277
 ] 

Teddy Choi commented on HIVE-20052:
---

The TestSSL failure is unrelated to this patch; the test passed on my laptop.

> Arrow serde should fill ArrowColumnVector(Decimal) with the given schema 
> precision/scale
> 
>
> Key: HIVE-20052
> URL: https://issues.apache.org/jira/browse/HIVE-20052
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20052.1.patch, HIVE-20052.patch
>
>
> Arrow serde should fill ArrowColumnVector with the precision and scale given 
> by the schema. When it serializes negative values into Arrow, it throws 
> exceptions because the precision of the value does not match the precision 
> of the Arrow decimal vector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20563) Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type are different

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631273#comment-16631273
 ] 

Hive QA commented on HIVE-20563:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941495/HIVE-20563.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 15006 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[load_dyn_part3]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_case_when_conversion]
 (batchId=158)
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testIfConditionalExprs
 (batchId=301)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=252)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14092/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14092/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14092/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941495 - PreCommit-HIVE-Build

> Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type 
> are different
> ---
>
> Key: HIVE-20563
> URL: https://issues.apache.org/jira/browse/HIVE-20563
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Matt McCline
>Priority: Major
> Attachments: HIVE-20563.01.patch, HIVE-20563.02.patch, 
> HIVE-20563.03.patch
>
>
> With the following stacktrace:
> {code}
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) 
> ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) 
> [hadoop-mapreduce-client-common-3.1.0.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>  ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_181]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_181]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_181]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_181]
> at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:973)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
> 

[jira] [Commented] (HIVE-20052) Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631276#comment-16631276
 ] 

Hive QA commented on HIVE-20052:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941494/HIVE-20052.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14093/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14093/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14093/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12941494/HIVE-20052.1.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941494 - PreCommit-HIVE-Build

> Arrow serde should fill ArrowColumnVector(Decimal) with the given schema 
> precision/scale
> 
>
> Key: HIVE-20052
> URL: https://issues.apache.org/jira/browse/HIVE-20052
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20052.1.patch, HIVE-20052.patch
>
>
> Arrow serde should fill ArrowColumnVector with the precision and scale given 
> by the schema. When it serializes negative values into Arrow, it throws 
> exceptions because the precision of the value does not match the precision 
> of the Arrow decimal vector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20635) VectorizedOrcAcidRowBatchReader doesn't filter delete events for original files

2018-09-27 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631274#comment-16631274
 ] 

Eugene Koifman commented on HIVE-20635:
---

[~saurabhseth] could you take this on?

> VectorizedOrcAcidRowBatchReader doesn't filter delete events for original 
> files
> ---
>
> Key: HIVE-20635
> URL: https://issues.apache.org/jira/browse/HIVE-20635
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Priority: Major
>
> This is a follow-up to HIVE-16812, which adds support for delete event 
> filtering for splits from native ACID files.
> We need to add the same for {{OrcSplit.isOriginal()}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17917) VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization

2018-09-27 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17917:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master.
Thanks Saurabh for the contribution.

> VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
> ---
>
> Key: HIVE-17917
> URL: https://issues.apache.org/jira/browse/HIVE-17917
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Saurabh Seth
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-17917.2.patch, HIVE-17917.patch
>
>
> The VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket() computation is 
> currently (after HIVE-17458) done once per split. It could instead be done 
> once per file (since the result is the same for each split of the same file) 
> and passed along in the OrcSplit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17917) VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization

2018-09-27 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631259#comment-16631259
 ] 

Eugene Koifman commented on HIVE-17917:
---

+1

> VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
> ---
>
> Key: HIVE-17917
> URL: https://issues.apache.org/jira/browse/HIVE-17917
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Saurabh Seth
>Priority: Minor
> Attachments: HIVE-17917.2.patch, HIVE-17917.patch
>
>
> The VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket() computation is 
> currently (after HIVE-17458) done once per split. It could instead be done 
> once per file (since the result is the same for each split of the same file) 
> and passed along in the OrcSplit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20563) Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type are different

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631246#comment-16631246
 ] 

Hive QA commented on HIVE-20563:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
0s{color} | {color:blue} ql in master has 2324 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14092/dev-support/hive-personality.sh
 |
| git revision | master / 37fd22e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14092/yetus/patch-asflicense-problems.txt
 |
| modules | C: itests ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14092/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type 
> are different
> ---
>
> Key: HIVE-20563
> URL: https://issues.apache.org/jira/browse/HIVE-20563
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Matt McCline
>Priority: Major
> Attachments: HIVE-20563.01.patch, HIVE-20563.02.patch, 
> HIVE-20563.03.patch
>
>
> With the following stacktrace:
> {code}
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) 
> ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) 
> [hadoop-mapreduce-client-common-3.1.0.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>

[jira] [Updated] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17043:
---
Status: Open  (was: Patch Available)

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases the presence of the regular columns won't alter the 
> cardinality of the groups. So, if the regular columns are not referenced 
> later, they can be dropped from the group by keys. Depending on the operator 
> tree, in the best case this may result in those columns not being read from 
> disk at all. In the worst case, we will still avoid shuffling and sorting 
> the regular columns from mapper to reducer, which could be substantial CPU 
> and network savings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17043:
---
Status: Patch Available  (was: Open)

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases the presence of the regular columns won't alter the 
> cardinality of the groups. So, if the regular columns are not referenced 
> later, they can be dropped from the group by keys. Depending on the operator 
> tree, in the best case this may result in those columns not being read from 
> disk at all. In the worst case, we will still avoid shuffling and sorting 
> the regular columns from mapper to reducer, which could be substantial CPU 
> and network savings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17043) Remove non unique columns from group by keys if not referenced later

2018-09-27 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17043:
---
Attachment: HIVE-17043.7.patch

> Remove non unique columns from group by keys if not referenced later
> 
>
> Key: HIVE-17043
> URL: https://issues.apache.org/jira/browse/HIVE-17043
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17043.1.patch, HIVE-17043.2.patch, 
> HIVE-17043.3.patch, HIVE-17043.4.patch, HIVE-17043.5.patch, 
> HIVE-17043.6.patch, HIVE-17043.7.patch
>
>
> Group by keys may be a mix of unique (or primary) keys and regular columns. 
> In such cases the presence of the regular columns won't alter the 
> cardinality of the groups. So, if the regular columns are not referenced 
> later, they can be dropped from the group by keys. Depending on the operator 
> tree, in the best case this may result in those columns not being read from 
> disk at all. In the worst case, we will still avoid shuffling and sorting 
> the regular columns from mapper to reducer, which could be substantial CPU 
> and network savings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20052) Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631230#comment-16631230
 ] 

Hive QA commented on HIVE-20052:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941494/HIVE-20052.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15006 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testMetastoreWithSSL (batchId=251)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14091/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14091/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14091/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941494 - PreCommit-HIVE-Build

> Arrow serde should fill ArrowColumnVector(Decimal) with the given schema 
> precision/scale
> 
>
> Key: HIVE-20052
> URL: https://issues.apache.org/jira/browse/HIVE-20052
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20052.1.patch, HIVE-20052.patch
>
>
> Arrow serde should fill ArrowColumnVector with the precision and scale given 
> by the schema. When it serializes negative values into Arrow, it throws 
> exceptions because the precision of the value does not match the precision 
> of the Arrow decimal vector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20623) Shared work: Extend sharing of map-join cache entries in LLAP

2018-09-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20623:
---
Attachment: HIVE-20623.03.patch

> Shared work: Extend sharing of map-join cache entries in LLAP
> -
>
> Key: HIVE-20623
> URL: https://issues.apache.org/jira/browse/HIVE-20623
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Logical Optimizer
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20623.01.patch, HIVE-20623.02.patch, 
> HIVE-20623.02.patch, HIVE-20623.02.patch, HIVE-20623.03.patch, 
> HIVE-20623.patch, hash-shared-work.json.txt, hash-shared-work.svg
>
>
> For a query like this
> {code}
> with all_sales as (
> select ss_customer_sk as customer_sk, ss_ext_list_price-ss_ext_discount_amt 
> as ext_price from store_sales
> UNION ALL
> select ws_bill_customer_sk as customer_sk, 
> ws_ext_list_price-ws_ext_discount_amt as ext_price from web_sales
> UNION ALL
> select cs_bill_customer_sk as customer_sk, cs_ext_sales_price - 
> cs_ext_discount_amt as ext_price from catalog_sales)
> select sum(ext_price) total_price, c_customer_id from all_sales, customer 
> where customer_sk = c_customer_sk
> group by c_customer_id
> order by total_price desc 
> limit 100;
> {code}
> The hashtables used for all 3 joins are identical, yet the hashtable is 
> loaded 3 times in the same LLAP instance because each entry is named per 
> operator in the cache:
> {code}
> cacheKey = "HASH_MAP_" + this.getOperatorId() + "_container";
> {code}
> If those are identical in nature (i.e. vectorization, hashtable type, etc.), 
> then the duplication is just wasted CPU, memory, and network; using the same 
> cache name for hashtables which will be identical in layout would be 
> extremely useful.
> In cases where the join is pushed through a UNION, those are identical.
> This optimization can only be done without concern for accidental delays when 
> the same upstream task is generating all of these hashtables, which is what 
> is achieved by the shared scan optimizer already.
> In case the shared work is not present, this has potential downsides: if two 
> customer broadcasts were sourced from "Map 1" and "Map 2", the Map 1 builder 
> will block the other task from reading from Map 2, even though Map 2 might 
> have started after, but finished ahead of, Map 1.
> So this specific optimization can always be considered for cases where the 
> shared work unifies the operator tree and the parents of all the RS entries 
> involved are the same (and the RS layout is the same).
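
A hypothetical sketch of the keying idea, not the committed implementation: 
derive the cache key from a signature of the broadcast input rather than from 
the operator id, so map joins with interchangeable hashtables share one entry; 
{{layoutSignature}} is an invented stand-in for whatever captures "identical 
in nature":

{code:java}
// Hypothetical sketch only; not the actual patch.
public final class MapJoinCacheKeys {
  private MapJoinCacheKeys() {}

  // Current behavior: one cache entry per operator, so three identical
  // map joins load three copies of the same hashtable.
  public static String perOperatorKey(String operatorId) {
    return "HASH_MAP_" + operatorId + "_container";
  }

  // The idea: when shared work guarantees the same parent RS and layout,
  // key on a layout signature so identical hashtables share one entry.
  public static String sharedKey(String layoutSignature) {
    return "HASH_MAP_" + layoutSignature + "_container";
  }
}
{code}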



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20052) Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631210#comment-16631210
 ] 

Hive QA commented on HIVE-20052:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
4s{color} | {color:blue} ql in master has 2324 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 7 new + 268 unchanged - 3 
fixed = 275 total (was 271) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14091/dev-support/hive-personality.sh
 |
| git revision | master / 37fd22e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14091/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14091/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14091/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Arrow serde should fill ArrowColumnVector(Decimal) with the given schema 
> precision/scale
> 
>
> Key: HIVE-20052
> URL: https://issues.apache.org/jira/browse/HIVE-20052
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20052.1.patch, HIVE-20052.patch
>
>
> Arrow serde should fill ArrowColumnVector with the precision and scale given 
> by the schema. When it serializes negative values into Arrow, it throws 
> exceptions because the precision of the value does not match the precision 
> of the Arrow decimal vector.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20609) Create SSD cache dir if it doesnt exist already

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631197#comment-16631197
 ] 

Hive QA commented on HIVE-20609:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941488/HIVE-20609.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15008 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testTokenAuth 
(batchId=266)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14090/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14090/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14090/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941488 - PreCommit-HIVE-Build

> Create SSD cache dir if it doesnt exist already
> ---
>
> Key: HIVE-20609
> URL: https://issues.apache.org/jira/browse/HIVE-20609
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.1
>
> Attachments: HIVE-20609.01.patch, HIVE-20609.02.patch, 
> HIVE-20609.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20291) Allow HiveStreamingConnection to receive a WriteId

2018-09-27 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20291:
---
Attachment: HIVE-20291.7.patch
Status: Patch Available  (was: Open)

> Allow HiveStreamingConnection to receive a WriteId
> --
>
> Key: HIVE-20291
> URL: https://issues.apache.org/jira/browse/HIVE-20291
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20291.1.patch, HIVE-20291.2.patch, 
> HIVE-20291.3.patch, HIVE-20291.4.patch, HIVE-20291.5.patch, 
> HIVE-20291.6.patch, HIVE-20291.7.patch
>
>
> If the writeId is received externally, the streaming connection won't need 
> to open connections to the metastore. It won't be able to commit in this 
> case either, so the commit must be done by the entity passing the writeId.
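
A sketch of how this might look on top of the existing 
{{HiveStreamingConnection}} builder; {{withWriteId}} is the hypothetical 
method this issue proposes, not an existing call:

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hive.streaming.HiveStreamingConnection;
import org.apache.hive.streaming.StrictDelimitedInputWriter;

public class ExternalWriteIdExample {
  // Sketch only: withWriteId() is hypothetical. With an externally allocated
  // writeId the connection can skip the metastore round trips, and the commit
  // is owned by the entity that handed out the writeId.
  public static HiveStreamingConnection open(HiveConf conf, long externalWriteId)
      throws Exception {
    return HiveStreamingConnection.newBuilder()
        .withDatabase("default")
        .withTable("events")
        .withRecordWriter(StrictDelimitedInputWriter.newBuilder()
            .withFieldDelimiter(',').build())
        .withHiveConf(conf)
        .withWriteId(externalWriteId)  // hypothetical method proposed here
        .connect();
  }
}
{code}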



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20291) Allow HiveStreamingConnection to receive a WriteId

2018-09-27 Thread Jaume M (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaume M updated HIVE-20291:
---
Status: Open  (was: Patch Available)

> Allow HiveStreamingConnection to receive a WriteId
> --
>
> Key: HIVE-20291
> URL: https://issues.apache.org/jira/browse/HIVE-20291
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20291.1.patch, HIVE-20291.2.patch, 
> HIVE-20291.3.patch, HIVE-20291.4.patch, HIVE-20291.5.patch, HIVE-20291.6.patch
>
>
> If the writeId is received externally, the streaming connection won't need 
> to open connections to the metastore. It won't be able to commit in this 
> case either, so the commit must be done by the entity passing the writeId.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-20634) DirectSQL does not retry in ORM mode while getting partitions by filter

2018-09-27 Thread Karthik Manamcheri (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Manamcheri resolved HIVE-20634.
---
Resolution: Not A Problem

I realized that this was not a bug. 
{{directSql.generateSqlFilterForPushdown(..)}} never executes SQL; it only 
generates the filter. Closing as Not A Problem.

> DirectSQL does not retry in ORM mode while getting partitions by filter
> ---
>
> Key: HIVE-20634
> URL: https://issues.apache.org/jira/browse/HIVE-20634
> Project: Hive
>  Issue Type: Bug
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Major
>
> The code path for getting partitions by filter is as follows,
> {code:java}
>   protected List<Partition> getPartitionsByFilterInternal(..) {
>     ...
>     @Override
>     protected boolean canUseDirectSql(GetHelper<List<Partition>> ctx) 
>         throws MetaException {
>       return directSql.generateSqlFilterForPushdown(ctx.getTable(), tree, 
>           filter);
>     }
>     ...
>   }
> {code}
> If directSql.generateSqlFilterForPushdown throws an exception, we should be 
> returning false from canUseDirectSql instead of propagating the exception. 
> The propagation of exception causes the whole query to fail, instead of 
> retrying with JDO.
> We should have code such as
> {code:java}
>   @Override
>   protected boolean canUseDirectSql(GetHelper<List<Partition>> ctx) throws 
> MetaException {
> try {
>   return directSql.generateSqlFilterForPushdown(ctx.getTable(), 
> exprTree, filter);
> } catch (final MetaException me) {
>   return false;
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20609) Create SSD cache dir if it doesnt exist already

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631155#comment-16631155
 ] 

Hive QA commented on HIVE-20609:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} llap-server in master has 84 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14090/dev-support/hive-personality.sh
 |
| git revision | master / 37fd22e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14090/yetus/patch-asflicense-problems.txt
 |
| modules | C: llap-server U: llap-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14090/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Create SSD cache dir if it doesnt exist already
> ---
>
> Key: HIVE-20609
> URL: https://issues.apache.org/jira/browse/HIVE-20609
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.1
>
> Attachments: HIVE-20609.01.patch, HIVE-20609.02.patch, 
> HIVE-20609.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-27 Thread denys kuzmenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

denys kuzmenko updated HIVE-20535:
--
Attachment: HIVE-20535.18.patch

> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.16.patch, 
> HIVE-20535.17.patch, HIVE-20535.18.patch, HIVE-20535.2.patch, 
> HIVE-20535.3.patch, HIVE-20535.4.patch, HIVE-20535.5.patch, 
> HIVE-20535.6.patch, HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> Removing the compile lock entirely is quite risky.
> It would be good to provide a pool size for concurrent compilation, so the 
> administrator can limit the load.
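
A minimal sketch of the idea, not the patch itself: a semaphore with a 
configurable number of permits generalizes the single global compile lock (a 
pool size of 1 reproduces today's behavior); names are illustrative:

{code:java}
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

public class CompileLimiter {
  private final Semaphore slots;

  public CompileLimiter(int poolSize) {  // poolSize comes from the new config
    this.slots = new Semaphore(poolSize, true);  // fair, to avoid starvation
  }

  // Run a compilation while holding one of the N compile slots.
  public <T> T withCompileSlot(Callable<T> compileTask) throws Exception {
    slots.acquire();
    try {
      return compileTask.call();
    } finally {
      slots.release();
    }
  }
}
{code}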



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19584) Dictionary encoding for string types

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631138#comment-16631138
 ] 

Hive QA commented on HIVE-19584:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941480/HIVE-19584.9.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14089/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14089/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14089/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12941480/HIVE-19584.9.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941480 - PreCommit-HIVE-Build

> Dictionary encoding for string types
> 
>
> Key: HIVE-19584
> URL: https://issues.apache.org/jira/browse/HIVE-19584
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19584.1.patch, HIVE-19584.2.patch, 
> HIVE-19584.3.patch, HIVE-19584.4.patch, HIVE-19584.5.patch, 
> HIVE-19584.6.patch, HIVE-19584.7.patch, HIVE-19584.8.patch, HIVE-19584.9.patch
>
>
> Apache Arrow supports dictionary encoding for some data types, so implement 
> dictionary encoding for string types in the Arrow SerDe.
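
For context, a sketch of dictionary encoding with the Arrow Java API 
(illustrative, not Hive's serde code): the distinct strings live once in a 
dictionary vector, and the data column is replaced by integer indices into it:

{code:java}
import java.nio.charset.StandardCharsets;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.ValueVector;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.dictionary.Dictionary;
import org.apache.arrow.vector.dictionary.DictionaryEncoder;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.DictionaryEncoding;

public class DictionaryEncodingExample {
  public static void main(String[] args) {
    try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
         VarCharVector dict = new VarCharVector("dict", allocator);
         VarCharVector data = new VarCharVector("data", allocator)) {
      // Distinct values live once in the dictionary vector.
      dict.allocateNew();
      dict.setSafe(0, "apple".getBytes(StandardCharsets.UTF_8));
      dict.setSafe(1, "banana".getBytes(StandardCharsets.UTF_8));
      dict.setValueCount(2);

      data.allocateNew();
      data.setSafe(0, "banana".getBytes(StandardCharsets.UTF_8));
      data.setSafe(1, "apple".getBytes(StandardCharsets.UTF_8));
      data.setValueCount(2);

      Dictionary dictionary = new Dictionary(dict,
          new DictionaryEncoding(1L, false, new ArrowType.Int(32, true)));
      // The encoded vector holds integer indices into the dictionary: [1, 0].
      try (ValueVector indices = DictionaryEncoder.encode(data, dictionary)) {
        System.out.println(indices.getObject(0) + ", " + indices.getObject(1));
      }
    }
  }
}
{code}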



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19584) Dictionary encoding for string types

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631137#comment-16631137
 ] 

Hive QA commented on HIVE-19584:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941480/HIVE-19584.9.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15007 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testLlapInputFormatEndToEnd
 (batchId=252)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14088/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14088/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14088/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941480 - PreCommit-HIVE-Build

> Dictionary encoding for string types
> 
>
> Key: HIVE-19584
> URL: https://issues.apache.org/jira/browse/HIVE-19584
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19584.1.patch, HIVE-19584.2.patch, 
> HIVE-19584.3.patch, HIVE-19584.4.patch, HIVE-19584.5.patch, 
> HIVE-19584.6.patch, HIVE-19584.7.patch, HIVE-19584.8.patch, HIVE-19584.9.patch
>
>
> Apache Arrow supports dictionary encoding for some data types, so implement 
> dictionary encoding for string types in the Arrow SerDe.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20552) Get Schema from LogicalPlan faster

2018-09-27 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631126#comment-16631126
 ] 

Jesus Camacho Rodriguez commented on HIVE-20552:


+1

> Get Schema from LogicalPlan faster
> --
>
> Key: HIVE-20552
> URL: https://issues.apache.org/jira/browse/HIVE-20552
> Project: Hive
>  Issue Type: Improvement
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20552.1.patch, HIVE-20552.2.patch, 
> HIVE-20552.3.patch
>
>
> Getting the schema of a query currently requires compiling, optimizing, and 
> generating a TezPlan, which creates extra overhead when only the LogicalPlan 
> is needed.
> 1. Copy the method {{HiveMaterializedViewsRegistry.parseQuery}}, making it 
> {{public static}} and putting it in a utility class.
> 2. Change the return statement of the method to {{return 
> analyzer.getResultSchema();}}
> 3. Change the return type of the method to {{List<FieldSchema>}}.
> 4. Call the new method from {{GenericUDTFGetSplits.createPlanFragment}}, 
> replacing the current code which does this:
> {code}
>  if (num == 0) {
>    // Schema only
>    return new PlanFragment(null, schema, null);
>  }
> {code}
> moving the call earlier in {{getPlanFragment}}, right after the HiveConf is 
> created, bypassing the code that uses {{HiveTxnManager}} and {{Driver}}.
> 5. Convert the {{List<FieldSchema>}} to 
> {{org.apache.hadoop.hive.llap.Schema}}.
> 6. Return from {{getPlanFragment}} by returning {{new PlanFragment(null, 
> schema, null)}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19584) Dictionary encoding for string types

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631096#comment-16631096
 ] 

Hive QA commented on HIVE-19584:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
28s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
54s{color} | {color:blue} ql in master has 2324 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 110 new + 376 unchanged - 162 
fixed = 486 total (was 538) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14088/dev-support/hive-personality.sh
 |
| git revision | master / 37fd22e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14088/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14088/yetus/patch-asflicense-problems.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14088/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Dictionary encoding for string types
> 
>
> Key: HIVE-19584
> URL: https://issues.apache.org/jira/browse/HIVE-19584
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19584.1.patch, HIVE-19584.2.patch, 
> HIVE-19584.3.patch, HIVE-19584.4.patch, HIVE-19584.5.patch, 
> HIVE-19584.6.patch, HIVE-19584.7.patch, HIVE-19584.8.patch, HIVE-19584.9.patch
>
>
> Apache Arrow supports dictionary encoding for some data types. So implement 
> dictionary encoding for string types in Arrow SerDe.
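
For background, a minimal sketch of the general dictionary-encoding technique 
for strings (the generic algorithm only; this is neither Arrow's API nor the 
patch's code): each distinct value is stored once in a dictionary and the 
column itself becomes a list of integer indices.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DictionaryEncodingSketch {
  static int[] encode(List<String> column, List<String> dictionaryOut) {
    Map<String, Integer> ids = new HashMap<>();
    int[] indices = new int[column.size()];
    for (int i = 0; i < column.size(); i++) {
      String v = column.get(i);
      Integer id = ids.get(v);
      if (id == null) {
        id = dictionaryOut.size();
        dictionaryOut.add(v);  // first sighting: assign the next dictionary slot
        ids.put(v, id);
      }
      indices[i] = id;
    }
    return indices;
  }

  public static void main(String[] args) {
    List<String> dict = new ArrayList<>();
    int[] idx = encode(Arrays.asList("ny", "sf", "ny", "ny"), dict);
    System.out.println(dict + " " + Arrays.toString(idx));  // [ny, sf] [0, 1, 0, 0]
  }
}
{code}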



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20629) Hive incremental replication fails with events missing error if database is kept idle for more than an hour

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631064#comment-16631064
 ] 

Hive QA commented on HIVE-20629:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941417/HIVE-20629.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15000 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testComplexQuery (batchId=252)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14087/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14087/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14087/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941417 - PreCommit-HIVE-Build

> Hive incremental replication fails with events missing error if database is 
> kept idle for more than an hour 
> 
>
> Key: HIVE-20629
> URL: https://issues.apache.org/jira/browse/HIVE-20629
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20629.01.patch, HIVE-20629.02.patch, 
> HIVE-20629.03.patch
>
>
> Start a source cluster with 2 databases. Replicate the databases to the target 
> after doing some operations. Keep taking incremental dumps for both databases 
> and keep replicating them to the target cluster. Keep one of the databases 
> idle for more than 24 hrs. After 24 hrs, the incremental dump of the idle 
> database fails with an events-missing error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-12254) Improve logging with yarn/hdfs

2018-09-27 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631037#comment-16631037
 ] 

Aihua Xu commented on HIVE-12254:
-

Thanks [~sershe] 

> Improve logging with yarn/hdfs
> --
>
> Key: HIVE-12254
> URL: https://issues.apache.org/jira/browse/HIVE-12254
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-12254.1.patch, HIVE-12254.2.patch
>
>
> As an extension of HIVE-12249, this adds info for Yarn/HDFS as well. Both 
> HIVE-12249 and HDFS-9184 are required (and the HDFS dependency upgraded in 
> Hive for the latter) before this can be resolved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-12254) Improve logging with yarn/hdfs

2018-09-27 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-12254:
---

Assignee: Aihua Xu  (was: Vikram Dixit K)

> Improve logging with yarn/hdfs
> --
>
> Key: HIVE-12254
> URL: https://issues.apache.org/jira/browse/HIVE-12254
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-12254.1.patch, HIVE-12254.2.patch
>
>
> As an extension of HIVE-12249, this adds info for Yarn/HDFS as well. Both 
> HIVE-12249 and HDFS-9184 are required (and the HDFS dependency upgraded in 
> Hive for the latter) before this can be resolved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20629) Hive incremental replication fails with events missing error if database is kept idle for more than an hour

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631025#comment-16631025
 ] 

Hive QA commented on HIVE-20629:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
0s{color} | {color:blue} ql in master has 2325 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 3 new + 48 unchanged - 2 fixed 
= 51 total (was 50) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m  
1s{color} | {color:red} ql generated 1 new + 2324 unchanged - 1 fixed = 2325 
total (was 2325) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Write to static field 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.numIteration
 from instance method new 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder(String,
 String, String, IncrementalLoadEventsIterator, HiveConf, Long)  At 
IncrementalLoadTasksBuilder.java:from instance method new 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder(String,
 String, String, IncrementalLoadEventsIterator, HiveConf, Long)  At 
IncrementalLoadTasksBuilder.java:[line 85] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14087/dev-support/hive-personality.sh
 |
| git revision | master / 1ab23e3 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14087/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14087/yetus/new-findbugs-ql.html
 |
| modules | C: itests/hive-unit ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14087/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Hive incremental replication fails with events missing error if database is 
> kept idle for more than an hour 
> 
>
> Key: HIVE-20629
> URL: https://issues.apache.org/jira/browse/HIVE-20629
>  

[jira] [Commented] (HIVE-12254) Improve logging with yarn/hdfs

2018-09-27 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631026#comment-16631026
 ] 

Sergey Shelukhin commented on HIVE-12254:
-

Nobody is currently looking at it. Feel free to take over.

> Improve logging with yarn/hdfs
> --
>
> Key: HIVE-12254
> URL: https://issues.apache.org/jira/browse/HIVE-12254
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>Priority: Major
> Attachments: HIVE-12254.1.patch, HIVE-12254.2.patch
>
>
> As an extension of HIVE-12249, this adds info for Yarn/HDFS as well. Both 
> HIVE-12249 and HDFS-9184 are required (and the HDFS dependency upgraded in 
> Hive for the latter) before this can be resolved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-12254) Improve logging with yarn/hdfs

2018-09-27 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631021#comment-16631021
 ] 

Aihua Xu commented on HIVE-12254:
-

[~vikram.dixit] and [~sershe], do you plan to include this change? It's a nice 
feature to allow analyzing HDFS access for Hive. I can drive this if you are 
not actively working on it. Thanks.

> Improve logging with yarn/hdfs
> --
>
> Key: HIVE-12254
> URL: https://issues.apache.org/jira/browse/HIVE-12254
> Project: Hive
>  Issue Type: Bug
>  Components: Shims
>Affects Versions: 1.2.1
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>Priority: Major
> Attachments: HIVE-12254.1.patch, HIVE-12254.2.patch
>
>
> As an extension of HIVE-12249, this adds info for Yarn/HDFS as well. Both 
> HIVE-12249 and HDFS-9184 are required (and the HDFS dependency upgraded in 
> Hive for the latter) before this can be resolved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-16812.
---
  Resolution: Fixed
Release Note: n/a

Committed to master.
Thanks Sergey for the review.

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 4.0.0
>
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch, HIVE-16812.07.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.
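
For illustration, a sketch of the filtering idea in the description, using a 
simplified stand-in for the ROW__ID key (writeId, bucket, rowId); the names 
are hypothetical and this is not the committed patch:

{code:java}
import java.util.ArrayList;
import java.util.List;

public class DeleteEventFilterSketch {
  // Simplified ROW__ID-style key, ordered the way base/delta files are sorted.
  static class RecordKey implements Comparable<RecordKey> {
    final long writeId; final int bucket; final long rowId;
    RecordKey(long writeId, int bucket, long rowId) {
      this.writeId = writeId; this.bucket = bucket; this.rowId = rowId;
    }
    @Override
    public int compareTo(RecordKey o) {
      if (writeId != o.writeId) return Long.compare(writeId, o.writeId);
      if (bucket != o.bucket) return Integer.compare(bucket, o.bucket);
      return Long.compare(rowId, o.rowId);
    }
  }

  // Keep only delete events inside the split's [min, max] key bounds instead
  // of loading every event from every delete delta (the old 0..Long.MAX_VALUE
  // range in the quoted constructor).
  static List<RecordKey> filter(List<RecordKey> deleteEvents,
                                RecordKey min, RecordKey max) {
    List<RecordKey> kept = new ArrayList<>();
    for (RecordKey e : deleteEvents) {
      if (e.compareTo(min) >= 0 && e.compareTo(max) <= 0) {
        kept.add(e);
      }
    }
    return kept;
  }
}
{code}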



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20649) LLAP aware memory manager for Orc writers

2018-09-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20649:



> LLAP aware memory manager for Orc writers
> -
>
> Key: HIVE-20649
> URL: https://issues.apache.org/jira/browse/HIVE-20649
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> The ORC writer has its own memory manager that estimates memory usage and 
> available memory from the JVM heap (MemoryMX bean). This works in the Tez 
> container execution model, but not in LLAP, where container sizes (and Xmx) 
> are typically high and there are multiple executors per LLAP daemon. This 
> custom memory manager should be aware of the memory bounds per executor. 
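
As a rough illustration of the idea (invented names and knobs, not the actual 
fix), the writer's memory bound would be derived from an executor's share of 
the daemon heap rather than from the whole Xmx:

{code:java}
public class LlapAwareMemoryBoundSketch {
  // executorCount and orcPoolFraction are illustrative parameters.
  static long writerMemoryBound(long daemonHeapBytes, int executorCount,
                                double orcPoolFraction) {
    long perExecutor = daemonHeapBytes / Math.max(1, executorCount);
    return (long) (perExecutor * orcPoolFraction);
  }

  public static void main(String[] args) {
    // e.g. a 64 GB daemon with 16 executors and a 50% pool for ORC writers
    long bound = writerMemoryBound(64L << 30, 16, 0.5);
    System.out.println("per-writer bound: " + (bound >> 20) + " MB");  // 2048 MB
  }
}
{code}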



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20150) TopNKey pushdown

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630981#comment-16630981
 ] 

Hive QA commented on HIVE-20150:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941468/HIVE-20150.11.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 14997 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)

[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.ql.exec.spark.TestSparkSessionTimeout.testMultiSessionSparkSessionTimeout
 (batchId=246)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testComplexQuery (batchId=252)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14086/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14086/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14086/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941468 - PreCommit-HIVE-Build

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.10.patch, 
> HIVE-20150.11.patch, HIVE-20150.2.patch, HIVE-20150.4.patch, 
> HIVE-20150.5.patch, HIVE-20150.6.patch, HIVE-20150.7.patch, 
> HIVE-20150.8.patch, HIVE-20150.9.patch
>
>
> The TopNKey operator is implemented in HIVE-17896, but the pushdown logic 
> needs more work. This issue covers the TopNKey pushdown implementation with 
> proper tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18876) Remove Superfluous Logging in Driver

2018-09-27 Thread Alice Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630976#comment-16630976
 ] 

Alice Fan commented on HIVE-18876:
--

Hi [~ngangam], this one can be committed as well. Thank you!

> Remove Superfluous Logging in Driver
> 
>
> Key: HIVE-18876
> URL: https://issues.apache.org/jira/browse/HIVE-18876
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Trivial
>  Labels: noob
> Attachments: HIVE-18876.1.patch, HIVE-18876.2.patch
>
>
> [https://github.com/apache/hive/blob/a4198f584aa0792a16d1e1eeb2ef3147403b8acb/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L2188-L2190]
> {code:java}
> if (console != null) {
>   console.printInfo("OK");
> }
> {code}
>  # Console can never be 'null'
>  # OK is not an informative logging message; in the HiveServer2 logs it 
> is often interwoven into the other logging and is pretty much useless on its 
> own without tracing back through the logs to see what happened before it. 
>  It is also printed out even if an error occurred.
> Please remove this block of code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-20648) LLAP: Vector group by operator should use memory per executor

2018-09-27 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-20648:



> LLAP: Vector group by operator should use memory per executor
> -
>
> Key: HIVE-20648
> URL: https://issues.apache.org/jira/browse/HIVE-20648
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> The HIVE-15503 treatment has to be applied to the vector group by operator as 
> well. Vector group by currently uses the MemoryMX bean to get heap usage and 
> max heap memory, which will not work for LLAP. Instead, it should use the 
> memory per executor as the upper bound for the flush decision.  
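
A sketch of what that flush decision could look like (illustrative names and 
threshold, not the actual patch): compare the aggregation hash table's 
footprint against the executor's memory share instead of the JVM-wide heap 
reported by the MemoryMX bean.

{code:java}
public class GroupByFlushSketch {
  static boolean shouldFlush(long hashTableBytes, long perExecutorBytes,
                             float flushFraction) {
    return hashTableBytes > (long) (perExecutorBytes * flushFraction);
  }

  public static void main(String[] args) {
    long perExecutor = (64L << 30) / 16;  // 4 GB share of a 64 GB daemon
    System.out.println(shouldFlush(3L << 30, perExecutor, 0.9f));  // false
    System.out.println(shouldFlush(4L << 30, perExecutor, 0.9f));  // true
  }
}
{code}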



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630974#comment-16630974
 ] 

Eugene Koifman commented on HIVE-16812:
---

Patch 7 fixes the typo.

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 4.0.0
>
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch, HIVE-16812.07.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16812:
--
Fix Version/s: 4.0.0
   Status: Open  (was: Patch Available)

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 4.0.0
>
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch, HIVE-16812.07.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16812:
--
Attachment: HIVE-16812.07.patch

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 4.0.0
>
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch, HIVE-16812.07.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-27 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-19302:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

The fix has been pushed to master. Thank you for your contribution, [~afan]. 
Closing the JIRA.

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> HIVE-19302.6.patch, HIVE-19302.7.patch, table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal that a user 
> fat-fingers a table name.  Yet, from the perspective of the Hive 
> administrator, the volume and severity of the logging suggest there was a 
> major issue.  Please change the logging to INFO level, and do not 
> present a stack trace, for such a trivial error.
>  
> See the attached file for a sample of what logging a single "table not found" 
> query generates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630955#comment-16630955
 ] 

Eugene Koifman commented on HIVE-16812:
---

I think it's because we normally drop the top-level struct from the data file, 
but here, since it comes from ORC directly, it's still there and accounted for.

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20150) TopNKey pushdown

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630942#comment-16630942
 ] 

Hive QA commented on HIVE-20150:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
54s{color} | {color:blue} ql in master has 2325 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 51 new + 38 unchanged - 0 
fixed = 89 total (was 38) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14086/dev-support/hive-personality.sh
 |
| git revision | master / 00dc4c7 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14086/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14086/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.10.patch, 
> HIVE-20150.11.patch, HIVE-20150.2.patch, HIVE-20150.4.patch, 
> HIVE-20150.5.patch, HIVE-20150.6.patch, HIVE-20150.7.patch, 
> HIVE-20150.8.patch, HIVE-20150.9.patch
>
>
> The TopNKey operator is implemented in HIVE-17896, but the pushdown logic 
> needs more work. This issue covers the TopNKey pushdown implementation with 
> proper tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630936#comment-16630936
 ] 

Sergey Shelukhin commented on HIVE-16812:
-

+1 
There's a typo in the stripe boundary detection somewhere: {{stipeindex}}.

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630937#comment-16630937
 ] 

Sergey Shelukhin commented on HIVE-16812:
-

I had one other question on the RB, just double-checking why the +1 is needed. I 
think other code uses these constants without it.

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19985) ACID: Skip decoding the ROW__ID sections for read-only queries

2018-09-27 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630916#comment-16630916
 ] 

Eugene Koifman commented on HIVE-19985:
---

Hive is on ORC 1.5.3 now, but this patch needs to be rebased.

> ACID: Skip decoding the ROW__ID sections for read-only queries 
> ---
>
> Key: HIVE-19985
> URL: https://issues.apache.org/jira/browse/HIVE-19985
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Eugene Koifman
>Priority: Major
>  Labels: Branch3Candidate
> Attachments: HIVE-19985.01.patch, HIVE-19985.04.patch, 
> HIVE-19985.05.patch
>
>
> For a base_n file there are no aborted transactions within the file, and if 
> there are no pending delete deltas, the entire ACID ROW__ID can be skipped 
> for all read-only queries (i.e. SELECT), though it still needs to be projected 
> out for MERGE, UPDATE and DELETE queries.
> This patch tries to entirely ignore the ACID ROW__ID fields for all tables 
> where there are no possible deletes or aborted transactions for an ACID split.
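
The decision in the description reduces to a small predicate; a sketch with 
hypothetical names (not the patch itself):

{code:java}
public class RowIdProjectionSketch {
  // ROW__ID must be projected unless the query is read-only and the split has
  // neither delete deltas nor potentially aborted transactions.
  static boolean needRowId(boolean readOnlyQuery, boolean hasDeleteDeltas,
                           boolean mayHaveAbortedTxns) {
    if (!readOnlyQuery) {
      return true;  // MERGE/UPDATE/DELETE always need ROW__ID
    }
    return hasDeleteDeltas || mayHaveAbortedTxns;
  }
}
{code}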



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-20523) Improve table statistics for Parquet format

2018-09-27 Thread George Pachitariu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630911#comment-16630911
 ] 

George Pachitariu edited comment on HIVE-20523 at 9/27/18 7:04 PM:
---

Hi [~kgyrtkirk], can you please have a look at this patch?

I have looked at the failed tests. They all failed because the expected raw 
data size has changed, which is the expected behaviour of this patch.


was (Author: george.pachitariu):
Hi Zoltan Haindrich, can you please have a look at this patch?

I have looked at the failed tests. They all failed because the expected raw 
data size has changed, which is the expected behaviour of this patch.

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, 
> HIVE-20523.6.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with an underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of ShuffleJoin, 
> which can fail with OOM errors.
> In this patch, I compute the column data sizes better, taking into account 
> complex structures. I followed the Writer implementation for the ORC format.
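
A minimal sketch of the recursive idea (the helper, type cases and per-type 
sizes here are assumptions for illustration, not the patch's actual code): 
size a value by descending into lists and maps instead of counting each row 
as 1.

{code:java}
import java.util.List;
import java.util.Map;

public class RawDataSizeSketch {
  static long estimate(Object value) {
    if (value == null) {
      return 1;                    // a null still occupies a slot
    } else if (value instanceof String) {
      return ((String) value).length();
    } else if (value instanceof Long || value instanceof Double) {
      return 8;
    } else if (value instanceof Integer || value instanceof Float) {
      return 4;
    } else if (value instanceof List) {
      long total = 0;
      for (Object e : (List<?>) value) {
        total += estimate(e);      // arrays contribute every element
      }
      return total;
    } else if (value instanceof Map) {
      long total = 0;
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        total += estimate(e.getKey()) + estimate(e.getValue());
      }
      return total;
    }
    return 8;                      // conservative default for other types
  }
}
{code}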



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630909#comment-16630909
 ] 

Eugene Koifman commented on HIVE-16812:
---

No related failures

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20523) Improve table statistics for Parquet format

2018-09-27 Thread George Pachitariu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630911#comment-16630911
 ] 

George Pachitariu commented on HIVE-20523:
--

Hi Zoltan Haindrich, can you please have a look at this patch?

I have looked at the failed tests. They all failed because the expected raw 
data size has changed, which is the expected behaviour of this patch.

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, 
> HIVE-20523.6.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with an underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of ShuffleJoin, 
> which can fail with OOM errors.
> In this patch, I compute the column data sizes better, taking into account 
> complex structures. I followed the Writer implementation for the ORC format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630908#comment-16630908
 ] 

Hive QA commented on HIVE-16812:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941456/HIVE-16812.06.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15005 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout (batchId=319)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14085/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14085/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14085/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941456 - PreCommit-HIVE-Build

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX 
> because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of the base we can find the min/max ROW__ID and only load events from 
> deltas that fall in the [min,max] range.  This will reduce the number of 
> delete events we load in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply, but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}.
> {{hive.acid.key.index}} in the Orc footer has an index of ROW__IDs, so we 
> should know the min/max easily for any file written by {{OrcRecordUpdater}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20640) Upgrade Hive to use ORC 1.5.3

2018-09-27 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630887#comment-16630887
 ] 

Eugene Koifman commented on HIVE-20640:
---

Looks like a spurious failure; it passes locally.
Committed to master.
Thanks Gopal for the review.

> Upgrade Hive to use ORC 1.5.3
> -
>
> Key: HIVE-20640
> URL: https://issues.apache.org/jira/browse/HIVE-20640
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20640.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20640) Upgrade Hive to use ORC 1.5.3

2018-09-27 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20640:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Upgrade Hive to use ORC 1.5.3
> -
>
> Key: HIVE-20640
> URL: https://issues.apache.org/jira/browse/HIVE-20640
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20640.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20619) Include MultiDelimitSerDe in HiveServer2 By Default

2018-09-27 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-20619:
-
Attachment: HIVE-20619.2.patch

> Include MultiDelimitSerDe in HiveServer2 By Default
> ---
>
> Key: HIVE-20619
> URL: https://issues.apache.org/jira/browse/HIVE-20619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-20619.2.patch
>
>
> In [HIVE-20020], the hive-contrib JAR file was removed from the HiveServer2 
> classpath.  With this change, the {{MultiDelimitSerDe}} is no longer 
> included.  This is fine, because {{MultiDelimitSerDe}} was a pain in that 
> environment anyway.  It was available to HiveServer2, and therefore would 
> work with a limited set of queries (select * from table limit 1), but any 
> other query on that table which launched a MapReduce job would fail 
> because the hive-contrib JAR file was not sent out with the rest of the Hive 
> JARs for MapReduce jobs.
> Please bring {{MultiDelimitSerDe}} back into the fold so that it's available 
> to users out of the box without having to install the hive-contrib JAR into 
> the HiveServer2 auxiliary directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20619) Include MultiDelimitSerDe in HiveServer2 By Default

2018-09-27 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-20619:
-
Attachment: (was: HIVE-20619.2.patch)

> Include MultiDelimitSerDe in HiveServer2 By Default
> ---
>
> Key: HIVE-20619
> URL: https://issues.apache.org/jira/browse/HIVE-20619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-20619.2.patch
>
>
> In [HIVE-20020], the hive-contrib JAR file was removed from the HiveServer2 
> classpath.  With this change, the {{MultiDelimitSerDe}} is no longer 
> included.  This is fine, because {{MultiDelimitSerDe}} was a pain in that 
> environment anyway.  It was available to HiveServer2, and therefore would 
> work with a limited set of queries (select * from table limit 1), but any 
> other query on that table which launched a MapReduce job would fail 
> because the hive-contrib JAR file was not sent out with the rest of the Hive 
> JARs for MapReduce jobs.
> Please bring {{MultiDelimitSerDe}} back into the fold so that it's available 
> to users out of the box without having to install the hive-contrib JAR into 
> the HiveServer2 auxiliary directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20619) Include MultiDelimitSerDe in HiveServer2 By Default

2018-09-27 Thread Alice Fan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-20619:
-
Attachment: (was: HIVE-20619.1.patch)

> Include MultiDelimitSerDe in HiveServer2 By Default
> ---
>
> Key: HIVE-20619
> URL: https://issues.apache.org/jira/browse/HIVE-20619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-20619.2.patch
>
>
> In [HIVE-20020], the hive-contrib JAR file was removed from the HiveServer2 
> classpath.  With this change, the {{MultiDelimitSerDe}} is no longer 
> included.  This is fine, because {{MultiDelimitSerDe}} was a pain in that 
> environment anyway.  It was available to HiveServer2, and therefore would 
> work with a limited set of queries (select * from table limit 1), but any 
> other query on that table which launched a MapReduce job would fail 
> because the hive-contrib JAR file was not sent out with the rest of the Hive 
> JARs for MapReduce jobs.
> Please bring {{MultiDelimitSerDe}} back into the fold so that it's available 
> to users out of the box without having to install the hive-contrib JAR into 
> the HiveServer2 auxiliary directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630854#comment-16630854
 ] 

Hive QA commented on HIVE-16812:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
34s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
56s{color} | {color:blue} ql in master has 2325 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
52s{color} | {color:red} ql: The patch generated 34 new + 1455 unchanged - 10 
fixed = 1489 total (was 1465) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m  
9s{color} | {color:red} ql generated 1 new + 2323 unchanged - 2 fixed = 2324 
total (was 2325) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 39s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Redundant nullcheck of keyIndex, which is known to be non-null in 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.findMinMaxKeys(OrcSplit,
 Configuration, Reader$Options)  Redundant null check at 
VectorizedOrcAcidRowBatchReader.java:is known to be non-null in 
org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.findMinMaxKeys(OrcSplit,
 Configuration, Reader$Options)  Redundant null check at 
VectorizedOrcAcidRowBatchReader.java:[line 394] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14085/dev-support/hive-personality.sh
 |
| git revision | master / fb7291a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14085/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14085/yetus/new-findbugs-ql.html
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14085/yetus/patch-asflicense-problems.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14085/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: 

[jira] [Updated] (HIVE-18778) Needs to capture input/output entities in explain

2018-09-27 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-18778:
--
Attachment: HIVE-18778.11.branch-3.1.patch

> Needs to capture input/output entities in explain
> -
>
> Key: HIVE-18778
> URL: https://issues.apache.org/jira/browse/HIVE-18778
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-18778-SparkPositive.patch, HIVE-18778.1.patch, 
> HIVE-18778.10.branch-3.patch, HIVE-18778.11.branch-3.1.patch, 
> HIVE-18778.11.branch-3.patch, HIVE-18778.2.patch, HIVE-18778.3.patch, 
> HIVE-18778.4.patch, HIVE-18778.5.patch, HIVE-18778.6.patch, 
> HIVE-18778.7.patch, HIVE-18778.8.patch, HIVE-18778.9.branch-3.patch, 
> HIVE-18778.9.patch, HIVE-18778_TestCliDriver.patch, 
> HIVE-18788_SparkNegative.patch, HIVE-18788_SparkPerf.patch
>
>
> With Sentry enabled, commands like {{explain drop table foo;}} fail:
> {code}
> Error: Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in input privileges
>  The required privileges: (state=42000,code=4)
> {code}
> Sentry fails to authorize because the ExplainSemanticAnalyzer uses an 
> instance of DDLSemanticAnalyzer to analyze the explain query.
> {code}
> BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(conf, input);
> sem.analyze(input, ctx);
> sem.validate()
> {code}
> The input/output entities for this query are set in the above code. 
> However, they are never set on the instance of ExplainSemanticAnalyzer 
> itself and thus are not propagated into the HookContext in the calling Driver 
> code.
> {code}
> sem.analyze(tree, ctx); --> this results in calling the above code that uses 
> DDLSA
> hookCtx.update(sem); --> sem is an instance of ExplainSemanticAnalyzer, this 
> code attempts to update the HookContext with the input/output info from ESA 
> which is never set.
> {code}
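> A minimal, self-contained sketch of the gap described above (the class and 
> field names are illustrative stand-ins, not the actual Hive types): the outer 
> analyzer must copy the entity sets from the delegate it used for analysis, 
> otherwise the hook context sees empty sets.
> {code:java}
> import java.util.HashSet;
> import java.util.Set;
> 
> class Analyzer {
>     final Set<String> inputs = new HashSet<>();
>     final Set<String> outputs = new HashSet<>();
> }
> 
> public class ExplainPropagationSketch {
>     public static void main(String[] args) {
>         Analyzer ddlAnalyzer = new Analyzer();      // stands in for the inner DDL analyzer
>         ddlAnalyzer.inputs.add("default@foo");      // entity recorded during analyze()
> 
>         Analyzer explainAnalyzer = new Analyzer();  // stands in for the explain analyzer
>         // Without this copy, hookCtx.update(explainAnalyzer) sees no entities:
>         explainAnalyzer.inputs.addAll(ddlAnalyzer.inputs);
>         explainAnalyzer.outputs.addAll(ddlAnalyzer.outputs);
> 
>         System.out.println("inputs visible to hooks: " + explainAnalyzer.inputs);
>     }
> }
> {code}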



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18778) Needs to capture input/output entities in explain

2018-09-27 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630849#comment-16630849
 ] 

Daniel Dai commented on HIVE-18778:
---

The sysdb failure is caused by HIVE-20423; the mm_all failure has been there for 
a while. Will fix them separately.

Committed to branch-3.

> Needs to capture input/output entities in explain
> -
>
> Key: HIVE-18778
> URL: https://issues.apache.org/jira/browse/HIVE-18778
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-18778-SparkPositive.patch, HIVE-18778.1.patch, 
> HIVE-18778.10.branch-3.patch, HIVE-18778.11.branch-3.patch, 
> HIVE-18778.2.patch, HIVE-18778.3.patch, HIVE-18778.4.patch, 
> HIVE-18778.5.patch, HIVE-18778.6.patch, HIVE-18778.7.patch, 
> HIVE-18778.8.patch, HIVE-18778.9.branch-3.patch, HIVE-18778.9.patch, 
> HIVE-18778_TestCliDriver.patch, HIVE-18788_SparkNegative.patch, 
> HIVE-18788_SparkPerf.patch
>
>
> With Sentry enabled, commands like {{explain drop table foo;}} fail:
> {code}
> Error: Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in input privileges
>  The required privileges: (state=42000,code=4)
> {code}
> Sentry fails to authorize because the ExplainSemanticAnalyzer uses an 
> instance of DDLSemanticAnalyzer to analyze the explain query.
> {code}
> BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(conf, input);
> sem.analyze(input, ctx);
> sem.validate()
> {code}
> The input/output entities for this query are set in the above code. 
> However, they are never set on the instance of ExplainSemanticAnalyzer 
> itself and thus are not propagated into the HookContext in the calling Driver 
> code.
> {code}
> sem.analyze(tree, ctx); --> this results in calling the above code that uses 
> DDLSA
> hookCtx.update(sem); --> sem is an instance of ExplainSemanticAnalyzer, this 
> code attempts to update the HookContext with the input/output info from ESA 
> which is never set.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-20319) group by and union all always generate empty query result

2018-09-27 Thread Alice Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630821#comment-16630821
 ] 

Alice Fan edited comment on HIVE-20319 at 9/27/18 5:57 PM:
---

HIVE-12812 was committed to the master branch today. Let me know if you still see 
the same error; if not, can we close this Jira? Thank you for reporting the issue. 


was (Author: afan):
HIVE-12812 is committed to master branch today. Let me know if you still see 
the same error and can we close this Jira. Thank you for reporting the issue. 

> group by and union all always generate empty query result
> -
>
> Key: HIVE-20319
> URL: https://issues.apache.org/jira/browse/HIVE-20319
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.3.2
> Environment: Run on MR, hadoop 2.7.3
>Reporter: Wang Yan
>Priority: Blocker
>
> The following query always generates an empty result, which is wrong.
> {code:sql}
> create table if not exists test_table(column1 string, column2 int);
> insert into test_table values('a',1),('b',2);
> set hive.optimize.union.remove=true;
> select column1 from test_table group by column1
> union all
> select column1 from test_table group by column1;
> {code}
> Actual result : empty
> Expected result: 
> {code:java}
> a
> b
> a
> b
> {code}
> Note that the correct result is generated when 
> hive.optimize.union.remove is set to false.
> It seems like the fix in HIVE-12788 is insufficient.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20523) Improve table statistics for Parquet format

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630800#comment-16630800
 ] 

Hive QA commented on HIVE-20523:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941445/HIVE-20523.6.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 15000 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nested_column_pruning] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_analyze] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_join] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_no_row_serde] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_struct_type_vectorization]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_non_dictionary_encoding_vectorization]
 (batchId=91)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_vectorization]
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_decimal_date]
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part_project]
 (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_numeric_overflows]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_parquet_projection]
 (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types]
 (batchId=71)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time]
 (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=188)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_join] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_decimal_date]
 (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_part_project]
 (batchId=126)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_parquet_projection]
 (batchId=130)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14084/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14084/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14084/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941445 - PreCommit-HIVE-Build

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, 
> HIVE-20523.6.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with an underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of 
> ShuffleJoin, which can fail with OOM errors.
> In this patch, I compute the columns' data size more accurately, taking into 
> account complex structures. I followed the Writer implementation for the ORC format.
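> A rough, self-contained illustration of the sizing idea (plain Java, not the 
> actual patch): the raw data size is accumulated recursively over nested values 
> instead of counting 1 per row.
> {code:java}
> import java.util.Arrays;
> import java.util.List;
> 
> public class RawSizeSketch {
>     // Assumed per-type costs, loosely following fixed-width primitive sizes.
>     static long estimate(Object value) {
>         if (value == null) return 0;
>         if (value instanceof Integer) return 4;
>         if (value instanceof Long) return 8;
>         if (value instanceof String) return ((String) value).length();
>         if (value instanceof List) {               // complex type: sum the elements
>             long total = 0;
>             for (Object element : (List<?>) value) {
>                 total += estimate(element);
>             }
>             return total;
>         }
>         return 1;                                  // unknown type: old behaviour
>     }
> 
>     public static void main(String[] args) {
>         Object row = Arrays.asList("abc", Arrays.asList(1, 2, 3), 42L);
>         System.out.println("estimated raw data size: " + estimate(row)); // 3 + 12 + 8 = 23
>     }
> }
> {code}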



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-27 Thread Alice Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630824#comment-16630824
 ] 

Alice Fan commented on HIVE-19302:
--

[~ngangam]  All tests passed in the pre-commit build. Can we commit the change? 
Thanks.

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> HIVE-19302.6.patch, HIVE-19302.7.patch, table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal that a user 
> fat-fingers a table name.  Yet, from the perspective of the Hive 
> administrator, the volume and severity of the logging suggested a major 
> issue.  Please change the logging to INFO level, and do not 
> present a stack trace, for such a trivial error.
>  
> See the attached file for a sample of what logging a single "table not found" 
> query generates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20319) group by and union all always generate empty query result

2018-09-27 Thread Alice Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630821#comment-16630821
 ] 

Alice Fan commented on HIVE-20319:
--

HIVE-12812 is committed to master branch today. Let me know if you still see 
the same error and can we close this Jira. Thank you for reporting the issue. 

> group by and union all always generate empty query result
> -
>
> Key: HIVE-20319
> URL: https://issues.apache.org/jira/browse/HIVE-20319
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.3.2
> Environment: Run on MR, hadoop 2.7.3
>Reporter: Wang Yan
>Priority: Blocker
>
> The following query always generates an empty result, which is wrong.
> {code:sql}
> create table if not exists test_table(column1 string, column2 int);
> insert into test_table values('a',1),('b',2);
> set hive.optimize.union.remove=true;
> select column1 from test_table group by column1
> union all
> select column1 from test_table group by column1;
> {code}
> Actual result : empty
> Expected result: 
> {code:java}
> a
> b
> a
> b
> {code}
> Note that the correct result is generated when 
> hive.optimize.union.remove is set to false.
> It seems like the fix in HIVE-12788 is insufficient.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-12812) Enable mapred.input.dir.recursive by default to support union with aggregate function

2018-09-27 Thread Alice Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630808#comment-16630808
 ] 

Alice Fan commented on HIVE-12812:
--

Thank you [~ychena] :)

> Enable mapred.input.dir.recursive by default to support union with aggregate 
> function
> -
>
> Key: HIVE-12812
> URL: https://issues.apache.org/jira/browse/HIVE-12812
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Chaoyu Tang
>Assignee: Alice Fan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-12812.1.patch, HIVE-12812.2.patch, 
> HIVE-12812.patch, HIVE-12812.patch, HIVE-12812.patch
>
>
> When the union remove optimization is enabled, a union query with an aggregate 
> function writes its subquery intermediate results to subdirectories, which need 
> mapred.input.dir.recursive to be enabled in order to be fetched. This 
> property is not defined by default in Hive and is often ignored by users, which 
> causes query failures that are hard to debug.
> So we need to set mapred.input.dir.recursive to true whenever the union remove 
> optimization is enabled (see the sketch below).
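> A minimal sketch of the intended coupling, using a plain Hadoop 
> {{Configuration}} object (the real fix lives inside Hive's own configuration 
> handling): whenever union remove is enabled, recursive input listing is forced 
> on as well.
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> 
> public class UnionRemoveConfSketch {
>     public static void main(String[] args) {
>         Configuration conf = new Configuration(false);
>         conf.setBoolean("hive.optimize.union.remove", true);
> 
>         // Subquery results land in subdirectories, so they are only visible
>         // to the fetch task when recursive listing is enabled too.
>         if (conf.getBoolean("hive.optimize.union.remove", false)) {
>             conf.setBoolean("mapred.input.dir.recursive", true);
>         }
>         System.out.println(conf.get("mapred.input.dir.recursive"));
>     }
> }
> {code}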



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18602) Implement Partition By Hash Index

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-18602:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> Implement Partition By Hash Index
> -
>
> Key: HIVE-18602
> URL: https://issues.apache.org/jira/browse/HIVE-18602
> Project: Hive
>  Issue Type: New Feature
>Reporter: BELUGA BEHR
>Assignee: Peter Vary
>Priority: Minor
>
> Borrowing the concept from MySQL.  This would also save us from having random 
> column values in the HDFS partition file path since the HASH value would be 
> hex and each one would be the same length.
>  
> https://dev.mysql.com/doc/refman/5.7/en/partitioning-hash.html
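> A toy sketch of the borrowed behaviour (the helper below is hypothetical, 
> neither Hive nor MySQL API): the partition a row lands in is the hash of the 
> column value modulo the partition count, so every directory component is a 
> fixed-length hex bucket.
> {code:java}
> public class HashPartitionSketch {
>     static String bucketFor(String value, int numPartitions) {
>         int bucket = Math.floorMod(value.hashCode(), numPartitions);
>         return String.format("part=%04x", bucket);   // fixed-length hex path component
>     }
> 
>     public static void main(String[] args) {
>         for (String v : new String[]{"2018-09-27", "a very long partition value", "x/y"}) {
>             System.out.println(v + " -> " + bucketFor(v, 16));
>         }
>     }
> }
> {code}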



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20459) add ThriftHiveMetastore.get_open_txns(long txnid)

2018-09-27 Thread Igor Kryvenko (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630744#comment-16630744
 ] 

Igor Kryvenko commented on HIVE-20459:
--

[~ekoifman] Hi Eugene. Could you review, please?

Thanks, Ihor.

> add ThriftHiveMetastore.get_open_txns(long txnid)
> -
>
> Key: HIVE-20459
> URL: https://issues.apache.org/jira/browse/HIVE-20459
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Minor
> Attachments: HIVE-20459.01.patch, HIVE-20459.02.patch, 
> HIVE-20459.03.patch, HIVE-20459.04.patch
>
>
> we currently have {{ThriftHiveMetastore.get_open_txns()}} which maps to 
> {{TxnHandler.getOpenTxns()}}.  The usual usage is 
> {{TxnUtils.createValidReadTxnList(GetOpenTxnsResponse txns, long 
> currentTxn)}} where the complete list of transactions is obtained from the 
> Metastore and then anything above currentTxn is thrown away.  
> It would be useful to add {{ThriftHiveMetastore.get_open_txns(long txnid)}} and 
> {{TxnHandler.getOpenTxns(long)}} so that we do not retrieve things that will be 
> thrown away, especially when there are a lot of running transactions.
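> A self-contained sketch of the proposed contract (names are illustrative; the 
> real change would go into the Thrift IDL and TxnHandler): the server filters 
> out ids above the caller's current transaction instead of shipping the full list.
> {code:java}
> import java.util.Arrays;
> import java.util.List;
> import java.util.stream.Collectors;
> 
> public class OpenTxnsFilterSketch {
>     // Stand-in for the proposed getOpenTxns(long): only ids <= currentTxn are returned.
>     static List<Long> getOpenTxns(List<Long> allOpenTxns, long currentTxn) {
>         return allOpenTxns.stream()
>                 .filter(txnId -> txnId <= currentTxn)
>                 .collect(Collectors.toList());
>     }
> 
>     public static void main(String[] args) {
>         List<Long> open = Arrays.asList(5L, 9L, 12L, 40L, 41L);
>         // Ids above 12 would previously be fetched and immediately thrown away.
>         System.out.println(getOpenTxns(open, 12L)); // [5, 9, 12]
>     }
> }
> {code}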



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20523) Improve table statistics for Parquet format

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630738#comment-16630738
 ] 

Hive QA commented on HIVE-20523:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
0s{color} | {color:blue} ql in master has 2325 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} ql: The patch generated 0 new + 10 unchanged - 4 
fixed = 10 total (was 14) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14084/dev-support/hive-personality.sh
 |
| git revision | master / b382391 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14084/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, 
> HIVE-20523.6.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with underestimated raw data size makes Hive assign less 
> containers (mappers/reducers) to it, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of the 
> ShuffleJoin that can fail with OOM errors.
> In this patch, I compute the columns data size better, taking into account 
> complex structures. I followed the Writer implementation for the ORC format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-12812) Enable mapred.input.dir.recursive by default to support union with aggregate function

2018-09-27 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-12812:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed the fix into master.

> Enable mapred.input.dir.recursive by default to support union with aggregate 
> function
> -
>
> Key: HIVE-12812
> URL: https://issues.apache.org/jira/browse/HIVE-12812
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Chaoyu Tang
>Assignee: Alice Fan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-12812.1.patch, HIVE-12812.2.patch, 
> HIVE-12812.patch, HIVE-12812.patch, HIVE-12812.patch
>
>
> When the union remove optimization is enabled, a union query with an aggregate 
> function writes its subquery intermediate results to subdirectories, which need 
> mapred.input.dir.recursive to be enabled in order to be fetched. This 
> property is not defined by default in Hive and is often ignored by users, which 
> causes query failures that are hard to debug.
> So we need to set mapred.input.dir.recursive to true whenever the union remove 
> optimization is enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20647) HadoopVer was ignored in QTestUtil

2018-09-27 Thread denys kuzmenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

denys kuzmenko updated HIVE-20647:
--
Attachment: HIVE-20647.1.patch

> HadoopVer was ignored in QTestUtil
> --
>
> Key: HIVE-20647
> URL: https://issues.apache.org/jira/browse/HIVE-20647
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20647.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20647) HadoopVer was ignored in QTestUtil

2018-09-27 Thread denys kuzmenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

denys kuzmenko updated HIVE-20647:
--
Attachment: (was: HIVE-20647.1.patch)

> HadoopVer was ignored in QTestUtil
> --
>
> Key: HIVE-20647
> URL: https://issues.apache.org/jira/browse/HIVE-20647
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-16889) Improve Performance Of VARCHAR

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-16889:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> Improve Performance Of VARCHAR
> --
>
> Key: HIVE-16889
> URL: https://issues.apache.org/jira/browse/HIVE-16889
> Project: Hive
>  Issue Type: Improvement
>  Components: Types
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Peter Vary
>Priority: Major
>
> Oftentimes, organizations use tools that create table schemas on the fly and 
> specify a VARCHAR column with precision.  In these scenarios, performance 
> suffers even though one could assume it should be better, since there is 
> pre-existing knowledge about the size of the data and buffers could be set up 
> more efficiently than in the case where no such knowledge exists.
> Most of the performance cost seems to come from reading a STRING from a file 
> into a byte buffer, checking the length of the STRING, truncating the STRING 
> if needed, and then serializing the STRING back into bytes again (a sketch of 
> this hot path follows the list below).
> From the code, I have identified several areas where developers left notes 
> about later improvements.
> # org.apache.hadoop.hive.serde2.io.HiveVarcharWritable.enforceMaxLength(int)
> # org.apache.hadoop.hive.serde2.lazy.LazyHiveVarchar.init(ByteArrayRef, int, 
> int)
> # 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getHiveVarchar(Object,
>  PrimitiveObjectInspector)
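> A small sketch of the hot path in plain Java (not the Hive serde code): the 
> common case where the value already fits should avoid any copy or 
> re-serialization, and only over-length values pay for a truncation.
> {code:java}
> public class VarcharEnforceSketch {
>     static String enforceMaxLength(String value, int maxChars) {
>         if (value.codePointCount(0, value.length()) <= maxChars) {
>             return value;                       // fast path: no new allocation
>         }
>         int end = value.offsetByCodePoints(0, maxChars);
>         return value.substring(0, end);         // slow path: truncate once
>     }
> 
>     public static void main(String[] args) {
>         System.out.println(enforceMaxLength("hello", 10));  // unchanged reference
>         System.out.println(enforceMaxLength("hello", 3));   // "hel"
>     }
> }
> {code}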



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-17166) Break Locks On Timeout

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-17166:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> Break Locks On Timeout
> --
>
> Key: HIVE-17166
> URL: https://issues.apache.org/jira/browse/HIVE-17166
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Affects Versions: 1.2.2, 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Peter Vary
>Priority: Major
>
> Hive supports table and partition locks by utilizing ZooKeeper to keep track. 
>  There are several configurations related to this, including: 
> {{hive.lock.sleep.between.retries}} and {{hive.lock.numretries}}.
> I'd like to propose a new boolean configuration that, when set, would alter 
> the behavior of a lock timeout failure.  Instead of failing the query with an 
> error, the configuration would direct Hive to break the lock and allow the 
> query to try to obtain a new one and proceed (see the sketch below).
> This is useful in cases where locks are leaked and erroneously left behind.  
> Such scenarios create permanent blocking on future queries.  This break-lock 
> mechanism would alleviate this issue.
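> A self-contained sketch of the proposed behaviour (the {{LockManager}} 
> interface and the break-on-timeout flag are hypothetical): once the configured 
> retries are exhausted, the stale lock is broken and acquisition is attempted 
> once more instead of failing the query.
> {code:java}
> public class BreakLockSketch {
>     interface LockManager {
>         boolean tryAcquire(String resource);
>         void forceRelease(String resource);    // "break" a leaked lock
>     }
> 
>     static boolean acquire(LockManager lm, String resource, int numRetries,
>                            long sleepMillis, boolean breakOnTimeout) throws InterruptedException {
>         for (int i = 0; i <= numRetries; i++) {
>             if (lm.tryAcquire(resource)) return true;
>             Thread.sleep(sleepMillis);         // hive.lock.sleep.between.retries
>         }
>         if (breakOnTimeout) {                  // new behaviour proposed above
>             lm.forceRelease(resource);
>             return lm.tryAcquire(resource);
>         }
>         return false;                          // old behaviour: the query fails
>     }
> }
> {code}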



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19098) Hive: impossible to insert data in a parquet's table with "union all" in the select query

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-19098:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> Hive: impossible to insert data in a parquet's table with "union all" in the 
> select query
> -
>
> Key: HIVE-19098
> URL: https://issues.apache.org/jira/browse/HIVE-19098
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Hive
>Affects Versions: 2.3.2
>Reporter: ACOSS
>Assignee: Peter Vary
>Priority: Minor
>
> Hello
> We have a Parquet table.
> We want to insert data into the table with a query like this:
> "insert into my_table select * from my_select_table_1 union all select * from 
> my_select_table_2"
> It fails with the error:
> 2018-04-03 15:49:28,898 FATAL [IPC Server handler 2 on 38465] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
> attempt_1522749003448_0028_m_00_0 - exited : java.io.IOException: 
> java.lang.reflect.InvocationTargetException
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>  at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:271)
>  at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:217)
>  at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:345)
>  at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:695)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:257)
>  ... 11 more
> Caused by: java.lang.NullPointerException
>  at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.ProjectionPusher.pushProjectionsAndFilters(ProjectionPusher.java:118)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.ProjectionPusher.pushProjectionsAndFilters(ProjectionPusher.java:189)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:75)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:75)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:60)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:75)
>  at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:99)
>  ... 16 more
>  
> Scenario:
> create table t1 (col1 string);
> create table t2 (col1 string);
> insert into t2 values ('2017');
> insert into t1 values ('2017');
> create table t3 (col1 string) STORED AS PARQUETFILE;
>  INSERT into t3 select col1 from t1 union all select col1 from t2; 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20639) Add ability to Write Data from Hive Table/Query to Kafka Topic

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630709#comment-16630709
 ] 

Hive QA commented on HIVE-20639:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941448/HIVE-20639.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14083/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14083/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14083/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12941448/HIVE-20639.2.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941448 - PreCommit-HIVE-Build

> Add ability to Write Data from Hive Table/Query to Kafka Topic
> --
>
> Key: HIVE-20639
> URL: https://issues.apache.org/jira/browse/HIVE-20639
> Project: Hive
>  Issue Type: New Feature
>  Components: kafka integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-20639.2.patch, HIVE-20639.patch
>
>
> This patch adds multiple record writers to allow Hive users to write data 
> directly to a Kafka topic.
> The writer provides multiple write-semantics modes (sketched below):
> * None: all records are delivered with no guarantees or retries.
> * At_least_once: each record is delivered with retries from the Kafka 
> producer and the Hive write task. 
> * Exactly_once: the writer uses the Kafka transaction API to ensure that 
> each record is delivered exactly once.
> In addition to the new feature, I have refactored the existing code to make it 
> more readable.
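> A condensed sketch of how the three modes map onto standard Kafka producer 
> settings (illustrative only; the patch wires this through Hive write tasks):
> {code:java}
> import java.util.Properties;
> import org.apache.kafka.clients.producer.KafkaProducer;
> import org.apache.kafka.clients.producer.ProducerRecord;
> 
> public class KafkaSemanticsSketch {
>     public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "localhost:9092");
>         props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
>         props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
> 
>         // None: acks=0 and no retries - fire and forget.
>         // At_least_once: acks=all plus producer/task retries.
>         // Exactly_once: transactional delivery, as below.
>         props.put("enable.idempotence", "true");
>         props.put("transactional.id", "hive-kafka-writer-task-0");
> 
>         try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
>             producer.initTransactions();
>             producer.beginTransaction();
>             producer.send(new ProducerRecord<>("my_topic", "row".getBytes()));
>             producer.commitTransaction();
>         }
>     }
> }
> {code}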



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630707#comment-16630707
 ] 

Hive QA commented on HIVE-19302:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941440/HIVE-19302.7.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14999 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14082/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14082/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14082/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941440 - PreCommit-HIVE-Build

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> HIVE-19302.6.patch, HIVE-19302.7.patch, table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal that a user 
> fat-fingers a table name.  Yet, from the perspective of the Hive 
> administrator, the volume and severity of the logging suggested a major 
> issue.  Please change the logging to INFO level, and do not 
> present a stack trace, for such a trivial error.
>  
> See the attached file for a sample of what logging a single "table not found" 
> query generates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18603) Use Hash For Partition HDFS File Path

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-18603:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> Use Hash For Partition HDFS File Path
> -
>
> Key: HIVE-18603
> URL: https://issues.apache.org/jira/browse/HIVE-18603
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 1.2.0, 2.3.0, 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: Peter Vary
>Priority: Minor
>
> Currently, for partitioned tables, Hive uses the literal value of each 
> partition in the HDFS file path.  Instead, perhaps we can use a hash value so 
> that:
>  
>  # The partition values are obscured from a casual observer in HDFS
>  # There is no chance of a very long HDFS file name when faced with a 
> very long partition value
>  # There is no need to worry about special characters in the partition path 
> name, as the hash value would consist only of hex characters.
>  
> The suggestion here is that we retain the partition values, just as is done 
> now, but the default HDFS location for each partition will use the hash of 
> the value instead of the value itself.
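> A minimal sketch of the proposed default location (the layout is 
> hypothetical): the stored partition value stays intact in the metastore, while 
> the directory name becomes a fixed-length hex digest of it.
> {code:java}
> import java.nio.charset.StandardCharsets;
> import java.security.MessageDigest;
> 
> public class PartitionPathSketch {
>     static String pathFor(String column, String value) throws Exception {
>         MessageDigest md = MessageDigest.getInstance("MD5");
>         byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
>         StringBuilder hex = new StringBuilder();
>         for (byte b : digest) hex.append(String.format("%02x", b));
>         return column + "=" + hex;   // fixed length, no special characters
>     }
> 
>     public static void main(String[] args) throws Exception {
>         System.out.println(pathFor("dt", "2018-09-27"));
>         System.out.println(pathFor("dt", "value/with:odd chars"));
>     }
> }
> {code}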



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-17941) Don't Re-Create RunningJob Client During Status Checks

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-17941:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> Don't Re-Create RunningJob Client During Status Checks
> --
>
> Key: HIVE-17941
> URL: https://issues.apache.org/jira/browse/HIVE-17941
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.3.1
>Reporter: BELUGA BEHR
>Assignee: Peter Vary
>Priority: Major
>
> {code:java|title=org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper}
> while (!rj.isComplete()) {
>   ...
> RunningJob newRj = jc.getJob(rj.getID());
> if (newRj == null) {
>   // under exceptional load, hadoop may not be able to look up status
>   // of finished jobs (because it has purged them from memory). From
>   // hive's perspective - it's equivalent to the job having failed.
>   // So raise a meaningful exception
>   throw new IOException("Could not find status of job:" + rj.getID());
> } else {
>   th.setRunningJob(newRj);
>   rj = newRj;
> }
>   }
>   ...
> }
> {code}
> https://github.com/apache/hive/blob/a9f25c0e7ad3f81a9f00f601947a161516e33f1b/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java#L295-L306
> Every time we loop here for a status update, we are rebuilding the RunningJob 
> object to test if the Job information is still loaded in YARN.  Rebuilding 
> this RunningJob object is not trivial because it requires that we re-load and 
> parse the Job Configuration XML file every time.
> {code:java|title=Outdated Stacktrace But Same Idea Holds}
> at java.io.FileInputStream.open(Native Method)
> at java.io.FileInputStream.<init>(FileInputStream.java:120)
> at 
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1924)
> at 
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1877)
> at 
> org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
> at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
> at 
> org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:1951)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:398)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:388)
> at 
> org.apache.hadoop.mapred.JobClient$NetworkedJob.<init>(JobClient.java:174)
> at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:655)
> at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:668)
> at 
> org.apache.hadoop.hive.ql.exec.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:282)
> at 
> org.apache.hadoop.hive.ql.exec.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:532)
> {code}
> Maybe we can use {{isRetired()}} instead for this particular check.  We 
> also probably need to be better about checking the return value from any of 
> the {{RunningJob}} methods if it's the case that they can fail/go-away at any 
> time if YARN purges the information.  It seems that perhaps this was an 
> attempt to detect a purged job before exercising the {{RunningJob}} object... 
> even though it can go bad at any point.
> https://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/mapred/RunningJob.html
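> A sketch of the cheaper polling loop suggested above, assuming 
> {{RunningJob.isRetired()}} behaves as documented for the cluster's Hadoop 
> version:
> {code:java}
> import java.io.IOException;
> import org.apache.hadoop.mapred.RunningJob;
> 
> public class StatusPollSketch {
>     static void waitForCompletion(RunningJob rj) throws IOException, InterruptedException {
>         while (!rj.isComplete()) {
>             if (rj.isRetired()) {
>                 // YARN purged the job; the same situation the old code detected
>                 // by rebuilding the RunningJob (and re-parsing the job conf).
>                 throw new IOException("Could not find status of job:" + rj.getID());
>             }
>             Thread.sleep(1000);
>         }
>     }
> }
> {code}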



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18415) Lower "Updating Partition Stats" Logging Level

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-18415:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> Lower "Updating Partition Stats" Logging Level
> --
>
> Key: HIVE-18415
> URL: https://issues.apache.org/jira/browse/HIVE-18415
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.2, 2.2.0, 3.0.0, 2.3.2
>Reporter: BELUGA BEHR
>Assignee: Peter Vary
>Priority: Trivial
>
> {code:title=org.apache.hadoop.hive.metastore.utils.MetaStoreUtils}
> LOG.warn("Updating partition stats fast for: " + part.getTableName());
> ...
> LOG.warn("Updated size to " + params.get(StatsSetupConst.TOTAL_SIZE));
> {code}
> This logging produces many lines of WARN log messages in my log file and it's 
> not clear to me what the issue is here.  Why is this a warning and how should 
> I respond to address this warning?
> DEBUG is probably more appropriate for a utility class.  Please lower.
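> The requested change is a one-liner per call site; a sketch with SLF4J (the 
> logging facade Hive uses):
> {code:java}
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
> 
> public class StatsLoggingSketch {
>     private static final Logger LOG = LoggerFactory.getLogger(StatsLoggingSketch.class);
> 
>     void updatePartitionStats(String tableName, long totalSize) {
>         // Routine bookkeeping, not an actionable problem: log at DEBUG, not WARN.
>         LOG.debug("Updating partition stats fast for: {}", tableName);
>         LOG.debug("Updated size to {}", totalSize);
>     }
> }
> {code}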



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18512) Get Results ReadAhead

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-18512:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> Get Results ReadAhead
> -
>
> Key: HIVE-18512
> URL: https://issues.apache.org/jira/browse/HIVE-18512
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: Peter Vary
>Priority: Minor
>
> I don't have any data to back this up, but I wanted to put it on the radar.
> It may be possible to improve performance of HS2 with an HDFS read-ahead 
> reader for result data.  This would require adding a cache (configurable 
> size) to the Driver/Context object and adding a separate thread for loading 
> results asynchronously while the client is processing its current batch of 
> results.  It seems that currently, results are loaded on demand.
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L2298
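> A generic sketch of the idea (not Hive code): a background thread prefetches 
> the next batch into a bounded queue while the client consumes the current one.
> {code:java}
> import java.util.List;
> import java.util.concurrent.ArrayBlockingQueue;
> import java.util.concurrent.BlockingQueue;
> import java.util.function.Supplier;
> 
> public class ReadAheadSketch {
>     // cacheSize batches are fetched ahead of the consumer; fetchBatch returns
>     // an empty list at end of results (assumed convention for this sketch).
>     static BlockingQueue<List<String>> start(Supplier<List<String>> fetchBatch, int cacheSize) {
>         BlockingQueue<List<String>> cache = new ArrayBlockingQueue<>(cacheSize);
>         Thread prefetcher = new Thread(() -> {
>             try {
>                 List<String> batch;
>                 do {
>                     batch = fetchBatch.get();   // e.g. read the next rows from HDFS
>                     cache.put(batch);           // blocks while the cache is full
>                 } while (!batch.isEmpty());
>             } catch (InterruptedException e) {
>                 Thread.currentThread().interrupt();
>             }
>         });
>         prefetcher.setDaemon(true);
>         prefetcher.start();
>         return cache;                            // the consumer calls cache.take()
>     }
> }
> {code}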



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-18822) INSERT VALUES - HoS + Streaming File Format

2018-09-27 Thread Janaki Lahorani (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-18822:
--

Assignee: Peter Vary  (was: Janaki Lahorani)

> INSERT VALUES - HoS + Streaming File Format
> --
>
> Key: HIVE-18822
> URL: https://issues.apache.org/jira/browse/HIVE-18822
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Peter Vary
>Priority: Minor
>
> Please optimize the INSERT VALUES function.  When HoS is being used, and a 
> streaming format such as TEXT or AVRO is being used, INSERT VALUES 
> statements should be quick.  The HiveServer2 should pass the values to the 
> Executor, and the Executor should simply append the data to an existing HDFS 
> file instead of creating a new one.  This will reduce the number of small 
> files that exist in the file system... or perhaps the HiveServer2 performs 
> the append without having to first send the data to the processing engine at 
> all.
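> A minimal sketch of the append path (assuming the target filesystem supports 
> {{FileSystem.append}}, as HDFS does; the path below is made up):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> 
> public class InsertValuesAppendSketch {
>     public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>         FileSystem fs = FileSystem.get(conf);
>         Path existing = new Path("/warehouse/t/000000_0");
> 
>         // Append the new rows to an existing text file instead of
>         // creating one more small file per INSERT VALUES statement.
>         try (FSDataOutputStream out = fs.append(existing)) {
>             out.writeBytes("value1\tvalue2\n");
>         }
>     }
> }
> {code}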



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20623) Shared work: Extend sharing of map-join cache entries in LLAP

2018-09-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20623:
---
Attachment: HIVE-20623.02.patch

> Shared work: Extend sharing of map-join cache entries in LLAP
> -
>
> Key: HIVE-20623
> URL: https://issues.apache.org/jira/browse/HIVE-20623
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Logical Optimizer
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20623.01.patch, HIVE-20623.02.patch, 
> HIVE-20623.02.patch, HIVE-20623.02.patch, HIVE-20623.patch, 
> hash-shared-work.json.txt, hash-shared-work.svg
>
>
> For a query like this
> {code}
> with all_sales as (
> select ss_customer_sk as customer_sk, ss_ext_list_price-ss_ext_discount_amt 
> as ext_price from store_sales
> UNION ALL
> select ws_bill_customer_sk as customer_sk, 
> ws_ext_list_price-ws_ext_discount_amt as ext_price from web_sales
> UNION ALL
> select cs_bill_customer_sk as customer_sk, cs_ext_sales_price - 
> cs_ext_discount_amt as ext_price from catalog_sales)
> select sum(ext_price) total_price, c_customer_id from all_sales, customer 
> where customer_sk = c_customer_sk
> group by c_customer_id
> order by total_price desc 
> limit 100;
> {code}
> The hashtables used for all 3 joins are identical, yet the data is loaded 3 
> times in the same LLAP instance because the entries are named
> {code}
> cacheKey = "HASH_MAP_" + this.getOperatorId() + "_container";
> {code}
> in the cache.
> If those are identical in nature (i.e. vectorization, hashtable type, etc.), 
> then the duplication is just wasted CPU, memory and network - using the same 
> cache name for hashtables which will be identical in layout would be extremely 
> useful.
> In cases where the join is pushed through a UNION, those are identical.
> This optimization can only be done without concern for accidental delays when 
> the same upstream task is generating all of these hashtables, which is what 
> is achieved by the shared scan optimizer already.
> If the shared work is not present, this has potential downsides - if 
> two customer broadcasts were sourced from "Map 1" and "Map 2", the Map 1 
> builder will block the other task from reading from Map 2, even though Map 2 
> might have started after, but finished ahead of, Map 1.
> So this specific optimization can always be considered for cases where the 
> shared work unifies the operator tree and the parents of all the RS entries 
> involved are the same (and the RS layout is the same).
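> A sketch of the keying idea (illustrative only; the signature fields are 
> assumed): derive the cache key from what makes two hashtables interchangeable 
> rather than from the consuming operator's id, so identical builds collapse to 
> one entry.
> {code:java}
> import java.util.Objects;
> 
> public class CacheKeySketch {
>     // Fields that make two map-join hashtables interchangeable (assumed set).
>     static String cacheKey(String sourceVertex, String keyColumns, String hashTableImpl) {
>         return "HASH_MAP_" + Objects.hash(sourceVertex, keyColumns, hashTableImpl) + "_container";
>     }
> 
>     public static void main(String[] args) {
>         // All three joins read the same broadcast from the customer scan,
>         // so they now share a single cache entry instead of three.
>         System.out.println(cacheKey("Map 4", "c_customer_sk", "vectorized-fast"));
>         System.out.println(cacheKey("Map 4", "c_customer_sk", "vectorized-fast"));
>     }
> }
> {code}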



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20636) Improve number of null values estimation after outer join

2018-09-27 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-20636:
---
Attachment: HIVE-20636.01.patch

> Improve number of null values estimation after outer join
> -
>
> Key: HIVE-20636
> URL: https://issues.apache.org/jira/browse/HIVE-20636
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20636.01.patch, HIVE-20636.01.patch, 
> HIVE-20636.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630644#comment-16630644
 ] 

Hive QA commented on HIVE-19302:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
46s{color} | {color:blue} ql in master has 2325 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14082/dev-support/hive-personality.sh
 |
| git revision | master / b382391 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14082/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> HIVE-19302.6.patch, HIVE-19302.7.patch, table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal that a user 
> fat-fingers a table name.  Yet, from the perspective of the Hive 
> administrator, the volume and severity of the logging suggested a major 
> issue.  Please change the logging to INFO level, and do not 
> present a stack trace, for such a trivial error.
>  
> See the attached file for a sample of what logging a single "table not found" 
> query generates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20459) add ThriftHiveMetastore.get_open_txns(long txnid)

2018-09-27 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16630606#comment-16630606
 ] 

Hive QA commented on HIVE-20459:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941437/HIVE-20459.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14999 tests 
executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)

[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14081/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14081/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14081/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941437 - PreCommit-HIVE-Build

> add ThriftHiveMetastore.get_open_txns(long txnid)
> -
>
> Key: HIVE-20459
> URL: https://issues.apache.org/jira/browse/HIVE-20459
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Minor
> Attachments: HIVE-20459.01.patch, HIVE-20459.02.patch, 
> HIVE-20459.03.patch, HIVE-20459.04.patch
>
>
> we currently have {{ThriftHiveMetastore.get_open_txns()}} which maps to 
> {{TxnHandler.getOpenTxns()}}.  The usual usage is 
> {{TxnUtils.createValidReadTxnList(GetOpenTxnsResponse txns, long 
> currentTxn)}} where the complete list of transactions is obtained from the 
> Metastore and then anything above currentTxn is thrown away.  
> It would be useful to add {{ThriftHiveMetastore.get_open_txns(long txnid)}} and 
> {{TxnHandler.getOpenTxns(long)}} so that we do not retrieve things that will be 
> thrown away, especially when there are a lot of running transactions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

