[jira] [Updated] (HIVE-15788) Implement FastBloomFilter to use RoaringBitmap instead of long[]
[ https://issues.apache.org/jira/browse/HIVE-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murali Vemulapati updated HIVE-15788:
-------------------------------------
    Attachment: HIVE-15788.patch

Please review the patch.

> Implement FastBloomFilter to use RoaringBitmap instead of long[]
> ----------------------------------------------------------------
>
>                 Key: HIVE-15788
>                 URL: https://issues.apache.org/jira/browse/HIVE-15788
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Gopal V
>            Assignee: Murali Vemulapati
>         Attachments: HIVE-15788.patch
>
>
> Currently, a bloom filter which is all 1s occupies exactly the same amount
> of space as a bloom filter which is sparse.
> This is a waste of space: it produces memory pressure and generates a
> massive number of cache misses.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
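The description above can be made concrete with a short sketch. The class below is illustrative only (it is not Hive's actual FastBloomFilter nor the attached patch): it shows why a flat long[] bitset, the classic bloom filter backing store, costs the same number of bytes whether the filter is dense or sparse. The array is sized up front from the number of bits, which is exactly what a compressed structure such as RoaringBitmap avoids when few bits are set.

```java
// Illustrative sketch of a fixed-size long[] bitset, as used by a
// classic bloom filter. Its footprint is numBits/64 words no matter how
// many bits are actually set -- the waste HIVE-15788 targets.
public class FlatBitSet {
    private final long[] words;

    public FlatBitSet(int numBits) {
        this.words = new long[(numBits + 63) >> 6]; // one long per 64 bits
    }

    public void set(int bitPos) {
        words[bitPos >> 6] |= (1L << bitPos); // Java shifts are mod 64
    }

    public boolean get(int bitPos) {
        return (words[bitPos >> 6] & (1L << bitPos)) != 0;
    }

    // Constant size in bytes, independent of how sparse the filter is.
    public long sizeInBytes() {
        return 8L * words.length;
    }
}
```

A 1024-bit filter always holds 16 longs (128 bytes) here, even if only one bit is ever set; a roaring-style container would store that near-empty set in a handful of bytes.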
[jira] [Assigned] (HIVE-15788) Implement FastBloomFilter to use RoaringBitmap instead of long[]
[ https://issues.apache.org/jira/browse/HIVE-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murali Vemulapati reassigned HIVE-15788:
----------------------------------------
    Assignee: Murali Vemulapati

> Implement FastBloomFilter to use RoaringBitmap instead of long[]
> ----------------------------------------------------------------
>
>                 Key: HIVE-15788
>                 URL: https://issues.apache.org/jira/browse/HIVE-15788
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Gopal V
>            Assignee: Murali Vemulapati
>
>
> Currently, a bloom filter which is all 1s occupies exactly the same amount
> of space as a bloom filter which is sparse.
> This is a waste of space: it produces memory pressure and generates a
> massive number of cache misses.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-16497) FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated
[ https://issues.apache.org/jira/browse/HIVE-16497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980746#comment-15980746 ]

Hive QA commented on HIVE-16497:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12864705/HIVE-16497.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10628 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4857/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4857/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4857/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12864705 - PreCommit-HIVE-Build

> FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file
> system operations should be impersonated
> -------------------------------------------------------------------------
>
>                 Key: HIVE-16497
>                 URL: https://issues.apache.org/jira/browse/HIVE-16497
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16497.1.patch, HIVE-16497.2.patch
>
>
> FileUtils.isActionPermittedForFileHierarchy checks whether the user has
> permission to perform a given action. The checks are made by impersonating
> the user. However, the listing of child dirs is done as the hiveserver2
> user. If the hive user doesn't have permissions on the filesystem, it gives
> an incorrect error that the user doesn't have permissions to perform the
> action. Impersonating the end user for all file operations in that function
> is also the logically correct thing to do.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
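The fix described above boils down to running the permission check and the child-directory listing under the same impersonated identity. The sketch below is conceptual only: `runAs` is a hypothetical stand-in for Hadoop's `UserGroupInformation` proxy-user `doAs` mechanism, and none of these names come from the actual patch.

```java
import java.util.function.Supplier;

// Conceptual sketch of impersonation scoping (hypothetical names, not
// the HIVE-16497 patch). Everything executed inside runAs -- both the
// permission check and any directory listing -- sees the end user's
// identity; the service identity is restored afterwards.
public class ImpersonatedFs {
    static String currentUser = "hiveserver2"; // the service identity

    static <T> T runAs(String endUser, Supplier<T> action) {
        String previous = currentUser;
        currentUser = endUser;      // impersonate the end user
        try {
            return action.get();    // all FS ops here run as endUser
        } finally {
            currentUser = previous; // restore the service identity
        }
    }
}
```

The bug pattern was the inverse of this: the check ran under the proxy identity, but the recursive listing escaped the scope and ran as the service user.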
[jira] [Comment Edited] (HIVE-15859) HoS: Write RPC messages in event loop
[ https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980721#comment-15980721 ]

Yi Yao edited comment on HIVE-15859 at 4/24/17 5:41 AM:
--------------------------------------------------------

Hi [~lirui], I encountered the same issue in Hive 1.*. It would be great if the community could back-port the patch to Hive 1.*.

was (Author: yiyao):
Hi [~lirui], I encountered the same issue in Hive 1.*. It would be great if the community could back-port the patch to Hive 1.*.

> HoS: Write RPC messages in event loop
> -------------------------------------
>
>                 Key: HIVE-15859
>                 URL: https://issues.apache.org/jira/browse/HIVE-15859
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 2.1.1
>         Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>            Reporter: KaiXu
>            Assignee: Rui Li
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch, HIVE-15859.3.patch
>
>
> Hive on Spark failed with this error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> The application log shows the driver commanded a shutdown for some unknown
> reason, but Hive's log shows the Driver could not get the RPC header
> (Expected RPC header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat} > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in > stage 3.0 (TID 2519) > 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver > commanded a shutdown > 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared > 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown > (hsx-node1:42777) driver disconnected. > 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver > 192.168.1.1:42777 disassociated! Shutting down. > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in > stage 3.0 (TID 2511) > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Shutting down remote daemon. 
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remote daemon shut down; proceeding with flushing remote transports. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remoting shut down. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk8/yarn/nm/us
[jira] [Commented] (HIVE-16367) Null-safe equality <=> operator is not supported with CBO
[ https://issues.apache.org/jira/browse/HIVE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980731#comment-15980731 ] Hive QA commented on HIVE-16367: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864687/HIVE-16367.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10628 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[is_distinct_from] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_is_not_distinct_from] (batchId=158) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_predicate_pushdown] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[parquet_predicate_pushdown] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_group_by] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_nullsafe_join] (batchId=158) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_nullsafe] (batchId=129) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4856/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4856/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4856/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing 
org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12864687 - PreCommit-HIVE-Build

> Null-safe equality <=> operator is not supported with CBO
> ---------------------------------------------------------
>
>                 Key: HIVE-16367
>                 URL: https://issues.apache.org/jira/browse/HIVE-16367
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>         Attachments: HIVE-16367.1.patch
>
>
> Calcite doesn't support such an equality operator, so Hive bails out and
> goes through the non-CBO path. This could restrict its usage with
> subqueries and other CBO-only features.
> Since {{<=>}} is equivalent to {{is not distinct from}} (HIVE-15986), we
> can rewrite {{<=>}} as {{is not distinct from}} and enable CBO.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
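For readers unfamiliar with `<=>`, the rewrite works because null-safe equality has the same truth table as `IS NOT DISTINCT FROM`: two NULLs compare equal, and a NULL against a non-NULL value compares unequal, whereas plain `=` yields NULL in both cases. A minimal Java sketch of that truth table (illustrative only, not Hive code):

```java
// Semantics of "<=>" / IS NOT DISTINCT FROM, modeling SQL NULL as a
// Java null reference. Plain "=" would be three-valued here; the
// null-safe operator always returns a definite true or false.
public class NullSafeEq {
    static boolean nullSafeEquals(Object a, Object b) {
        if (a == null || b == null) {
            return a == b;      // true only when BOTH sides are NULL
        }
        return a.equals(b);     // ordinary equality otherwise
    }
}
```

Because the two forms agree on every input, substituting `is not distinct from` for `<=>` before handing the plan to Calcite preserves query semantics.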
[jira] [Commented] (HIVE-15859) HoS: Write RPC messages in event loop
[ https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980721#comment-15980721 ]

Yi Yao commented on HIVE-15859:
-------------------------------

Hi [~lirui], I encountered the same issue in Hive 1.*. It would be great if the community could back-port the patch to Hive 1.*.

> HoS: Write RPC messages in event loop
> -------------------------------------
>
>                 Key: HIVE-15859
>                 URL: https://issues.apache.org/jira/browse/HIVE-15859
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 2.1.1
>         Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>            Reporter: KaiXu
>            Assignee: Rui Li
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch, HIVE-15859.3.patch
>
>
> Hive on Spark failed with this error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> The application log shows the driver commanded a shutdown for some unknown
> reason, but Hive's log shows the Driver could not get the RPC header
> (Expected RPC header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat} > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in > stage 3.0 (TID 2519) > 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver > commanded a shutdown > 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared > 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown > (hsx-node1:42777) driver disconnected. > 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver > 192.168.1.1:42777 disassociated! Shutting down. > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in > stage 3.0 (TID 2511) > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Shutting down remote daemon. 
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remote daemon shut down; proceeding with flushing remote transports. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remoting shut down. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3 > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in > stage 3.0 (TID 2515) > 17/0
[jira] [Commented] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats
[ https://issues.apache.org/jira/browse/HIVE-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980707#comment-15980707 ] Hive QA commented on HIVE-16495: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864686/HIVE-16495.01.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 180 failed/errored test(s), 10630 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=237) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_file_format] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_clusterby_sortby] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_skewed_table] (batchId=34) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_add_partition] (batchId=17) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_not_sorted] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_parts] (batchId=45) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[binary_output_format] (batchId=81) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket1] (batchId=40) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket2] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark1] (batchId=64) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark2] (batchId=2) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark3] (batchId=42) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin5] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative2] (batchId=65) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[column_names_with_leading_and_trailing_spaces] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_infinity] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compustat_avro] (batchId=83) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_alter_list_bucketing_table1] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like2] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_tbl_props] (batchId=69) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_view] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_skewed_table1] (batchId=18) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_table_like_stats] (batchId=56) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_with_constraints] (batchId=64) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_location] (batchId=82) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[default_file_format] (batchId=21) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_comment_indent] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_comment_nonascii] (batchId=13) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_formatted_view_partitioned] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_syntax] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] (batchId=41) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[display_colstats_tbllvl] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic1] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic2] (batchId=11) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_intervals] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_timeseries] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_topn] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_map_ppr] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_map_ppr_multi_distinct] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_ppr] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_ppr_multi_distinct] (batchId=54) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_6] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input_part1] (batchId=
[jira] [Updated] (HIVE-16497) FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated
[ https://issues.apache.org/jira/browse/HIVE-16497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-16497: - Attachment: HIVE-16497.2.patch Fixing the test failures. > FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file > system operations should be impersonated > -- > > Key: HIVE-16497 > URL: https://issues.apache.org/jira/browse/HIVE-16497 > Project: Hive > Issue Type: Bug > Components: Authorization >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Fix For: 3.0.0 > > Attachments: HIVE-16497.1.patch, HIVE-16497.2.patch > > > FileUtils.isActionPermittedForFileHierarchy checks if user has permissions > for given action. The checks are made by impersonating the user. > However, the listing of child dirs are done as the hiveserver2 user. If the > hive user doesn't have permissions on the filesystem, it gives incorrect > error that the user doesn't have permissions to perform the action. > Impersonating the end user for all file operations in that function is also > logically correct thing to do. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16503) LLAP: Oversubscribe memory for noconditional task size
[ https://issues.apache.org/jira/browse/HIVE-16503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980680#comment-15980680 ] Hive QA commented on HIVE-16503: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864689/HIVE-16503.2.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10629 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoinopt1] (batchId=74) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4854/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4854/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4854/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12864689 - PreCommit-HIVE-Build > LLAP: Oversubscribe memory for noconditional task size > -- > > Key: HIVE-16503 > URL: https://issues.apache.org/jira/browse/HIVE-16503 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-16503.1.patch, HIVE-16503.2.patch > > > When running map joins in llap, it can potentially use more memory for hash > table loading (assuming other executors in the daemons have some memory to > spare). 
This map-join conversion decision has to be made during compilation, which can provide some more room for LLAP.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
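As a rough illustration of the idea only (the names and formula below are assumptions chosen for exposition, not the actual HIVE-16503 patch), the compile-time map-join size threshold could be inflated by an oversubscription factor for each executor in the daemon that is expected to have spare memory:

```java
// Hypothetical sketch of compile-time memory oversubscription for
// map-join conversion in LLAP. All names and the formula are made up
// for illustration; the real patch may compute this differently.
public class OversubscribedThreshold {
    static long effectiveThreshold(long noconditionalTaskSize,
                                   double oversubscriptionFactor,
                                   int spareExecutorSlots) {
        // e.g. threshold 100 (arbitrary units), factor 0.2, 3 spare
        // slots -> an inflated threshold of 160
        double inflated = noconditionalTaskSize
                * (1.0 + oversubscriptionFactor * spareExecutorSlots);
        return (long) inflated;
    }
}
```

The point of doing this at compile time is that the planner, not the runtime, decides whether a join becomes a map join, so the extra headroom must be visible when the plan is generated.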
[jira] [Updated] (HIVE-16510) Vectorization: Add vectorized PTF tests in preparation for HIVE-16369
[ https://issues.apache.org/jira/browse/HIVE-16510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16510: Status: In Progress (was: Patch Available) > Vectorization: Add vectorized PTF tests in preparation for HIVE-16369 > - > > Key: HIVE-16510 > URL: https://issues.apache.org/jira/browse/HIVE-16510 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16510.01.patch > > > Had trouble with HIVE-16369 patch being blocked by Apache SPAM filters -- so > separating out adding vectorized versions of current windowing_*.q tests. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format
[ https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Ma updated HIVE-16465: Attachment: HIVE-16465-branch-2.3.001.patch Update patch for branch 2.3 > NullPointer Exception when enable vectorization for Parquet file format > --- > > Key: HIVE-16465 > URL: https://issues.apache.org/jira/browse/HIVE-16465 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0 > > Attachments: HIVE-16465.001.patch, HIVE-16465-branch-2.3.001.patch > > > NullPointer Exception when enable vectorization for Parquet file format. It > is caused by the null value of the InputSplit. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16287) Alter table partition rename with location - moves partition back to hive warehouse
[ https://issues.apache.org/jira/browse/HIVE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980660#comment-15980660 ] Rui Li commented on HIVE-16287: --- QA on branch-1 has been broken for a long time. I think we can commit given no new failure is introduced. > Alter table partition rename with location - moves partition back to hive > warehouse > --- > > Key: HIVE-16287 > URL: https://issues.apache.org/jira/browse/HIVE-16287 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.1.0 > Environment: RHEL 6.8 >Reporter: Ying Chen >Assignee: Vihang Karajgaonkar >Priority: Minor > Fix For: 1.3.0, 2.3.0, 3.0.0, 2.4.0 > > Attachments: HIVE-16287.01.patch, HIVE-16287.02.patch, > HIVE-16287.03.patch, HIVE-16287.04.patch, HIVE-16287.05-branch-1.patch, > HIVE-16287-addedum.06.patch, HIVE-16287.branch-1.01.patch > > Original Estimate: 48h > Remaining Estimate: 48h > > I was renaming my partition in a table that I've created using the location > clause, and noticed that when after rename is completed, my partition is > moved to the hive warehouse (hive.metastore.warehouse.dir). 
> {quote} > create table test_local_part (col1 int) partitioned by (col2 int) location > '/tmp/testtable/test_local_part'; > insert into test_local_part partition (col2=1) values (1),(3); > insert into test_local_part partition (col2=2) values (3); > alter table test_local_part partition (col2='1') rename to partition > (col2='4'); > {quote} > Running: >describe formatted test_local_part partition (col2='2') > # Detailed Partition Information > Partition Value: [2] > Database: default > Table:test_local_part > CreateTime: Mon Mar 20 13:25:28 PDT 2017 > LastAccessTime: UNKNOWN > Protect Mode: None > Location: > *hdfs://my.server.com:8020/tmp/testtable/test_local_part/col2=2* > Running: >describe formatted test_local_part partition (col2='4') > # Detailed Partition Information > Partition Value: [4] > Database: default > Table:test_local_part > CreateTime: Mon Mar 20 13:24:53 PDT 2017 > LastAccessTime: UNKNOWN > Protect Mode: None > Location: > *hdfs://my.server.com:8020/apps/hive/warehouse/test_local_part/col2=4* > --- > Per Sergio's comment - "The rename should create the new partition name in > the same location of the table. " -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled
[ https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980654#comment-15980654 ] Rui Li commented on HIVE-16047: --- The 2.2 branch is still building with Hadoop-2.7.2. My concern is if we revert the patch in 2.2 and when the new Hadoop (containing the breaking change and all the improvements) is released, we may find Hive-2.2 doesn't work with it anyway due to other compatibility issues. Then users of Hive-2.2 will have to live with the annoying logs. That's why I prefer to revert in master and keep it in 2.2 (or 2.x). > Shouldn't try to get KeyProvider unless encryption is enabled > - > > Key: HIVE-16047 > URL: https://issues.apache.org/jira/browse/HIVE-16047 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch > > > Found lots of following errors in HS2 log: > {noformat} > hdfs.KeyProviderCache: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > {noformat} > Similar to HDFS-7931 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled
[ https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980637#comment-15980637 ] Ferdinand Xu edited comment on HIVE-16047 at 4/24/17 2:24 AM: -- Thanks [~andrew.wang] for the suggestion. I prefer to revert it in 2.2 branch considering retrieving KMS URI from Namenode is much better than configuration. was (Author: ferd): Thanks [~andrew.wang] for the suggestion. I prefer to revert it considering retrieving KMS URI from Namenode is much better than configuration in 2.2 branch. > Shouldn't try to get KeyProvider unless encryption is enabled > - > > Key: HIVE-16047 > URL: https://issues.apache.org/jira/browse/HIVE-16047 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch > > > Found lots of following errors in HS2 log: > {noformat} > hdfs.KeyProviderCache: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > {noformat} > Similar to HDFS-7931 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980639#comment-15980639 ] Hive QA commented on HIVE-16366: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864685/HIVE-16366-branch-2.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10570 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=142) org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=174) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4853/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4853/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4853/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12864685 - PreCommit-HIVE-Build > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > Attachments: HIVE-16366-branch-2.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled
[ https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980637#comment-15980637 ] Ferdinand Xu commented on HIVE-16047: - Thanks [~andrew.wang] for the suggestion. I prefer to revert it considering retrieving KMS URI from Namenode is much better than configuration in 2.2 branch. > Shouldn't try to get KeyProvider unless encryption is enabled > - > > Key: HIVE-16047 > URL: https://issues.apache.org/jira/browse/HIVE-16047 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch > > > Found lots of following errors in HS2 log: > {noformat} > hdfs.KeyProviderCache: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > {noformat} > Similar to HDFS-7931 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11133) Support hive.explain.user for Spark
[ https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980630#comment-15980630 ] Rui Li commented on HIVE-11133: --- Hi [~stakiar], in the query plan of your [comment|https://issues.apache.org/jira/browse/HIVE-11133?focusedCommentId=15979385&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15979385], why does it only display Reducer 11 for Stage-2? Is the output truncated? {noformat} Stage-2 Reducer 11 {noformat} > Support hive.explain.user for Spark > --- > > Key: HIVE-11133 > URL: https://issues.apache.org/jira/browse/HIVE-11133 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Mohit Sabharwal >Assignee: Sahil Takiar > Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, > HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, > HIVE-11133.6.patch, HIVE-11133.7.patch, HIVE-11133.8.patch > > > User-friendly explain output ({{set hive.explain.user=true}}) should support > Spark as well. > Once supported, we should also enable related q-tests like {{explainuser_1.q}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16513) width_bucket issues
[ https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980618#comment-15980618 ] Carter Shanklin commented on HIVE-16513: Another update, the case expression thing turns out to be another divide by zero thing that got swallowed up. {code} h exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 10) java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 10) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2154) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:233) at org.apache.hadoop.util.RunJar.main(RunJar.java:148) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 10) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:93) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897) at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:442) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:434) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147) ... 13 more Caused by: java.lang.ArithmeticException: / by zero at org.apache.hadoop.hive.ql.udf.generic.GenericUDFWidthBucket.evaluate(GenericUDFWidthBucket.java:75) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorHead._evaluate(ExprNodeEvaluatorHead.java:44) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) ... 18 more {code} > width_bucket issues > --- > > Key: HIVE-16513 > URL: https://issues.apache.org/jira/browse/HIVE-16513 > Project: Hive > Issue Type: Bug >Reporter: Carter Shanklin > > width_bucket was recently added with HIVE-15982. This ticket notes a few > issues. > Usability issue: > Currently only accepts integral numeric types. Decimals, floats and doubles > are not supported. > Runtime failures: This query will cause a runtime divide-by-zero in the > reduce stage. > select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1; > The divide-by-zero seems to trigger any time I use a group-by. 
Here's another > example (that actually requires the group-by): > select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1; > Advanced Usage Issues: > Suppose you have a table e011_01 as follows: > create table e011_01 (c1 integer, c2 smallint); > insert into e011_01 values (1, 1), (2, 2); > Compile-time problems: > You cannot use simple case expressions, searched case expressions or grouping > sets. These queries fail: > select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) > from e011_01; > select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) > from e011_01; > select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as > integer)) from e011_02 group by cube(c1, c2); > I'll admit the grouping one is pretty contrived but the case ones seem > straightforward, valid, and it's strange that they don't work. Similar > queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe > [~ashutoshc] can lend some perspective on that? >
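One plausible reconstruction of the reported {{/ by zero}} (an illustration only — the function names below are not the actual GenericUDFWidthBucket code): if an implementation over Java longs computes the bucket width with integer division before mapping the operand, any range narrower than the bucket count collapses to a width of 0 and the next division throws.

```python
def width_bucket_buggy(expr, min_value, max_value, num_buckets):
    # Hypothetical failure mode: (max - min) // num_buckets is 0 whenever
    # the range is narrower than num_buckets, so the next line divides by 0.
    width = (max_value - min_value) // num_buckets
    return 1 + (expr - min_value) // width

def width_bucket_safe(expr, min_value, max_value, num_buckets):
    # Multiplying before dividing keeps the divisor equal to the full
    # (non-zero) range, matching the SQL-standard definition.
    if expr < min_value:
        return 0
    if expr >= max_value:
        return num_buckets + 1
    return 1 + (expr - min_value) * num_buckets // (max_value - min_value)
```

With the table from the description (c1 in {1, 2}), `width_bucket(c1, 0, c1*2, 10)` splits a range of at most 4 into 10 buckets, so the "buggy" form above divides by zero while the "safe" form does not.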
[jira] [Commented] (HIVE-16510) Vectorization: Add vectorized PTF tests in preparation for HIVE-16369
[ https://issues.apache.org/jira/browse/HIVE-16510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980608#comment-15980608 ] Gopal V commented on HIVE-16510: The reduce vectorization is not kicking in for the .q.out files - approve of the patch, but this needs MiniLlapLocalDriver .q.out files generated. +1 pending those tests. > Vectorization: Add vectorized PTF tests in preparation for HIVE-16369 > - > > Key: HIVE-16510 > URL: https://issues.apache.org/jira/browse/HIVE-16510 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16510.01.patch > > > Had trouble with HIVE-16369 patch being blocked by Apache SPAM filters -- so > separating out adding vectorized versions of current windowing_*.q tests. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16503) LLAP: Oversubscribe memory for noconditional task size
[ https://issues.apache.org/jira/browse/HIVE-16503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16503: - Attachment: HIVE-16503.2.patch > LLAP: Oversubscribe memory for noconditional task size > -- > > Key: HIVE-16503 > URL: https://issues.apache.org/jira/browse/HIVE-16503 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-16503.1.patch, HIVE-16503.2.patch > > > When running map joins in llap, it can potentially use more memory for hash > table loading (assuming other executors in the daemons have some memory to > spare). This map join conversion decision has to be made during compilation > that can provide some more room for LLAP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
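The idea in the description — letting the compile-time map-join threshold borrow memory that idle executors in the same daemon can spare — can be illustrated with a toy calculation. Every name and the formula below are assumptions for illustration, not Hive configuration properties or the patch's actual logic:

```python
def oversubscribed_threshold(noconditional_task_size_mb, memory_per_executor_mb,
                             idle_executors, oversubscribe_fraction):
    # Base compile-time threshold plus a fraction of the memory that idle
    # executors in the same daemon might lend to hash-table loading.
    borrowed = idle_executors * memory_per_executor_mb * oversubscribe_fraction
    return int(noconditional_task_size_mb + borrowed)
```

For example, a 100 MB base threshold with 3 idle 1024 MB executors each lending 20% would allow a roughly 714 MB hash table to be considered for a map join at compile time.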
[jira] [Updated] (HIVE-16367) Null-safe equality <=> operator is not supported with CBO
[ https://issues.apache.org/jira/browse/HIVE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16367: --- Status: Patch Available (was: Open) > Null-safe equality <=> operator is not supported with CBO > > > Key: HIVE-16367 > URL: https://issues.apache.org/jira/browse/HIVE-16367 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16367.1.patch > > > Calcite doesn't support such an equality operator, so Hive bails out and goes > through the non-CBO path. This could restrict its usage with subqueries and > other CBO-only features. > Since {{<=>}} is equivalent to {{is not distinct from}} (HIVE-15986) we can > rewrite {{<=>}} to {{is not distinct from}} and enable CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16367) Null-safe equality <=> operator is not supported with CBO
[ https://issues.apache.org/jira/browse/HIVE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980585#comment-15980585 ] Vineet Garg commented on HIVE-16367: Attaching initial patch which enables CBO for {{<=>}} by rewriting it into {{is not distinct from}}. This patch has an issue where a join with {{<=>}} is now a cross product instead of an inner join. This needs to be investigated and fixed. > Null-safe equality <=> operator is not supported with CBO > > > Key: HIVE-16367 > URL: https://issues.apache.org/jira/browse/HIVE-16367 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16367.1.patch > > > Calcite doesn't support such an equality operator, so Hive bails out and goes > through the non-CBO path. This could restrict its usage with subqueries and > other CBO-only features. > Since {{<=>}} is equivalent to {{is not distinct from}} (HIVE-15986) we can > rewrite {{<=>}} to {{is not distinct from}} and enable CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
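The equivalence the patch relies on — {{<=>}} behaving like {{is not distinct from}} — can be sketched outside Hive with SQL NULL modeled as Python's None. This is an illustration of the operator semantics only, not of the Calcite rewrite itself:

```python
def sql_eq(a, b):
    # Plain SQL '=': yields NULL (here: None) if either side is NULL.
    if a is None or b is None:
        return None
    return a == b

def null_safe_eq(a, b):
    # '<=>' / IS NOT DISTINCT FROM: never NULL; NULL compares equal to NULL,
    # which is why the two forms are interchangeable for the CBO rewrite.
    if a is None or b is None:
        return a is None and b is None
    return a == b
```

The two functions differ only on NULL inputs, which is exactly the case that makes `<=>` "null-safe".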
[jira] [Updated] (HIVE-16367) Null-safe equality <=> operator is not supported with CBO
[ https://issues.apache.org/jira/browse/HIVE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16367: --- Attachment: HIVE-16367.1.patch > Null-safe equality <=> operator is not supported with CBO > > > Key: HIVE-16367 > URL: https://issues.apache.org/jira/browse/HIVE-16367 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16367.1.patch > > > Calcite doesn't support such an equality operator, so Hive bails out and goes > through the non-CBO path. This could restrict its usage with subqueries and > other CBO-only features. > Since {{<=>}} is equivalent to {{is not distinct from}} (HIVE-15986) we can > rewrite {{<=>}} to {{is not distinct from}} and enable CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980578#comment-15980578 ] Carter Shanklin commented on HIVE-15982: I ran a test suite I use against this and noted a few issues in HIVE-16513. I think the biggest challenge will be that floating point numbers can't be used. I hope I didn't confuse the issue when I said "numeric value expressions" above. > Support the width_bucket function > - > > Key: HIVE-15982 > URL: https://issues.apache.org/jira/browse/HIVE-15982 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Sahil Takiar > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-15982.1.patch, HIVE-15982.2.patch, > HIVE-15982.3.patch, HIVE-15982.4.patch, HIVE-15982.5.patch, HIVE-15982.6.patch > > > Support the width_bucket(wbo, wbb1, wbb2, wbc) which returns an integer > between 0 and wbc+1 by mapping wbo into the ith equally sized bucket made by > dividing the interval between wbb1 and wbb2 into equally sized regions. If wbo < wbb1, return 0, > if wbo >= wbb2 return wbc+1. Reference: SQL standard section 4.4. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16513) width_bucket issues
[ https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980577#comment-15980577 ] Carter Shanklin commented on HIVE-16513: Further note, the error you get for the case items is: select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) from e011_01; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 10) Time taken: 0.087 seconds > width_bucket issues > --- > > Key: HIVE-16513 > URL: https://issues.apache.org/jira/browse/HIVE-16513 > Project: Hive > Issue Type: Bug >Reporter: Carter Shanklin > > width_bucket was recently added with HIVE-15982. This ticket notes a few > issues. > Usability issue: > Currently only accepts integral numeric types. Decimals, floats and doubles > are not supported. > Runtime failures: This query will cause a runtime divide-by-zero in the > reduce stage. > select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1; > The divide-by-zero seems to trigger any time I use a group-by. Here's another > example (that actually requires the group-by): > select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1; > Advanced Usage Issues: > Suppose you have a table e011_01 as follows: > create table e011_01 (c1 integer, c2 smallint); > insert into e011_01 values (1, 1), (2, 2); > Compile-time problems: > You cannot use simple case expressions, searched case expressions or grouping > sets. 
These queries fail: > select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) > from e011_01; > select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) > from e011_01; > select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as > integer)) from e011_02 group by cube(c1, c2); > I'll admit the grouping one is pretty contrived but the case ones seem > straightforward, valid, and it's strange that they don't work. Similar > queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe > [~ashutoshc] can lend some perspective on that? > Interestingly, you can use window functions in width_bucket, example: > select width_bucket(rank() over (order by c2), 0, 10, 10) from e011_01; > works just fine. Hopefully we can get to a place where people implementing > functions like this don't need to think about value expression support but we > don't seem to be there yet. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats
[ https://issues.apache.org/jira/browse/HIVE-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16495: --- Status: Patch Available (was: Open) > ColumnStats merge should consider the accuracy of the current stats > --- > > Key: HIVE-16495 > URL: https://issues.apache.org/jira/browse/HIVE-16495 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16495.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats
[ https://issues.apache.org/jira/browse/HIVE-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16495: --- Attachment: HIVE-16495.01.patch > ColumnStats merge should consider the accuracy of the current stats > --- > > Key: HIVE-16495 > URL: https://issues.apache.org/jira/browse/HIVE-16495 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16495.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16366: --- Status: Patch Available (was: Open) > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > Attachments: HIVE-16366-branch-2.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16366: --- Status: Open (was: Patch Available) > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > Attachments: HIVE-16366-branch-2.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16366: --- Attachment: HIVE-16366-branch-2.3.patch > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > Attachments: HIVE-16366-branch-2.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16366: --- Attachment: (was: HIVE-16366-branch-2.3.patch) > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects
[ https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980455#comment-15980455 ] Hive QA commented on HIVE-16079: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864561/HIVE-16079.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10628 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4852/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4852/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4852/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12864561 - PreCommit-HIVE-Build > HS2: high memory pressure due to duplicate Properties objects > - > > Key: HIVE-16079 > URL: https://issues.apache.org/jira/browse/HIVE-16079 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, > HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt > > > I've created a Hive table with 2000 partitions, each backed by two files, > with one row in each file. When I execute some number of concurrent queries > against this table, e.g. 
as follows > {code} > for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p > admin -e "select count(i_f_1) from misha_table;" & done > {code} > it results in a big memory spike. With 20 queries I caused an OOM in a HS2 > server with -Xmx200m and with 50 queries - in the one with -Xmx500m. > I am attaching the results of jxray (www.jxray.com) analysis of a heap dump > that was generated in the 50queries/500m heap scenario. It suggests that > there are several opportunities to reduce memory pressure with not very > invasive changes to the code. One (duplicate strings) has been addressed in > https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going > to address the fact that almost 20% of memory is used by instances of > java.util.Properties. These objects are highly duplicated, since for each > partition each concurrently running query creates its own copy of Partition, > PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 > partitions) Properties in memory. By interning/deduplicating these objects we > may be able to save perhaps 15% of memory. > Note, however, that if there are queries that mutate partitions, the > corresponding Properties would be mutated as well. Thus we cannot simply use > a single "canonicalized" Properties object at all times for all Partition > objects representing the same DB partition. Instead, I am going to introduce > a special CopyOnFirstWriteProperties class. Such an object initially > internally references a canonicalized Properties object, and keeps doing so > while only read methods are called. However, once any mutating method is > called, the given CopyOnFirstWriteProperties copies the data into its own > table from the canonicalized table, and uses it ever after. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
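The copy-on-first-write behavior described in the ticket can be sketched in a few lines. This is a Python stand-in for the proposed Java CopyOnFirstWriteProperties; the class and method names here are illustrative, not the patch's API:

```python
class CopyOnFirstWriteProps:
    """Shares one canonical dict across readers; copies on the first write."""

    def __init__(self, canonical):
        self._props = canonical   # shared, interned table
        self._owned = False       # becomes True after the first mutation

    def get(self, key, default=None):
        # Reads go straight to whichever table we currently reference.
        return self._props.get(key, default)

    def set(self, key, value):
        if not self._owned:
            # First mutating call: detach from the shared table, so the
            # canonical copy used by other partitions stays untouched.
            self._props = dict(self._props)
            self._owned = True
        self._props[key] = value
```

With 50 queries over 2,000 partitions, all ~100,000 such wrappers could share one table per distinct partition until (if ever) one of them is mutated, which is where the projected ~15% memory saving comes from.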
[jira] [Issue Comment Deleted] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects
[ https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-16079: --- Comment: was deleted (was: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864561/HIVE-16079.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10626 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=96) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4834/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4834/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4834/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12864561 - PreCommit-HIVE-Build) > HS2: high memory pressure due to duplicate Properties objects > - > > Key: HIVE-16079 > URL: https://issues.apache.org/jira/browse/HIVE-16079 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, > HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt > > > I've created a Hive table with 2000 partitions, each backed by two files, > with one row in each file. When I execute some number of concurrent queries > against this table, e.g. 
as follows > {code} > for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p > admin -e "select count(i_f_1) from misha_table;" & done > {code} > it results in a big memory spike. With 20 queries I caused an OOM in a HS2 > server with -Xmx200m and with 50 queries - in the one with -Xmx500m. > I am attaching the results of jxray (www.jxray.com) analysis of a heap dump > that was generated in the 50queries/500m heap scenario. It suggests that > there are several opportunities to reduce memory pressure with not very > invasive changes to the code. One (duplicate strings) has been addressed in > https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going > to address the fact that almost 20% of memory is used by instances of > java.util.Properties. These objects are highly duplicate, since for each > partition each concurrently running query creates its own copy of Partion, > PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 > partitions) Properties in memory. By interning/deduplicating these objects we > may be able to save perhaps 15% of memory. > Note, however, that if there are queries that mutate partitions, the > corresponding Properties would be mutated as well. Thus we cannot simply use > a single "canonicalized" Properties object at all times for all Partition > objects representing the same DB partition. Instead, I am going to introduce > a special CopyOnFirstWriteProperties class. Such an object initially > internally references a canonicalized Properties object, and keeps doing so > while only read methods are called. However, once any mutating method is > called, the given CopyOnFirstWriteProperties copies the data into its own > table from the canonicalized table, and uses it ever after. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980314#comment-15980314 ] Simanchal Das commented on HIVE-15229: -- Hi [~cwsteinbach] I have refreshed the patch with the latest code. > 'like any' and 'like all' operators in hive > --- > > Key: HIVE-15229 > URL: https://issues.apache.org/jira/browse/HIVE-15229 > Project: Hive > Issue Type: New Feature > Components: Operators >Reporter: Simanchal Das >Assignee: Simanchal Das >Priority: Minor > Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, > HIVE-15229.3.patch, HIVE-15229.4.patch, HIVE-15229.5.patch, HIVE-15229.6.patch > > > In Teradata, the 'like any' and 'like all' operators are mostly used when > matching a text field against a number of patterns. > 'like any' and 'like all' are equivalent to multiple LIKE conditions, > as in the example below. > {noformat} > --like any > select col1 from table1 where col2 like any ('%accountant%', '%accounting%', > '%retail%', '%bank%', '%insurance%'); > --Can be written using multiple like conditions > select col1 from table1 where col2 like '%accountant%' or col2 like > '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like > '%insurance%' ; > --like all > select col1 from table1 where col2 like all ('%accountant%', '%accounting%', > '%retail%', '%bank%', '%insurance%'); > --Can be written using multiple like conditions > select col1 from table1 where col2 like '%accountant%' and col2 like > '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like > '%insurance%' ; > {noformat} > Problem statement: > Nowadays, many data warehouse projects are being migrated from Teradata > to Hive. > Data engineers and business analysts are frequently looking for these two > operators. > If we introduce these two operators in Hive, many scripts can be migrated > smoothly instead of rewriting these operators as multiple LIKE > conditions. > Result: > 1. 
The 'LIKE ANY' operator returns true if the text (column value) matches any > pattern. > 2. The 'LIKE ALL' operator returns true if the text (column value) matches all > patterns. > 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the > left-hand side is NULL, but also if one of the patterns in the list is NULL.
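The three-valued semantics listed above can be sketched as follows. This is an illustrative model, not the Hive implementation (which is built as UDFs inside the query engine); the helper names `likeAny`/`likeAll` and the per-character pattern translation are assumptions of this sketch, and a `Boolean` of `null` stands in for SQL NULL.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical model of the LIKE ANY / LIKE ALL semantics described in
// HIVE-15229; not the actual Hive UDF code.
public final class LikeAnyAll {
    // Translate a SQL LIKE pattern ('%' = any run, '_' = any single char)
    // into an anchored regex and test it against the text.
    static boolean like(String text, String pattern) {
        StringBuilder re = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%')      re.append(".*");
            else if (c == '_') re.append('.');
            else               re.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(re.toString(), Pattern.DOTALL)
                      .matcher(text).matches();
    }

    // Rule 3: NULL text or any NULL pattern yields NULL (modeled as null).
    public static Boolean likeAny(String text, List<String> patterns) {
        if (text == null || patterns.contains(null)) return null;
        for (String p : patterns)
            if (like(text, p)) return true;   // rule 1: any match suffices
        return false;
    }

    public static Boolean likeAll(String text, List<String> patterns) {
        if (text == null || patterns.contains(null)) return null;
        for (String p : patterns)
            if (!like(text, p)) return false; // rule 2: every pattern must match
        return true;
    }
}
```

Note that rule 3 as stated in the ticket makes a single NULL pattern poison the whole list, even when another pattern would have matched; the sketch follows that stated behavior.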