[jira] [Updated] (HIVE-15788) Implement FastBloomFilter to use RoaringBitmap instead of long[]
[ https://issues.apache.org/jira/browse/HIVE-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murali Vemulapati updated HIVE-15788:
-------------------------------------
    Attachment: HIVE-15788.patch

Please review the patch.

> Implement FastBloomFilter to use RoaringBitmap instead of long[]
> ----------------------------------------------------------------
>
>                 Key: HIVE-15788
>                 URL: https://issues.apache.org/jira/browse/HIVE-15788
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Gopal V
>            Assignee: Murali Vemulapati
>         Attachments: HIVE-15788.patch
>
>
> Currently, a bloom filter which is all 1s occupies exactly the same amount
> of space as a bloom filter which is sparse.
> This is a waste of space: it produces memory pressure and generates a
> massive number of cache misses.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
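The description above can be made concrete with a short sketch. The class below is illustrative only (it is not Hive's actual FastBloomFilter nor the attached patch): it shows why a flat long[] bitset, the classic bloom filter backing store, costs the same number of bytes whether the filter is dense or sparse. The array is sized up front from the number of bits, which is exactly what a compressed structure such as RoaringBitmap avoids when few bits are set.

```java
// Illustrative sketch of a fixed-size long[] bitset, as used by a
// classic bloom filter. Its footprint is numBits/64 words no matter how
// many bits are actually set -- the waste HIVE-15788 targets.
public class FlatBitSet {
    private final long[] words;

    public FlatBitSet(int numBits) {
        this.words = new long[(numBits + 63) >> 6]; // one long per 64 bits
    }

    public void set(int bitPos) {
        words[bitPos >> 6] |= (1L << bitPos); // Java shifts are mod 64
    }

    public boolean get(int bitPos) {
        return (words[bitPos >> 6] & (1L << bitPos)) != 0;
    }

    // Constant size in bytes, independent of how sparse the filter is.
    public long sizeInBytes() {
        return 8L * words.length;
    }
}
```

A 1024-bit filter always holds 16 longs (128 bytes) here, even if only one bit is ever set; a roaring-style container would store that near-empty set in a handful of bytes.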
[jira] [Assigned] (HIVE-15788) Implement FastBloomFilter to use RoaringBitmap instead of long[]
[ https://issues.apache.org/jira/browse/HIVE-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Murali Vemulapati reassigned HIVE-15788:
----------------------------------------
    Assignee: Murali Vemulapati

> Implement FastBloomFilter to use RoaringBitmap instead of long[]
> ----------------------------------------------------------------
>
>                 Key: HIVE-15788
>                 URL: https://issues.apache.org/jira/browse/HIVE-15788
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Gopal V
>            Assignee: Murali Vemulapati
>
>
> Currently, a bloom filter which is all 1s occupies exactly the same amount
> of space as a bloom filter which is sparse.
> This is a waste of space: it produces memory pressure and generates a
> massive number of cache misses.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-16497) FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated
[ https://issues.apache.org/jira/browse/HIVE-16497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980746#comment-15980746 ]

Hive QA commented on HIVE-16497:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12864705/HIVE-16497.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10628 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4857/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4857/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4857/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12864705 - PreCommit-HIVE-Build

> FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file
> system operations should be impersonated
> -------------------------------------------------------------------------
>
>                 Key: HIVE-16497
>                 URL: https://issues.apache.org/jira/browse/HIVE-16497
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16497.1.patch, HIVE-16497.2.patch
>
>
> FileUtils.isActionPermittedForFileHierarchy checks whether the user has
> permission to perform a given action. The checks are made by impersonating
> the user. However, the listing of child dirs is done as the hiveserver2
> user. If the hive user doesn't have permissions on the filesystem, it gives
> an incorrect error that the user doesn't have permissions to perform the
> action. Impersonating the end user for all file operations in that function
> is also the logically correct thing to do.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
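The fix described above boils down to running the permission check and the child-directory listing under the same impersonated identity. The sketch below is conceptual only: `runAs` is a hypothetical stand-in for Hadoop's `UserGroupInformation` proxy-user `doAs` mechanism, and none of these names come from the actual patch.

```java
import java.util.function.Supplier;

// Conceptual sketch of impersonation scoping (hypothetical names, not
// the HIVE-16497 patch). Everything executed inside runAs -- both the
// permission check and any directory listing -- sees the end user's
// identity; the service identity is restored afterwards.
public class ImpersonatedFs {
    static String currentUser = "hiveserver2"; // the service identity

    static <T> T runAs(String endUser, Supplier<T> action) {
        String previous = currentUser;
        currentUser = endUser;      // impersonate the end user
        try {
            return action.get();    // all FS ops here run as endUser
        } finally {
            currentUser = previous; // restore the service identity
        }
    }
}
```

The bug pattern was the inverse of this: the check ran under the proxy identity, but the recursive listing escaped the scope and ran as the service user.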
[jira] [Comment Edited] (HIVE-15859) HoS: Write RPC messages in event loop
[ https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980721#comment-15980721 ]

Yi Yao edited comment on HIVE-15859 at 4/24/17 5:41 AM:
--------------------------------------------------------

Hi [~lirui], I encountered the same issue in Hive 1.*. It would be great if the community could back-port the patch to Hive 1.*.

was (Author: yiyao):
Hi [~lirui], I encountered the same issue in Hive 1.*. It would be great if the community could back-port the patch to Hive 1.*.

> HoS: Write RPC messages in event loop
> -------------------------------------
>
>                 Key: HIVE-15859
>                 URL: https://issues.apache.org/jira/browse/HIVE-15859
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 2.1.1
>         Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>            Reporter: KaiXu
>            Assignee: Rui Li
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch, HIVE-15859.3.patch
>
>
> Hive on Spark failed with this error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> The application log shows the driver commanded a shutdown for some unknown
> reason, but Hive's log shows the Driver could not get the RPC header
> (Expected RPC header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat} > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in > stage 3.0 (TID 2519) > 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver > commanded a shutdown > 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared > 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown > (hsx-node1:42777) driver disconnected. > 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver > 192.168.1.1:42777 disassociated! Shutting down. > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in > stage 3.0 (TID 2511) > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Shutting down remote daemon. 
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remote daemon shut down; proceeding with flushing remote transports. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remoting shut down. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk8/yarn/nm/us
[jira] [Commented] (HIVE-16367) Null-safe equality <=> operator is not supported with CBO
[ https://issues.apache.org/jira/browse/HIVE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980731#comment-15980731 ] Hive QA commented on HIVE-16367: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864687/HIVE-16367.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10628 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[is_distinct_from] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_is_not_distinct_from] (batchId=158) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_predicate_pushdown] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[parquet_predicate_pushdown] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_group_by] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_nullsafe_join] (batchId=158) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_nullsafe] (batchId=129) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4856/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4856/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4856/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing 
org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12864687 - PreCommit-HIVE-Build

> Null-safe equality <=> operator is not supported with CBO
> ---------------------------------------------------------
>
>                 Key: HIVE-16367
>                 URL: https://issues.apache.org/jira/browse/HIVE-16367
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>         Attachments: HIVE-16367.1.patch
>
>
> Calcite doesn't support such an equality operator, so Hive bails out and
> goes through the non-CBO path. This could restrict its usage with
> subqueries and other CBO-only features.
> Since {{<=>}} is equivalent to {{is not distinct from}} (HIVE-15986), we
> can rewrite {{<=>}} as {{is not distinct from}} and enable CBO.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
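For readers unfamiliar with `<=>`, the rewrite works because null-safe equality has the same truth table as `IS NOT DISTINCT FROM`: two NULLs compare equal, and a NULL against a non-NULL value compares unequal, whereas plain `=` yields NULL in both cases. A minimal Java sketch of that truth table (illustrative only, not Hive code):

```java
// Semantics of "<=>" / IS NOT DISTINCT FROM, modeling SQL NULL as a
// Java null reference. Plain "=" would be three-valued here; the
// null-safe operator always returns a definite true or false.
public class NullSafeEq {
    static boolean nullSafeEquals(Object a, Object b) {
        if (a == null || b == null) {
            return a == b;      // true only when BOTH sides are NULL
        }
        return a.equals(b);     // ordinary equality otherwise
    }
}
```

Because the two forms agree on every input, substituting `is not distinct from` for `<=>` before handing the plan to Calcite preserves query semantics.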
[jira] [Commented] (HIVE-15859) HoS: Write RPC messages in event loop
[ https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980721#comment-15980721 ]

Yi Yao commented on HIVE-15859:
-------------------------------

Hi [~lirui], I encountered the same issue in Hive 1.*. It would be great if the community could back-port the patch to Hive 1.*.

> HoS: Write RPC messages in event loop
> -------------------------------------
>
>                 Key: HIVE-15859
>                 URL: https://issues.apache.org/jira/browse/HIVE-15859
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 2.1.1
>         Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>            Reporter: KaiXu
>            Assignee: Rui Li
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch, HIVE-15859.3.patch
>
>
> Hive on Spark failed with this error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> The application log shows the driver commanded a shutdown for some unknown
> reason, but Hive's log shows the Driver could not get the RPC header
> (Expected RPC header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat} > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in > stage 3.0 (TID 2519) > 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver > commanded a shutdown > 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared > 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped > 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = > hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml > 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown > (hsx-node1:42777) driver disconnected. > 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver > 192.168.1.1:42777 disassociated! Shutting down. > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in > stage 3.0 (TID 2511) > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Shutting down remote daemon. 
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remote daemon shut down; proceeding with flushing remote transports. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a > 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: > Remoting shut down. > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1 > 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory > /mnt/disk8/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-4de34175-f871-4c28-8ec0-d2fc0020c5c3 > 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1137.0 in > stage 3.0 (TID 2515) > 17/0
[jira] [Commented] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats
[ https://issues.apache.org/jira/browse/HIVE-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980707#comment-15980707 ] Hive QA commented on HIVE-16495: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864686/HIVE-16495.01.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 180 failed/errored test(s), 10630 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=237) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_file_format] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_clusterby_sortby] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_skewed_table] (batchId=34) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_add_partition] (batchId=17) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_not_sorted] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_parts] (batchId=45) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[binary_output_format] (batchId=81) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket1] (batchId=40) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket2] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark1] (batchId=64) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark2] (batchId=2) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_spark3] (batchId=42) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin5] (batchId=79) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative2] (batchId=65) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[column_names_with_leading_and_trailing_spaces] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_infinity] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compustat_avro] (batchId=83) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_alter_list_bucketing_table1] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like2] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_tbl_props] (batchId=69) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_view] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_skewed_table1] (batchId=18) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_table_like_stats] (batchId=56) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_with_constraints] (batchId=64) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_location] (batchId=82) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[default_file_format] (batchId=21) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_comment_indent] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_comment_nonascii] (batchId=13) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_formatted_view_partitioned] (batchId=72) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_syntax] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] (batchId=41) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[display_colstats_tbllvl] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic1] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic2] (batchId=11) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_intervals] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_timeseries] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_topn] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_map_ppr] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_map_ppr_multi_distinct] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_ppr] (batchId=28) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_ppr_multi_distinct] (batchId=54) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_6] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=8) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input_part1] (batchId=
[jira] [Updated] (HIVE-16497) FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated
[ https://issues.apache.org/jira/browse/HIVE-16497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-16497: - Attachment: HIVE-16497.2.patch Fixing the test failures. > FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file > system operations should be impersonated > -- > > Key: HIVE-16497 > URL: https://issues.apache.org/jira/browse/HIVE-16497 > Project: Hive > Issue Type: Bug > Components: Authorization >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Fix For: 3.0.0 > > Attachments: HIVE-16497.1.patch, HIVE-16497.2.patch > > > FileUtils.isActionPermittedForFileHierarchy checks if user has permissions > for given action. The checks are made by impersonating the user. > However, the listing of child dirs are done as the hiveserver2 user. If the > hive user doesn't have permissions on the filesystem, it gives incorrect > error that the user doesn't have permissions to perform the action. > Impersonating the end user for all file operations in that function is also > logically correct thing to do. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16503) LLAP: Oversubscribe memory for noconditional task size
[ https://issues.apache.org/jira/browse/HIVE-16503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980680#comment-15980680 ] Hive QA commented on HIVE-16503: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864689/HIVE-16503.2.patch {color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10629 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[skewjoinopt1] (batchId=74) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4854/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4854/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4854/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12864689 - PreCommit-HIVE-Build > LLAP: Oversubscribe memory for noconditional task size > -- > > Key: HIVE-16503 > URL: https://issues.apache.org/jira/browse/HIVE-16503 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-16503.1.patch, HIVE-16503.2.patch > > > When running map joins in llap, it can potentially use more memory for hash > table loading (assuming other executors in the daemons have some memory to > spare). 
This map-join conversion decision has to be made during compilation, which can provide some more room for LLAP.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
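As a rough illustration of the idea only (the names and formula below are assumptions chosen for exposition, not the actual HIVE-16503 patch), the compile-time map-join size threshold could be inflated by an oversubscription factor for each executor in the daemon that is expected to have spare memory:

```java
// Hypothetical sketch of compile-time memory oversubscription for
// map-join conversion in LLAP. All names and the formula are made up
// for illustration; the real patch may compute this differently.
public class OversubscribedThreshold {
    static long effectiveThreshold(long noconditionalTaskSize,
                                   double oversubscriptionFactor,
                                   int spareExecutorSlots) {
        // e.g. threshold 100 (arbitrary units), factor 0.2, 3 spare
        // slots -> an inflated threshold of 160
        double inflated = noconditionalTaskSize
                * (1.0 + oversubscriptionFactor * spareExecutorSlots);
        return (long) inflated;
    }
}
```

The point of doing this at compile time is that the planner, not the runtime, decides whether a join becomes a map join, so the extra headroom must be visible when the plan is generated.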
[jira] [Updated] (HIVE-16510) Vectorization: Add vectorized PTF tests in preparation for HIVE-16369
[ https://issues.apache.org/jira/browse/HIVE-16510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16510: Status: In Progress (was: Patch Available) > Vectorization: Add vectorized PTF tests in preparation for HIVE-16369 > - > > Key: HIVE-16510 > URL: https://issues.apache.org/jira/browse/HIVE-16510 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16510.01.patch > > > Had trouble with HIVE-16369 patch being blocked by Apache SPAM filters -- so > separating out adding vectorized versions of current windowing_*.q tests. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16465) NullPointer Exception when enable vectorization for Parquet file format
[ https://issues.apache.org/jira/browse/HIVE-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Ma updated HIVE-16465: Attachment: HIVE-16465-branch-2.3.001.patch Update patch for branch 2.3 > NullPointer Exception when enable vectorization for Parquet file format > --- > > Key: HIVE-16465 > URL: https://issues.apache.org/jira/browse/HIVE-16465 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Colin Ma >Assignee: Colin Ma >Priority: Critical > Fix For: 2.3.0, 3.0.0 > > Attachments: HIVE-16465.001.patch, HIVE-16465-branch-2.3.001.patch > > > NullPointer Exception when enable vectorization for Parquet file format. It > is caused by the null value of the InputSplit. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16287) Alter table partition rename with location - moves partition back to hive warehouse
[ https://issues.apache.org/jira/browse/HIVE-16287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980660#comment-15980660 ] Rui Li commented on HIVE-16287: --- QA on branch-1 has been broken for a long time. I think we can commit given no new failure is introduced. > Alter table partition rename with location - moves partition back to hive > warehouse > --- > > Key: HIVE-16287 > URL: https://issues.apache.org/jira/browse/HIVE-16287 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.1.0 > Environment: RHEL 6.8 >Reporter: Ying Chen >Assignee: Vihang Karajgaonkar >Priority: Minor > Fix For: 1.3.0, 2.3.0, 3.0.0, 2.4.0 > > Attachments: HIVE-16287.01.patch, HIVE-16287.02.patch, > HIVE-16287.03.patch, HIVE-16287.04.patch, HIVE-16287.05-branch-1.patch, > HIVE-16287-addedum.06.patch, HIVE-16287.branch-1.01.patch > > Original Estimate: 48h > Remaining Estimate: 48h > > I was renaming my partition in a table that I've created using the location > clause, and noticed that when after rename is completed, my partition is > moved to the hive warehouse (hive.metastore.warehouse.dir). 
> {quote} > create table test_local_part (col1 int) partitioned by (col2 int) location > '/tmp/testtable/test_local_part'; > insert into test_local_part partition (col2=1) values (1),(3); > insert into test_local_part partition (col2=2) values (3); > alter table test_local_part partition (col2='1') rename to partition > (col2='4'); > {quote} > Running: >describe formatted test_local_part partition (col2='2') > # Detailed Partition Information > Partition Value: [2] > Database: default > Table:test_local_part > CreateTime: Mon Mar 20 13:25:28 PDT 2017 > LastAccessTime: UNKNOWN > Protect Mode: None > Location: > *hdfs://my.server.com:8020/tmp/testtable/test_local_part/col2=2* > Running: >describe formatted test_local_part partition (col2='4') > # Detailed Partition Information > Partition Value: [4] > Database: default > Table:test_local_part > CreateTime: Mon Mar 20 13:24:53 PDT 2017 > LastAccessTime: UNKNOWN > Protect Mode: None > Location: > *hdfs://my.server.com:8020/apps/hive/warehouse/test_local_part/col2=4* > --- > Per Sergio's comment - "The rename should create the new partition name in > the same location of the table. " -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled
[ https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980654#comment-15980654 ] Rui Li commented on HIVE-16047: --- The 2.2 branch is still building with Hadoop-2.7.2. My concern is if we revert the patch in 2.2 and when the new Hadoop (containing the breaking change and all the improvements) is released, we may find Hive-2.2 doesn't work with it anyway due to other compatibility issues. Then users of Hive-2.2 will have to live with the annoying logs. That's why I prefer to revert in master and keep it in 2.2 (or 2.x). > Shouldn't try to get KeyProvider unless encryption is enabled > - > > Key: HIVE-16047 > URL: https://issues.apache.org/jira/browse/HIVE-16047 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch > > > Found lots of following errors in HS2 log: > {noformat} > hdfs.KeyProviderCache: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > {noformat} > Similar to HDFS-7931 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled
[ https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980637#comment-15980637 ] Ferdinand Xu edited comment on HIVE-16047 at 4/24/17 2:24 AM: -- Thanks [~andrew.wang] for the suggestion. I prefer to revert it in 2.2 branch considering retrieving KMS URI from Namenode is much better than configuration. was (Author: ferd): Thanks [~andrew.wang] for the suggestion. I prefer to revert it considering retrieving KMS URI from Namenode is much better than configuration in 2.2 branch. > Shouldn't try to get KeyProvider unless encryption is enabled > - > > Key: HIVE-16047 > URL: https://issues.apache.org/jira/browse/HIVE-16047 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch > > > Found lots of following errors in HS2 log: > {noformat} > hdfs.KeyProviderCache: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > {noformat} > Similar to HDFS-7931 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980639#comment-15980639 ] Hive QA commented on HIVE-16366: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864685/HIVE-16366-branch-2.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10570 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[comments] (batchId=35) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=142) org.apache.hive.hcatalog.api.TestHCatClient.testTransportFailure (batchId=174) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4853/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4853/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4853/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12864685 - PreCommit-HIVE-Build > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > Attachments: HIVE-16366-branch-2.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16047) Shouldn't try to get KeyProvider unless encryption is enabled
[ https://issues.apache.org/jira/browse/HIVE-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980637#comment-15980637 ] Ferdinand Xu commented on HIVE-16047: - Thanks [~andrew.wang] for the suggestion. I prefer to revert it considering retrieving KMS URI from Namenode is much better than configuration in 2.2 branch. > Shouldn't try to get KeyProvider unless encryption is enabled > - > > Key: HIVE-16047 > URL: https://issues.apache.org/jira/browse/HIVE-16047 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li >Priority: Minor > Fix For: 2.2.0 > > Attachments: HIVE-16047.1.patch, HIVE-16047.2.patch > > > Found lots of following errors in HS2 log: > {noformat} > hdfs.KeyProviderCache: Could not find uri with key > [dfs.encryption.key.provider.uri] to create a keyProvider !! > {noformat} > Similar to HDFS-7931 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-11133) Support hive.explain.user for Spark
[ https://issues.apache.org/jira/browse/HIVE-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980630#comment-15980630 ] Rui Li commented on HIVE-11133: --- Hi [~stakiar], in the query plan of your [comment|https://issues.apache.org/jira/browse/HIVE-11133?focusedCommentId=15979385&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15979385], why does it only display Reducer 11 for Stage-2? Is the output truncated? {noformat} Stage-2 Reducer 11 {noformat} > Support hive.explain.user for Spark > --- > > Key: HIVE-11133 > URL: https://issues.apache.org/jira/browse/HIVE-11133 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Mohit Sabharwal >Assignee: Sahil Takiar > Attachments: HIVE-11133.1.patch, HIVE-11133.2.patch, > HIVE-11133.3.patch, HIVE-11133.4.patch, HIVE-11133.5.patch, > HIVE-11133.6.patch, HIVE-11133.7.patch, HIVE-11133.8.patch > > > User-friendly explain output ({{set hive.explain.user=true}}) should support > Spark as well. > Once supported, we should also enable related q-tests like {{explainuser_1.q}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16513) width_bucket issues
[ https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980618#comment-15980618 ] Carter Shanklin commented on HIVE-16513: Another update, the case expression thing turns out to be another divide by zero thing that got swallowed up. {code} h exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 10) java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 10) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:165) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2154) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:233) at org.apache.hadoop.util.RunJar.main(RunJar.java:148) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 10) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:93) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:897) at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:442) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:434) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147) ... 13 more Caused by: java.lang.ArithmeticException: / by zero at org.apache.hadoop.hive.ql.udf.generic.GenericUDFWidthBucket.evaluate(GenericUDFWidthBucket.java:75) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:187) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorHead._evaluate(ExprNodeEvaluatorHead.java:44) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:80) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:68) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) ... 18 more {code} > width_bucket issues > --- > > Key: HIVE-16513 > URL: https://issues.apache.org/jira/browse/HIVE-16513 > Project: Hive > Issue Type: Bug >Reporter: Carter Shanklin > > width_bucket was recently added with HIVE-15982. This ticket notes a few > issues. > Usability issue: > Currently only accepts integral numeric types. Decimals, floats and doubles > are not supported. > Runtime failures: This query will cause a runtime divide-by-zero in the > reduce stage. > select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1; > The divide-by-zero seems to trigger any time I use a group-by. 
Here's another > example (that actually requires the group-by): > select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1; > Advanced Usage Issues: > Suppose you have a table e011_01 as follows: > create table e011_01 (c1 integer, c2 smallint); > insert into e011_01 values (1, 1), (2, 2); > Compile-time problems: > You cannot use simple case expressions, searched case expressions or grouping > sets. These queries fail: > select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) > from e011_01; > select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) > from e011_01; > select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as > integer)) from e011_02 group by cube(c1, c2); > I'll admit the grouping one is pretty contrived but the case ones seem > straightforward, valid, and it's strange that they don't work. Similar > queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe > [~ashutoshc] can lend some perspective on that? >
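One plausible reconstruction of the reported {{/ by zero}} (an illustration only — the function names below are not the actual GenericUDFWidthBucket code): if an implementation over Java longs computes the bucket width with integer division before mapping the operand, any range narrower than the bucket count collapses to a width of 0 and the next division throws.

```python
def width_bucket_buggy(expr, min_value, max_value, num_buckets):
    # Hypothetical failure mode: (max - min) // num_buckets is 0 whenever
    # the range is narrower than num_buckets, so the next line divides by 0.
    width = (max_value - min_value) // num_buckets
    return 1 + (expr - min_value) // width

def width_bucket_safe(expr, min_value, max_value, num_buckets):
    # Multiplying before dividing keeps the divisor equal to the full
    # (non-zero) range, matching the SQL-standard definition.
    if expr < min_value:
        return 0
    if expr >= max_value:
        return num_buckets + 1
    return 1 + (expr - min_value) * num_buckets // (max_value - min_value)
```

With the table from the description (c1 in {1, 2}), `width_bucket(c1, 0, c1*2, 10)` splits a range of at most 4 into 10 buckets, so the "buggy" form above divides by zero while the "safe" form does not.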
[jira] [Commented] (HIVE-16510) Vectorization: Add vectorized PTF tests in preparation for HIVE-16369
[ https://issues.apache.org/jira/browse/HIVE-16510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980608#comment-15980608 ] Gopal V commented on HIVE-16510: The reduce vectorization is not kicking in for the .q.out files - approve of the patch, but this needs MiniLlapLocalDriver .q.out files generated. +1 pending those tests. > Vectorization: Add vectorized PTF tests in preparation for HIVE-16369 > - > > Key: HIVE-16510 > URL: https://issues.apache.org/jira/browse/HIVE-16510 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16510.01.patch > > > Had trouble with HIVE-16369 patch being blocked by Apache SPAM filters -- so > separating out adding vectorized versions of current windowing_*.q tests. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16503) LLAP: Oversubscribe memory for noconditional task size
[ https://issues.apache.org/jira/browse/HIVE-16503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-16503: - Attachment: HIVE-16503.2.patch > LLAP: Oversubscribe memory for noconditional task size > -- > > Key: HIVE-16503 > URL: https://issues.apache.org/jira/browse/HIVE-16503 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-16503.1.patch, HIVE-16503.2.patch > > > When running map joins in llap, it can potentially use more memory for hash > table loading (assuming other executors in the daemons have some memory to > spare). This map join conversion decision has to be made during compilation > that can provide some more room for LLAP. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
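The idea in the description — letting the compile-time map-join threshold borrow memory that idle executors in the same daemon can spare — can be illustrated with a toy calculation. Every name and the formula below are assumptions for illustration, not Hive configuration properties or the patch's actual logic:

```python
def oversubscribed_threshold(noconditional_task_size_mb, memory_per_executor_mb,
                             idle_executors, oversubscribe_fraction):
    # Base compile-time threshold plus a fraction of the memory that idle
    # executors in the same daemon might lend to hash-table loading.
    borrowed = idle_executors * memory_per_executor_mb * oversubscribe_fraction
    return int(noconditional_task_size_mb + borrowed)
```

For example, a 100 MB base threshold with 3 idle 1024 MB executors each lending 20% would allow a roughly 714 MB hash table to be considered for a map join at compile time.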
[jira] [Updated] (HIVE-16367) Null-safe equality <=> operator is not supported with CBO
[ https://issues.apache.org/jira/browse/HIVE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16367: --- Status: Patch Available (was: Open) > Null-safe equality <=> operator is not supported with CBO > > > Key: HIVE-16367 > URL: https://issues.apache.org/jira/browse/HIVE-16367 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16367.1.patch > > > Calcite doesn't support such an equality operator, so Hive bails out and goes > through the non-CBO path. This could restrict its usage with subqueries and > other CBO-only features. > Since {{<=>}} is equivalent to {{is not distinct from}} (HIVE-15986) we can > rewrite {{<=>}} to {{is not distinct from}} and enable CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16367) Null-safe equality <=> operator is not supported with CBO
[ https://issues.apache.org/jira/browse/HIVE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980585#comment-15980585 ] Vineet Garg commented on HIVE-16367: Attaching initial patch which enables CBO for {{<=>}} by rewriting it into {{is not distinct from}}. This patch has an issue where a join with {{<=>}} is now a cross product instead of an inner join. This needs to be investigated and fixed. > Null-safe equality <=> operator is not supported with CBO > > > Key: HIVE-16367 > URL: https://issues.apache.org/jira/browse/HIVE-16367 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16367.1.patch > > > Calcite doesn't support such an equality operator, so Hive bails out and goes > through the non-CBO path. This could restrict its usage with subqueries and > other CBO-only features. > Since {{<=>}} is equivalent to {{is not distinct from}} (HIVE-15986) we can > rewrite {{<=>}} to {{is not distinct from}} and enable CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
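The equivalence the patch relies on — {{<=>}} behaving like {{is not distinct from}} — can be sketched outside Hive with SQL NULL modeled as Python's None. This is an illustration of the operator semantics only, not of the Calcite rewrite itself:

```python
def sql_eq(a, b):
    # Plain SQL '=': yields NULL (here: None) if either side is NULL.
    if a is None or b is None:
        return None
    return a == b

def null_safe_eq(a, b):
    # '<=>' / IS NOT DISTINCT FROM: never NULL; NULL compares equal to NULL,
    # which is why the two forms are interchangeable for the CBO rewrite.
    if a is None or b is None:
        return a is None and b is None
    return a == b
```

The two functions differ only on NULL inputs, which is exactly the case that makes `<=>` "null-safe".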
[jira] [Updated] (HIVE-16367) Null-safe equality <=> operator is not supported with CBO
[ https://issues.apache.org/jira/browse/HIVE-16367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16367: --- Attachment: HIVE-16367.1.patch > Null-safe equality <=> operator is not supported with CBO > > > Key: HIVE-16367 > URL: https://issues.apache.org/jira/browse/HIVE-16367 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-16367.1.patch > > > Calcite doesn't support such an equality operator, so Hive bails out and goes > through the non-CBO path. This could restrict its usage with subqueries and > other CBO-only features. > Since {{<=>}} is equivalent to {{is not distinct from}} (HIVE-15986) we can > rewrite {{<=>}} to {{is not distinct from}} and enable CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15982) Support the width_bucket function
[ https://issues.apache.org/jira/browse/HIVE-15982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980578#comment-15980578 ] Carter Shanklin commented on HIVE-15982: I ran a test suite I use against this and noted a few issues in HIVE-16513. I think the biggest challenge will be that floating point numbers can't be used. I hope I didn't confuse the issue when I said "numeric value expressions" above. > Support the width_bucket function > - > > Key: HIVE-15982 > URL: https://issues.apache.org/jira/browse/HIVE-15982 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Sahil Takiar > Labels: TODOC3.0 > Fix For: 3.0.0 > > Attachments: HIVE-15982.1.patch, HIVE-15982.2.patch, > HIVE-15982.3.patch, HIVE-15982.4.patch, HIVE-15982.5.patch, HIVE-15982.6.patch > > > Support the width_bucket(wbo, wbb1, wbb2, wbc) which returns an integer > between 0 and wbc+1 by mapping wbo into the ith equally sized bucket made by > dividing the interval between wbb1 and wbb2 into equally sized regions. If wbo < wbb1, return 0, > if wbo >= wbb2 return wbc+1. Reference: SQL standard section 4.4. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16513) width_bucket issues
[ https://issues.apache.org/jira/browse/HIVE-16513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980577#comment-15980577 ] Carter Shanklin commented on HIVE-16513: Further note, the error you get for the case items is: select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) from e011_01; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating width_bucket(5, c2, CASE WHEN ((c1 = 1)) THEN ((c1 * 2)) ELSE ((c1 * 3)) END, 10) Time taken: 0.087 seconds > width_bucket issues > --- > > Key: HIVE-16513 > URL: https://issues.apache.org/jira/browse/HIVE-16513 > Project: Hive > Issue Type: Bug >Reporter: Carter Shanklin > > width_bucket was recently added with HIVE-15982. This ticket notes a few > issues. > Usability issue: > Currently only accepts integral numeric types. Decimals, floats and doubles > are not supported. > Runtime failures: This query will cause a runtime divide-by-zero in the > reduce stage. > select width_bucket(c1, 0, c1*2, 10) from e011_01 group by c1; > The divide-by-zero seems to trigger any time I use a group-by. Here's another > example (that actually requires the group-by): > select width_bucket(c1, 0, max(c1), 10) from e011_01 group by c1; > Advanced Usage Issues: > Suppose you have a table e011_01 as follows: > create table e011_01 (c1 integer, c2 smallint); > insert into e011_01 values (1, 1), (2, 2); > Compile-time problems: > You cannot use simple case expressions, searched case expressions or grouping > sets. 
These queries fail: > select width_bucket(5, c2, case c1 when 1 then c1 * 2 else c1 * 3 end, 10) > from e011_01; > select width_bucket(5, c2, case when c1 < 2 then c1 * 2 else c1 * 3 end, 10) > from e011_01; > select width_bucket(5, c2, max(c1)*10, cast(grouping(c1, c2)*20+1 as > integer)) from e011_02 group by cube(c1, c2); > I'll admit the grouping one is pretty contrived but the case ones seem > straightforward, valid, and it's strange that they don't work. Similar > queries work with other UDFs like sum. Why wouldn't they "just work"? Maybe > [~ashutoshc] can lend some perspective on that? > Interestingly, you can use window functions in width_bucket, example: > select width_bucket(rank() over (order by c2), 0, 10, 10) from e011_01; > works just fine. Hopefully we can get to a place where people implementing > functions like this don't need to think about value expression support but we > don't seem to be there yet. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats
[ https://issues.apache.org/jira/browse/HIVE-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16495: --- Status: Patch Available (was: Open) > ColumnStats merge should consider the accuracy of the current stats > --- > > Key: HIVE-16495 > URL: https://issues.apache.org/jira/browse/HIVE-16495 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16495.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats
[ https://issues.apache.org/jira/browse/HIVE-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16495: --- Attachment: HIVE-16495.01.patch > ColumnStats merge should consider the accuracy of the current stats > --- > > Key: HIVE-16495 > URL: https://issues.apache.org/jira/browse/HIVE-16495 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-16495.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16366: --- Status: Patch Available (was: Open) > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > Attachments: HIVE-16366-branch-2.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16366: --- Status: Open (was: Patch Available) > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > Attachments: HIVE-16366-branch-2.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16366: --- Attachment: HIVE-16366-branch-2.3.patch > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > Attachments: HIVE-16366-branch-2.3.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HIVE-16366) Hive 2.3 release planning
[ https://issues.apache.org/jira/browse/HIVE-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-16366: --- Attachment: (was: HIVE-16366-branch-2.3.patch) > Hive 2.3 release planning > - > > Key: HIVE-16366 > URL: https://issues.apache.org/jira/browse/HIVE-16366 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Labels: 2.3.0 > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects
[ https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980455#comment-15980455 ] Hive QA commented on HIVE-16079: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864561/HIVE-16079.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10628 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4852/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4852/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4852/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12864561 - PreCommit-HIVE-Build > HS2: high memory pressure due to duplicate Properties objects > - > > Key: HIVE-16079 > URL: https://issues.apache.org/jira/browse/HIVE-16079 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, > HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt > > > I've created a Hive table with 2000 partitions, each backed by two files, > with one row in each file. When I execute some number of concurrent queries > against this table, e.g. 
as follows > {code} > for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p > admin -e "select count(i_f_1) from misha_table;" & done > {code} > it results in a big memory spike. With 20 queries I caused an OOM in a HS2 > server with -Xmx200m and with 50 queries - in the one with -Xmx500m. > I am attaching the results of jxray (www.jxray.com) analysis of a heap dump > that was generated in the 50queries/500m heap scenario. It suggests that > there are several opportunities to reduce memory pressure with not very > invasive changes to the code. One (duplicate strings) has been addressed in > https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going > to address the fact that almost 20% of memory is used by instances of > java.util.Properties. These objects are highly duplicated, since for each > partition each concurrently running query creates its own copy of Partition, > PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 > partitions) Properties in memory. By interning/deduplicating these objects we > may be able to save perhaps 15% of memory. > Note, however, that if there are queries that mutate partitions, the > corresponding Properties would be mutated as well. Thus we cannot simply use > a single "canonicalized" Properties object at all times for all Partition > objects representing the same DB partition. Instead, I am going to introduce > a special CopyOnFirstWriteProperties class. Such an object initially > internally references a canonicalized Properties object, and keeps doing so > while only read methods are called. However, once any mutating method is > called, the given CopyOnFirstWriteProperties copies the data into its own > table from the canonicalized table, and uses it ever after. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
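The copy-on-first-write behavior described in the ticket can be sketched in a few lines. This is a Python stand-in for the proposed Java CopyOnFirstWriteProperties; the class and method names here are illustrative, not the patch's API:

```python
class CopyOnFirstWriteProps:
    """Shares one canonical dict across readers; copies on the first write."""

    def __init__(self, canonical):
        self._props = canonical   # shared, interned table
        self._owned = False       # becomes True after the first mutation

    def get(self, key, default=None):
        # Reads go straight to whichever table we currently reference.
        return self._props.get(key, default)

    def set(self, key, value):
        if not self._owned:
            # First mutating call: detach from the shared table, so the
            # canonical copy used by other partitions stays untouched.
            self._props = dict(self._props)
            self._owned = True
        self._props[key] = value
```

With 50 queries over 2,000 partitions, all ~100,000 such wrappers could share one table per distinct partition until (if ever) one of them is mutated, which is where the projected ~15% memory saving comes from.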
[jira] [Issue Comment Deleted] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects
[ https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-16079: --- Comment: was deleted (was: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12864561/HIVE-16079.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10626 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=225) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=96) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4834/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4834/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4834/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12864561 - PreCommit-HIVE-Build) > HS2: high memory pressure due to duplicate Properties objects > - > > Key: HIVE-16079 > URL: https://issues.apache.org/jira/browse/HIVE-16079 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, > HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt > > > I've created a Hive table with 2000 partitions, each backed by two files, > with one row in each file. When I execute some number of concurrent queries > against this table, e.g. 
as follows > {code} > for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p > admin -e "select count(i_f_1) from misha_table;" & done > {code} > it results in a big memory spike. With 20 queries I caused an OOM in a HS2 > server with -Xmx200m and with 50 queries - in the one with -Xmx500m. > I am attaching the results of jxray (www.jxray.com) analysis of a heap dump > that was generated in the 50queries/500m heap scenario. It suggests that > there are several opportunities to reduce memory pressure with not very > invasive changes to the code. One (duplicate strings) has been addressed in > https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going > to address the fact that almost 20% of memory is used by instances of > java.util.Properties. These objects are highly duplicate, since for each > partition each concurrently running query creates its own copy of Partion, > PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 > partitions) Properties in memory. By interning/deduplicating these objects we > may be able to save perhaps 15% of memory. > Note, however, that if there are queries that mutate partitions, the > corresponding Properties would be mutated as well. Thus we cannot simply use > a single "canonicalized" Properties object at all times for all Partition > objects representing the same DB partition. Instead, I am going to introduce > a special CopyOnFirstWriteProperties class. Such an object initially > internally references a canonicalized Properties object, and keeps doing so > while only read methods are called. However, once any mutating method is > called, the given CopyOnFirstWriteProperties copies the data into its own > table from the canonicalized table, and uses it ever after. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-15229) 'like any' and 'like all' operators in hive
[ https://issues.apache.org/jira/browse/HIVE-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980314#comment-15980314 ] Simanchal Das commented on HIVE-15229: -- Hi [~cwsteinbach] I have refreshed the patch with the latest code. > 'like any' and 'like all' operators in hive > --- > > Key: HIVE-15229 > URL: https://issues.apache.org/jira/browse/HIVE-15229 > Project: Hive > Issue Type: New Feature > Components: Operators >Reporter: Simanchal Das >Assignee: Simanchal Das >Priority: Minor > Attachments: HIVE-15229.1.patch, HIVE-15229.2.patch, > HIVE-15229.3.patch, HIVE-15229.4.patch, HIVE-15229.5.patch, HIVE-15229.6.patch > > > In Teradata, the 'like any' and 'like all' operators are mostly used when > matching a text field against a number of patterns. > 'like any' and 'like all' are equivalent to multiple LIKE conditions, > as in the example below. > {noformat} > --like any > select col1 from table1 where col2 like any ('%accountant%', '%accounting%', > '%retail%', '%bank%', '%insurance%'); > --Can be written using multiple like conditions > select col1 from table1 where col2 like '%accountant%' or col2 like > '%accounting%' or col2 like '%retail%' or col2 like '%bank%' or col2 like > '%insurance%' ; > --like all > select col1 from table1 where col2 like all ('%accountant%', '%accounting%', > '%retail%', '%bank%', '%insurance%'); > --Can be written using multiple like conditions > select col1 from table1 where col2 like '%accountant%' and col2 like > '%accounting%' and col2 like '%retail%' and col2 like '%bank%' and col2 like > '%insurance%' ; > {noformat} > Problem statement: > Nowadays, many data warehouse projects are being migrated from Teradata > to Hive. > Data engineers and business analysts are frequently looking for these two > operators. > If we introduce these two operators in Hive, many scripts can be migrated > smoothly instead of rewriting these operators as multiple LIKE > conditions. > Result: > 1. 
The 'LIKE ANY' operator returns true if the text (column value) matches any > pattern. > 2. The 'LIKE ALL' operator returns true if the text (column value) matches all > patterns. > 3. 'LIKE ANY' and 'LIKE ALL' return NULL not only if the expression on the > left-hand side is NULL, but also if one of the patterns in the list is NULL.
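The three-valued semantics listed above can be sketched as follows. This is an illustrative model, not the Hive implementation (which is built as UDFs inside the query engine); the helper names `likeAny`/`likeAll` and the per-character pattern translation are assumptions of this sketch, and a `Boolean` of `null` stands in for SQL NULL.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical model of the LIKE ANY / LIKE ALL semantics described in
// HIVE-15229; not the actual Hive UDF code.
public final class LikeAnyAll {
    // Translate a SQL LIKE pattern ('%' = any run, '_' = any single char)
    // into an anchored regex and test it against the text.
    static boolean like(String text, String pattern) {
        StringBuilder re = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%')      re.append(".*");
            else if (c == '_') re.append('.');
            else               re.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(re.toString(), Pattern.DOTALL)
                      .matcher(text).matches();
    }

    // Rule 3: NULL text or any NULL pattern yields NULL (modeled as null).
    public static Boolean likeAny(String text, List<String> patterns) {
        if (text == null || patterns.contains(null)) return null;
        for (String p : patterns)
            if (like(text, p)) return true;   // rule 1: any match suffices
        return false;
    }

    public static Boolean likeAll(String text, List<String> patterns) {
        if (text == null || patterns.contains(null)) return null;
        for (String p : patterns)
            if (!like(text, p)) return false; // rule 2: every pattern must match
        return true;
    }
}
```

Note that rule 3 as stated in the ticket makes a single NULL pattern poison the whole list, even when another pattern would have matched; the sketch follows that stated behavior.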