[jira] [Commented] (HIVE-13988) zero length file is being created for empty bucket in tez mode (I)
[ https://issues.apache.org/jira/browse/HIVE-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345425#comment-15345425 ] Pengcheng Xiong commented on HIVE-13988:
----------------------------------------

[~ashutoshc], your comments are valid. Could you take another look? I tried to use only a move task, but it turned out to be more complicated than I thought: the move task is followed by a stats task, and we also need to make the stats work. Thus, I made only a very limited optimization, i.e., when there is only one "insert into", we skip the task compilation. Please see the attached q files for examples.

> zero length file is being created for empty bucket in tez mode (I)
> ------------------------------------------------------------------
>
>                 Key: HIVE-13988
>                 URL: https://issues.apache.org/jira/browse/HIVE-13988
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-13988.01.patch, HIVE-13988.02.patch
>
>
> Even though the bucket is empty, a zero length file is being created in tez mode.
> Steps to reproduce the issue:
> {noformat}
> hive> set hive.execution.engine;
> hive.execution.engine=tez
> hive> drop table if exists emptybucket_orc;
> OK
> Time taken: 5.416 seconds
> hive> create table emptybucket_orc(age int) clustered by (age) sorted by
> (age) into 99 buckets stored as orc;
> OK
> Time taken: 0.493 seconds
> hive> insert into table emptybucket_orc select distinct(age) from
> studenttab10k limit 0;
> Query ID = hrt_qa_20160523231955_8b981be7-68c4-4416-8a48-5f8c7ff551c3
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (Executing on YARN cluster with App id
> application_1464045121842_0002)
> --------------------------------------------------------------------------------
> VERTICES      MODE    STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --------------------------------------------------------------------------------
> Map 1 ....    llap    SUCCEEDED      1          1        0        0       0       0
> Reducer 2 ..  llap    SUCCEEDED      1          1        0        0       0       0
> Reducer 3 ..  llap    SUCCEEDED      1          1        0        0       0       0
> Reducer 4 ..  llap    SUCCEEDED     99         99        0        0       0       0
> --------------------------------------------------------------------------------
> VERTICES: 04/04  [==========>>] 100%  ELAPSED TIME: 11.00 s
> --------------------------------------------------------------------------------
> Loading data to table default.emptybucket_orc
> OK
> Time taken: 16.907 seconds
> hive> dfs -ls /apps/hive/warehouse/emptybucket_orc;
> Found 99 items
> -rwxrwxrwx   3 hrt_qa hdfs    0 2016-05-23 23:20
> /apps/hive/warehouse/emptybucket_orc/00_0
> -rwxrwxrwx   3 hrt_qa hdfs    0 2016-05-23 23:20
> /apps/hive/warehouse/emptybucket_orc/01_0
> ..
> {noformat}
> Expected behavior:
> In tez mode, a zero length file shouldn't get created on hdfs if the bucket is empty

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
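The expected behavior above (no HDFS file at all for an empty bucket) amounts to creating each bucket's output file lazily, on the first row written, rather than eagerly at task start. A minimal Python sketch of that idea, with hypothetical names; this is an illustration of the technique, not Hive's actual file-sink logic:

```python
import os
import tempfile

class LazyBucketWriter:
    """Creates the bucket's output file only when the first row arrives."""
    def __init__(self, out_dir, bucket_id):
        self.path = os.path.join(out_dir, "%05d_0" % bucket_id)
        self._fh = None

    def write(self, row):
        if self._fh is None:           # defer file creation to the first row
            self._fh = open(self.path, "w")
        self._fh.write(row + "\n")

    def close(self):
        if self._fh is not None:
            self._fh.close()

out_dir = tempfile.mkdtemp()
writers = [LazyBucketWriter(out_dir, b) for b in range(4)]
writers[1].write("age=30")             # only bucket 1 receives any data
for w in writers:
    w.close()

created = sorted(os.listdir(out_dir))  # only the non-empty bucket has a file
```

With eager creation, all four buckets would leave zero-length files behind, which is exactly the reported symptom.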
[jira] [Commented] (HIVE-11527) bypass HiveServer2 thrift interface for query results
[ https://issues.apache.org/jira/browse/HIVE-11527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345252#comment-15345252 ] Thejas M Nair commented on HIVE-11527:
--------------------------------------

[~tasanuma0829] I think it makes sense to explore whether we can extend typeDesc to encode the full information for complex types. Otherwise, we have duplicate information being sent. ([~prasadm] [~cwsteinbach] please chime in if you have tried that.)
The serialization format is hardcoded in this case. If we use a single serde format, using LazyBinarySerDe instead of LazySimpleSerDe is likely to be more performant IMO. [~gopalv] What are your thoughts on that?

> bypass HiveServer2 thrift interface for query results
> -----------------------------------------------------
>
>                 Key: HIVE-11527
>                 URL: https://issues.apache.org/jira/browse/HIVE-11527
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>            Reporter: Sergey Shelukhin
>            Assignee: Takanobu Asanuma
>         Attachments: HIVE-11527.10.patch, HIVE-11527.11.patch, HIVE-11527.WIP.patch
>
>
> Right now, HS2 reads query results and returns them to the caller via its thrift API.
> There should be an option for HS2 to return some pointer to results (an HDFS link?) and for the user to read the results directly off HDFS inside the cluster, or via something like WebHDFS outside the cluster.
> Review board link: https://reviews.apache.org/r/40867
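The text-vs-binary serde trade-off mentioned in the comment can be illustrated with a toy row encoding. This is only a sketch of the general contrast (hypothetical field layout), not the actual LazySimpleSerDe or LazyBinarySerDe wire formats: text encodings stringify and re-parse every field, while binary encodings decode with fixed widths and offset arithmetic.

```python
import struct

row = (42, "alice", 3.5)

# Text-style encoding: every field becomes a string joined by a delimiter,
# and reading back requires parsing each field from text.
text = "\x01".join([str(row[0]), row[1], str(row[2])]).encode("utf-8")
f1, f2, f3 = text.decode("utf-8").split("\x01")
text_row = (int(f1), f2, float(f3))

# Binary-style encoding: fixed-width numerics plus a length-prefixed string;
# decoding is offset arithmetic rather than string parsing.
name = row[1].encode("utf-8")
binary = struct.pack("<iI", row[0], len(name)) + name + struct.pack("<d", row[2])
i, n = struct.unpack_from("<iI", binary, 0)
s = binary[8:8 + n].decode("utf-8")
d, = struct.unpack_from("<d", binary, 8 + n)
binary_row = (i, s, d)
```

Both round-trip the same row; the binary path avoids the int/float formatting and parsing that dominates text-serde cost on wide rows.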
[jira] [Updated] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong performance numbers on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14070:
-----------------------------------
    Status: Patch Available  (was: Open)

> hive.tez.exec.print.summary=true returns wrong performance numbers on HS2
> -------------------------------------------------------------------------
>
>                 Key: HIVE-14070
>                 URL: https://issues.apache.org/jira/browse/HIVE-14070
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-14070.01.patch, HIVE-14070.02.patch, HIVE-14070.03.patch
>
>
> On master, we have
> {code}
> Query Execution Summary
> ----------------------------------------------
> OPERATION                             DURATION
> ----------------------------------------------
> Compile Query                  -1466208820.74s
> Prepare Plan                             0.00s
> Submit Plan                     1466208825.50s
> Start DAG                                0.26s
> Run DAG                                  4.39s
> ----------------------------------------------
> Task Execution Summary
> ----------------------------------------------------------------------------------
> VERTICES   DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> ----------------------------------------------------------------------------------
> Map 1           1014.00         1,534           11          1,500               1
> Reducer 2         96.00            54           10              1               0
> ----------------------------------------------------------------------------------
> {code}
> sounds like a real issue.
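The quoted summary's broken numbers have a telltale shape: ±1.46e9 seconds is roughly the Unix epoch timestamp for June 2016. Durations like that arise when one endpoint of a start/end pair was never recorded and defaults to zero. A minimal sketch of how such values can appear, using hypothetical names (this is an illustration of the failure mode, not the actual Hive PerfLogger code):

```python
import time

perf_log = {}  # phase -> (start, end), both in epoch seconds; 0.0 = unset

def record(phase, start=None, end=None):
    s, e = perf_log.get(phase, (0.0, 0.0))
    perf_log[phase] = (start if start is not None else s,
                       end if end is not None else e)

def duration(phase):
    start, end = perf_log.get(phase, (0.0, 0.0))
    return end - start  # bogus whenever either endpoint is still 0.0

now = time.time()
record("Compile Query", start=now)            # end never recorded
record("Submit Plan", end=now)                # start never recorded
record("Run DAG", start=now, end=now + 4.39)  # both recorded

compile_d = duration("Compile Query")  # large negative: 0 - epoch
submit_d = duration("Submit Plan")     # large positive: epoch - 0
run_d = duration("Run DAG")            # the only sensible number
```

A missing end gives a huge negative duration (like "Compile Query" above) and a missing start gives a huge positive one (like "Submit Plan"), matching the two anomalies in the report.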
[jira] [Updated] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong performance numbers on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14070:
-----------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong performance numbers on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14070:
-----------------------------------
    Attachment: HIVE-14070.03.patch
[jira] [Updated] (HIVE-14055) directSql - getting the number of partitions is broken
[ https://issues.apache.org/jira/browse/HIVE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14055:
------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.1.1
                   2.2.0
           Status: Resolved  (was: Patch Available)

Looks like the original patch is not in Hive 1. Committed to all the relevant branches. Thanks for the review!

> directSql - getting the number of partitions is broken
> ------------------------------------------------------
>
>                 Key: HIVE-14055
>                 URL: https://issues.apache.org/jira/browse/HIVE-14055
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: 2.2.0, 2.1.1
>
>         Attachments: HIVE-14055.01.patch, HIVE-14055.02.patch, HIVE-14055.patch
>
>
> Noticed while looking at something else. If the filter cannot be pushed down, it just returns 0.
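The bug described above (silently returning 0 when the filter cannot be pushed down) illustrates a general pattern: a fast path that cannot handle a request should fall back to the slow path, not return a default. A toy sketch of that pattern in Python, with hypothetical names; this is not the metastore's actual direct-SQL code:

```python
partitions = ["ds=2016-06-01", "ds=2016-06-02", "ds=2016-06-03"]

class NotPushableError(Exception):
    pass

def count_direct_sql(filter_expr):
    """Fast path: only understands simple 'ds=' prefix filters."""
    if not filter_expr.startswith("ds="):
        raise NotPushableError(filter_expr)
    return sum(1 for p in partitions if p.startswith(filter_expr))

def eval_filter(p, filter_expr):
    # Toy "slow path" evaluator: substring match stands in for a full
    # expression interpreter.
    return filter_expr.strip("()") in p

def count_partitions(filter_expr):
    try:
        return count_direct_sql(filter_expr)
    except NotPushableError:
        # Fall back to full evaluation instead of silently returning 0,
        # which is the buggy behavior reported above.
        return sum(1 for p in partitions if eval_filter(p, filter_expr))

n_fast = count_partitions("ds=2016-06-02")  # handled by the fast path
n_slow = count_partitions("(2016-06)")      # fast path refuses, falls back
```

The key point is that the caller gets a correct count either way; only the mechanism differs.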
[jira] [Commented] (HIVE-14077) revert or fix HIVE-13380
[ https://issues.apache.org/jira/browse/HIVE-14077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345490#comment-15345490 ] Sergey Shelukhin commented on HIVE-14077:
-----------------------------------------

I'll use this JIRA to add a test.

> revert or fix HIVE-13380
> ------------------------
>
>                 Key: HIVE-14077
>                 URL: https://issues.apache.org/jira/browse/HIVE-14077
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Blocker
>
>
> See comments in that JIRA
[jira] [Commented] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong performance numbers on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345511#comment-15345511 ] Pengcheng Xiong commented on HIVE-14070:
----------------------------------------

Addressed [~thejas]'s comments and the test failures.
[jira] [Updated] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14068:
------------------------------------
    Attachment: HIVE-14068.02.patch

Updated... [~thejas] I bet those things could be fixed on commit :P

> make more effort to find hive-site.xml
> --------------------------------------
>
>                 Key: HIVE-14068
>                 URL: https://issues.apache.org/jira/browse/HIVE-14068
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14068.01.patch, HIVE-14068.02.patch, HIVE-14068.patch
>
>
> It pretty much doesn't make sense to run Hive w/o the config, so we should make more effort to find one if it's missing on the classpath, or the classloader does not return it for some reason (e.g. the classloader ignores some permission issues; explicitly looking for the file may expose them better).
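The "more effort" described above is a probe-a-list-of-locations fallback: if the classpath lookup fails, check well-known directories explicitly. A minimal sketch of that search, with hypothetical names; not the actual HiveConf resolution code:

```python
import os
import tempfile

def find_hive_site(search_dirs):
    """Return the first hive-site.xml found in the given directories.

    Stands in for 'make more effort': instead of trusting a single
    classpath lookup, probe an explicit list of candidate locations.
    """
    for d in search_dirs:
        candidate = os.path.join(d, "hive-site.xml")
        if os.path.isfile(candidate):
            return candidate
    return None

# Simulate a conf dir that the classloader failed to report.
conf_dir = tempfile.mkdtemp()
open(os.path.join(conf_dir, "hive-site.xml"), "w").close()

found = find_hive_site(["/nonexistent/classpath/dir", conf_dir])
missing = find_hive_site(["/nonexistent/classpath/dir"])
```

An explicit file probe also surfaces permission errors that a classloader may swallow silently, which is the motivation stated in the issue.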
[jira] [Updated] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14071:
------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.1.1
                   2.2.0
           Status: Resolved  (was: Patch Available)

Committed to branches; thanks for the review!

> HIVE-14014 breaks non-file outputs
> ----------------------------------
>
>                 Key: HIVE-14071
>                 URL: https://issues.apache.org/jira/browse/HIVE-14071
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: 2.2.0, 2.1.1
>
>         Attachments: HIVE-14071.patch, HIVE-14071.patch
>
>
> Cannot avoid creating outputs when outputs are e.g. streaming
[jira] [Updated] (HIVE-13617) LLAP: support non-vectorized execution in IO
[ https://issues.apache.org/jira/browse/HIVE-13617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13617:
------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0
           Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> LLAP: support non-vectorized execution in IO
> --------------------------------------------
>
>                 Key: HIVE-13617
>                 URL: https://issues.apache.org/jira/browse/HIVE-13617
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: 2.2.0
>
>         Attachments: HIVE-13617-wo-11417.patch, HIVE-13617-wo-11417.patch, HIVE-13617.01.patch, HIVE-13617.03.patch, HIVE-13617.04.patch, HIVE-13617.05.patch, HIVE-13617.06.patch, HIVE-13617.patch, HIVE-13617.patch, HIVE-15396-with-oi.patch
>
>
> Two approaches - a separate decoding path, into rows instead of VRBs; or decoding VRBs into rows on a higher level (the original LlapInputFormat). I think the latter might be better - it's not a hugely important path, and perf in the non-vectorized case is not the best anyway, so it's better to make do with much less new code and architectural disruption.
> Some ORC patches in progress introduce an easy-to-reuse (or so I hope, anyway) VRB-to-row conversion, so we should just use that.
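The "VRB-to-row conversion" chosen above means transposing a column-major vectorized batch into ordinary row tuples at a higher layer. A toy Python sketch of that transposition (hypothetical names, and a plain dict standing in for a VectorizedRowBatch; not the actual ORC/Hive code):

```python
def batch_to_rows(batch, selected=None):
    """Convert a column-major batch into row tuples.

    'batch' maps column name -> list of values, all lists the same length.
    'selected' optionally restricts and reorders rows, mimicking a
    selection vector.
    """
    cols = list(batch.values())
    size = len(cols[0])
    indices = selected if selected is not None else range(size)
    return [tuple(col[i] for col in cols) for i in indices]

vrb = {"age": [30, 41, 25], "name": ["a", "b", "c"]}
rows = batch_to_rows(vrb)                      # full batch, in order
picked = batch_to_rows(vrb, selected=[2, 0])   # honors a selection vector
```

This is the cheap, low-disruption direction the comment argues for: the IO layer keeps producing batches, and only non-vectorized consumers pay the per-row transposition cost.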
[jira] [Commented] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong performance numbers on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345567#comment-15345567 ] Pengcheng Xiong commented on HIVE-14070:
----------------------------------------

[~prasanth_j], could you take a final look? Thanks.
[jira] [Updated] (HIVE-13988) zero length file is being created for empty bucket in tez mode (I)
[ https://issues.apache.org/jira/browse/HIVE-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13988:
-----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-13988) zero length file is being created for empty bucket in tez mode (I)
[ https://issues.apache.org/jira/browse/HIVE-13988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13988:
-----------------------------------
    Status: Open  (was: Patch Available)
[jira] [Commented] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345654#comment-15345654 ] Hive QA commented on HIVE-14074:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12812632/HIVE-14074.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 232 failed/errored test(s), 10258 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_2_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby_empty
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_semijoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_not_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_udf_udaf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_union
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_views
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_windowing
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_mat_4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_mat_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_non_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_orig_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_tmp_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_no_match
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_non_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_whole_partition
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_orig_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_update_delete
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llapdecider
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_decimal
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge11
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge12
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge7
[jira] [Commented] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong performance numbers on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345557#comment-15345557 ] Hive QA commented on HIVE-14070:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12812394/HIVE-14070.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10257 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/225/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/225/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-225/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12812394 - PreCommit-HIVE-MASTER-Build
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13872:
--------------------------------
    Attachment: HIVE-13872.05.patch

> Vectorization: Fix cross-product reduce sink serialization
> ----------------------------------------------------------
>
>                 Key: HIVE-13872
>                 URL: https://issues.apache.org/jira/browse/HIVE-13872
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 2.1.0
>            Reporter: Gopal V
>            Assignee: Matt McCline
>         Attachments: HIVE-13872.01.patch, HIVE-13872.02.patch, HIVE-13872.03.patch, HIVE-13872.04.patch, HIVE-13872.05.patch, HIVE-13872.WIP.patch, customer_demographics.txt, vector_include_no_sel.q, vector_include_no_sel.q.out
>
>
> TPC-DS Q13 produces a cross-product without CBO simplifying the query
> {code}
> Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 projection column num 1
>         at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349)
>         at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267)
>         at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343)
>         at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130)
>         at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762)
>         ... 18 more
> {code}
> Simplified query
> {code}
> set hive.cbo.enable=false;
> -- explain
> select count(1)
> from store_sales, customer_demographics
> where (
>     (
>         customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk
>         and customer_demographics.cd_marital_status = 'M'
>     ) or (
>         customer_demographics.cd_demo_sk = ss_cdemo_sk
>         and customer_demographics.cd_marital_status = 'U'
>     ))
> ;
> {code}
> {code}
> Map 3
>     Map Operator Tree:
>         TableScan
>             alias: customer_demographics
>             Statistics: Num rows: 1920800 Data size: 717255532 Basic stats: COMPLETE Column stats: NONE
>             Reduce Output Operator
>                 sort order:
>                 Statistics: Num rows: 1920800 Data size: 717255532 Basic stats: COMPLETE Column stats: NONE
>                 value expressions: cd_demo_sk (type: int), cd_marital_status (type: string)
>     Execution mode: vectorized, llap
>     LLAP IO: all inputs
> {code}
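The plan fragment above shows a Reduce Output Operator with an empty "sort order:", i.e. an empty reduce key — the shape a cross-product shuffle takes, where every row shares the same (empty) key. A toy Python model of that grouping, with hypothetical names; this only illustrates why an empty key collapses everything into one group, not Hive's actual reduce-sink serialization:

```python
def reduce_sink(rows, key_cols):
    """Group rows by their reduce key.

    An empty key_cols list models the cross-product case: every row
    gets the same empty-tuple key and lands in a single group.
    """
    groups = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        groups.setdefault(key, []).append(row)
    return groups

rows = [{"cd_demo_sk": 1, "cd_marital_status": "M"},
        {"cd_demo_sk": 2, "cd_marital_status": "U"}]

keyed = reduce_sink(rows, ["cd_demo_sk"])  # normal join: one group per key
cross = reduce_sink(rows, [])              # empty sort order: one group
```

The vectorized serializer has to handle that degenerate empty-key row shape correctly, which is the case the patches above address.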
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13872:
--------------------------------
    Attachment: HIVE-13872.05.patch
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13872:
--------------------------------
    Status: In Progress  (was: Patch Available)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13872: Attachment: (was: HIVE-13872.05.patch)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13872: Status: Patch Available (was: In Progress)
[jira] [Commented] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345690#comment-15345690 ] Matt McCline commented on HIVE-13872: Thank you Gopal for the review!
[jira] [Commented] (HIVE-13636) Exception using Postgres as metastore with ACID transactions enabled
[ https://issues.apache.org/jira/browse/HIVE-13636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345738#comment-15345738 ] Rajkumar Singh commented on HIVE-13636: [~mgaido] it seems that there is no issue with Hive here; you are using an old Postgres JDBC3 driver, which is throwing AbstractMethodError. Could you please update to a JDBC4 driver and see if you still face this issue?

> Exception using Postgres as metastore with ACID transactions enabled
> ---------------------------------------------------------------------
>
>                 Key: HIVE-13636
>                 URL: https://issues.apache.org/jira/browse/HIVE-13636
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore, Transactions
>    Affects Versions: 1.2.1
>        Environment: HDP 2.3.2
>            Reporter: Marco Gaido
>
> We are using Postgres as the metastore and we enabled ACID transactions. Once we had done this, we started facing this error:
> {code}
> FATAL [DeadTxnReaper-0]: txn.AcidHouseKeeperService (AcidHouseKeeperService.java:run(92)) - Serious error in DeadTxnReaper-0: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
> java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
>     at org.postgresql.jdbc3.Jdbc3ResultSet.isClosed(Jdbc3ResultSet.java)
>     at org.apache.hadoop.hive.metastore.txn.TxnHandler.close(TxnHandler.java:934)
>     at org.apache.hadoop.hive.metastore.txn.TxnHandler.close(TxnHandler.java:947)
>     at org.apache.hadoop.hive.metastore.txn.TxnHandler.performTimeOuts(TxnHandler.java:1933)
>     at org.apache.hadoop.hive.ql.txn.AcidHouseKeeperService$TimedoutTxnReaper.run(AcidHouseKeeperService.java:87)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> Looking at the code of the TxnHandler class, the close method actually calls isClosed() on the ResultSet, which is not implemented in the Jdbc3ResultSet class of the Postgres driver.
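The underlying pattern generalizes beyond JDBC: never assume an optional driver method is implemented; probe for it and fall back. A minimal Python sketch of that defensive-close idea (the class and method names are hypothetical stand-ins, not Hive's or the Postgres driver's actual API):

```python
def safe_close(resource):
    """Close a resource without assuming it implements is_closed().

    Mirrors the defensive pattern needed in TxnHandler.close(): probe for
    the optional capability and fall back to a plain close() when the
    driver does not provide it.
    """
    is_closed = getattr(resource, "is_closed", None)
    try:
        if callable(is_closed) and is_closed():
            return  # already closed; nothing to do
    except NotImplementedError:
        # Old drivers may declare the method but not implement it
        # (the Python analogue of Java's AbstractMethodError).
        pass
    resource.close()


class OldDriverResultSet:
    """Hypothetical stand-in for a JDBC3-era result set: is_closed() is
    declared but unimplemented, while close() itself works fine."""

    def __init__(self):
        self.closed = False

    def is_closed(self):
        raise NotImplementedError("isClosed() not implemented in this driver")

    def close(self):
        self.closed = True
```

Here `safe_close(OldDriverResultSet())` succeeds where a naive `if rs.is_closed()` check would crash the reaper thread.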
[jira] [Updated] (HIVE-14078) LLAP input split should get task attempt number from conf if available
[ https://issues.apache.org/jira/browse/HIVE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-14078: Attachment: HIVE-14078.1.patch

> LLAP input split should get task attempt number from conf if available
> ----------------------------------------------------------------------
>
>                 Key: HIVE-14078
>                 URL: https://issues.apache.org/jira/browse/HIVE-14078
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-14078.1.patch
>
> Currently the attempt number is hard-coded to 0. If the split is being fetched as part of a Hadoop job, we can get the task attempt ID from the conf if it has been set, and use the attempt number from that.
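The lookup described above can be sketched as follows. This assumes the standard Hadoop attempt ID layout (`attempt_<clusterTs>_<jobSeq>_<m|r>_<taskId>_<attemptNum>`); the conf key and function name are illustrative, not necessarily the ones Hive reads:

```python
def attempt_number_from_conf(conf, default=0):
    """Return the task attempt number parsed from the configured task
    attempt ID, falling back to a default when it is not set.

    Assumes the standard Hadoop attempt ID layout
    attempt_<clusterTs>_<jobSeq>_<m|r>_<taskId>_<attemptNum>.
    """
    attempt_id = conf.get("mapreduce.task.attempt.id")
    if not attempt_id:
        return default  # behaves like the old hard-coded value of 0
    # The attempt number is the final underscore-separated field.
    return int(attempt_id.rsplit("_", 1)[1])
```

With no attempt ID in the conf the function degrades to the previous behavior; with one set, the trailing field supplies the attempt number.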
[jira] [Updated] (HIVE-14078) LLAP input split should get task attempt number from conf if available
[ https://issues.apache.org/jira/browse/HIVE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-14078: Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-14028) stats is not updated
[ https://issues.apache.org/jira/browse/HIVE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14028: Status: Patch Available (was: Open)

> stats is not updated
> --------------------
>
>                 Key: HIVE-14028
>                 URL: https://issues.apache.org/jira/browse/HIVE-14028
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-14028.01.patch, HIVE-14028.02.patch
>
> {code}
> DROP TABLE users;
> CREATE TABLE users(key string, state string, country string, country_id int)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
>   "hbase.columns.mapping" = "info:state,info:country,info:country_id"
> );
> INSERT OVERWRITE TABLE users SELECT 'user1', 'IA', 'USA', 0 FROM src;
> desc formatted users;
> {code}
> the result is
> {code}
> A masked pattern was here
> Table Type:          MANAGED_TABLE
> Table Parameters:
>   COLUMN_STATS_ACCURATE  {\"BASIC_STATS\":\"true\"}
>   numFiles               0
>   numRows                0
>   rawDataSize            0
>   storage_handler        org.apache.hadoop.hive.hbase.HBaseStorageHandler
>   totalSize              0
> {code}
[jira] [Updated] (HIVE-14028) stats is not updated
[ https://issues.apache.org/jira/browse/HIVE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14028: Attachment: HIVE-14028.02.patch
[jira] [Updated] (HIVE-14028) stats is not updated
[ https://issues.apache.org/jira/browse/HIVE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14028: Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-14078) LLAP input split should get task attempt number from conf if available
[ https://issues.apache.org/jira/browse/HIVE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-14078: Issue Type: Sub-task (was: Bug) Parent: HIVE-12991
[jira] [Updated] (HIVE-13982) Extensions to RS dedup: execute with different column order and sorting direction if possible
[ https://issues.apache.org/jira/browse/HIVE-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13982: Attachment: (was: HIVE-13982.6.patch)

> Extensions to RS dedup: execute with different column order and sorting direction if possible
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13982
>                 URL: https://issues.apache.org/jira/browse/HIVE-13982
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer
>    Affects Versions: 2.2.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-13982.2.patch, HIVE-13982.3.patch, HIVE-13982.4.patch, HIVE-13982.5.patch, HIVE-13982.patch
>
> Pointed out by [~gopalv].
> RS dedup should kick in for these cases, avoiding an additional shuffle stage.
> {code}
> select state, city, sum(sales) from table
> group by state, city
> order by state, city
> limit 10;
> {code}
> {code}
> select state, city, sum(sales) from table
> group by city, state
> order by state, city
> limit 10;
> {code}
> {code}
> select state, city, sum(sales) from table
> group by city, state
> order by state desc, city
> limit 10;
> {code}
[jira] [Updated] (HIVE-13982) Extensions to RS dedup: execute with different column order and sorting direction if possible
[ https://issues.apache.org/jira/browse/HIVE-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-13982: Attachment: HIVE-13982.6.patch Updated the patch to fix the regression; it had to do with windowing. For now, we do not support reordering of partitioning/ordering within windowing.
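The condition under which the extended dedup can fire for queries like those in the issue description can be modeled roughly as: the group-by and order-by reduce sinks use the same set of key columns, possibly in a different order and with different sort directions. A simplified Python sketch of that check (not Hive's actual optimizer code):

```python
def can_dedup(group_keys, order_keys):
    """Decide whether two reduce sinks can be merged by reordering: the
    group-by shuffle can adopt the order-by's column order (and its sort
    directions) when both stages key on exactly the same set of columns.

    A simplified model of the HIVE-13982 extension; order_keys are
    (column, direction) pairs.
    """
    order_cols = [col for col, _direction in order_keys]
    # Same key set, possibly permuted, means the group-by shuffle can be
    # emitted directly in the order-by's (column, direction) order,
    # saving one shuffle stage.
    return set(group_keys) == set(order_cols) and len(group_keys) == len(order_cols)
```

For instance, with `group by city, state` and `order by state desc, city` the key sets match, so the group-by shuffle can be produced directly in `(state desc, city)` order and the second shuffle is dropped.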
[jira] [Updated] (HIVE-13756) Map failure attempts to delete reducer _temporary directory on multi-query pig query
[ https://issues.apache.org/jira/browse/HIVE-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13756: Attachment: HIVE-13756.1.patch, HIVE-13756.1-branch-1.patch

> Map failure attempts to delete reducer _temporary directory on multi-query pig query
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-13756
>                 URL: https://issues.apache.org/jira/browse/HIVE-13756
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: HIVE-13756-branch-1.patch, HIVE-13756.1-branch-1.patch, HIVE-13756.1.patch, HIVE-13756.patch
>
> A Pig script, executed with multi-query enabled, that reads the source data and writes it as-is into TABLE_A, while also performing a group-by operation on the data that is written into TABLE_B, can produce erroneous results if any map fails. This results in a single MR job that writes the map output to a scratch directory relative to TABLE_A and the reducer output to a scratch directory relative to TABLE_B.
> If one or more maps fail, the cleanup deletes the attempt data relative to TABLE_A, but it also deletes the _temporary directory relative to TABLE_B. This has the unintended side effect of preventing subsequent maps from committing their data. Any maps that completed successfully before the first map failure will have their data committed as expected, while the other maps will not, resulting in an incomplete result set.
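The fix direction implied above is to scope failure cleanup to the failing attempt's own output instead of the shared _temporary directory. A Python sketch of that scoping, with a purely illustrative directory layout (not HCatalog's actual committer code):

```python
import shutil
from pathlib import Path


def abort_attempt(scratch_dir, attempt_id):
    """On map failure, remove only this attempt's output directory.

    Deleting the whole shared _temporary directory would also discard
    output that other, still-running or already-successful attempts need
    in order to commit; removing just the failing attempt's subdirectory
    leaves its siblings untouched.
    """
    attempt_dir = Path(scratch_dir) / "_temporary" / attempt_id
    if attempt_dir.exists():
        shutil.rmtree(attempt_dir)  # scoped delete: siblings survive
```

The key design point is that abort-side cleanup must have strictly narrower scope than job-level cleanup, which runs only once at commit or job abort.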
[jira] [Updated] (HIVE-13754) Fix resource leak in HiveClientCache
[ https://issues.apache.org/jira/browse/HIVE-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13754: Attachment: HIVE-13754.1.patch, HIVE-13754.1-branch-1.patch

> Fix resource leak in HiveClientCache
> ------------------------------------
>
>                 Key: HIVE-13754
>                 URL: https://issues.apache.org/jira/browse/HIVE-13754
>             Project: Hive
>          Issue Type: Bug
>          Components: Clients
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: HIVE-13754-branch-1.patch, HIVE-13754.1-branch-1.patch, HIVE-13754.1.patch, HIVE-13754.patch
>
> Found that the {{users}} reference count can go into negative values, which prevents {{tearDownIfUnused}} from closing the client connection when called.
> This leads to a build-up of clients which have been evicted from the cache and are no longer in use, but have not been shut down.
> GC will eventually call {{finalize}}, which forcibly closes the connection and cleans up the client, but I have seen as many as several hundred open client connections as a result.
> The main cause of this is RetryingMetaStoreClient, which calls {{reconnect}} on acquire, which in turn calls {{close}}. This decrements {{users}} to -1 on the reconnect; acquire then increases it to 0 while the client is in use, and it drops back to -1 on release.
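The leak described above comes down to a reference count that an extra close() can drive below zero. A Python sketch of a guarded counter that keeps the unused-detection workable (hypothetical names, not the actual HiveClientCache API):

```python
class CachedClient:
    """Sketch of a reference-counted cache entry whose count cannot go
    negative, so tear_down_if_unused() can reliably detect 'unused'."""

    def __init__(self):
        self.users = 0
        self.closed = False

    def acquire(self):
        self.users += 1

    def release(self):
        # Guard: ignore a spurious release (e.g. a reconnect path calling
        # close()) instead of driving the count below zero, which would
        # make the entry look permanently in-use.
        if self.users > 0:
            self.users -= 1

    def tear_down_if_unused(self):
        if self.users == 0 and not self.closed:
            self.closed = True  # stands in for shutting down the connection
```

With the guard, the reconnect-induced extra release is a no-op and the count sits at 0 when the client is genuinely idle, so teardown fires instead of waiting for GC finalization.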
[jira] [Commented] (HIVE-14078) LLAP input split should get task attempt number from conf if available
[ https://issues.apache.org/jira/browse/HIVE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345110#comment-15345110 ] Jason Dere commented on HIVE-14078: cc [~sseth]
[jira] [Updated] (HIVE-13990) Client should not check dfs.namenode.acls.enabled to determine if extended ACLs are supported
[ https://issues.apache.org/jira/browse/HIVE-13990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13990: Attachment: HIVE-13990.1.patch, HIVE-13990.1-branch-1.patch

> Client should not check dfs.namenode.acls.enabled to determine if extended ACLs are supported
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13990
>                 URL: https://issues.apache.org/jira/browse/HIVE-13990
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 1.2.1
>            Reporter: Chris Drome
>         Attachments: HIVE-13990-branch-1.patch, HIVE-13990.1-branch-1.patch, HIVE-13990.1.patch
>
> dfs.namenode.acls.enabled is a server-side configuration, and the client should not presume to know how the server is configured. Barring a method for querying the NameNode whether ACLs are supported, the client should try the operation and catch the appropriate exception.
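The suggested approach, probing the server by attempting the operation and catching the failure, can be sketched as below. The filesystem API is a hypothetical stand-in for the HDFS client; in Java the caught exception would be the server's ACLs-disabled RemoteException rather than Python's NotImplementedError:

```python
def set_acl_if_supported(fs, path, acl):
    """Attempt to set extended ACLs, falling back gracefully when the
    NameNode has them disabled, instead of consulting a server-side
    config key the client cannot reliably see."""
    try:
        fs.set_acl(path, acl)
        return True
    except NotImplementedError:
        # The server rejected the operation: extended ACLs are disabled
        # there. Callers can fall back to plain permission bits.
        return False


class AclsDisabledFs:
    """Fake filesystem modeling a NameNode with ACLs disabled."""

    def set_acl(self, path, acl):
        raise NotImplementedError("ACLs disabled on NameNode")


class AclsEnabledFs:
    """Fake filesystem modeling a NameNode with ACLs enabled."""

    def __init__(self):
        self.calls = []

    def set_acl(self, path, acl):
        self.calls.append((path, acl))
```

The design point is that the server's capability is discovered from its behavior, so the client stays correct no matter how dfs.namenode.acls.enabled is set on the other side.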
[jira] [Updated] (HIVE-13989) Extended ACLs are not handled according to specification
[ https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13989: Attachment: HIVE-13989.1.patch, HIVE-13989.1-branch-1.patch

> Extended ACLs are not handled according to specification
> --------------------------------------------------------
>
>                 Key: HIVE-13989
>                 URL: https://issues.apache.org/jira/browse/HIVE-13989
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: HIVE-13989-branch-1.patch, HIVE-13989.1-branch-1.patch, HIVE-13989.1.patch
[jira] [Commented] (HIVE-13990) Client should not check dfs.namenode.acls.enabled to determine if extended ACLs are supported
[ https://issues.apache.org/jira/browse/HIVE-13990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345117#comment-15345117 ] Chris Drome commented on HIVE-13990: @ashutosh, we are still on branch-1, so I had this patch readily available for branch-1. There was a bit of work to get HIVE-13989, which this depends on, ported to master. Patches are available for master and branch-1 now.
[jira] [Updated] (HIVE-13989) Extended ACLs are not handled according to specification
[ https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13989: Target Version/s: 2.0.0, 1.2.1 (was: 1.2.1, 2.0.0) Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-13754) Fix resource leak in HiveClientCache
[ https://issues.apache.org/jira/browse/HIVE-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13754: Target Version/s: 2.0.0, 1.2.1 (was: 1.2.1, 2.0.0) Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-13989) Extended ACLs are not handled according to specification
[ https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13989: Target Version/s: 2.0.0, 1.2.1 (was: 1.2.1, 2.0.0) Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-13989) Extended ACLs are not handled according to specification
[ https://issues.apache.org/jira/browse/HIVE-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13989: Target Version/s: 2.0.0, 1.2.1 (was: 1.2.1, 2.0.0) Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-13754) Fix resource leak in HiveClientCache
[ https://issues.apache.org/jira/browse/HIVE-13754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-13754: Target Version/s: 2.0.0, 1.2.1 (was: 1.2.1, 2.0.0) Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-13756) Map failure attempts to delete reducer _temporary directory on multi-query pig query
[ https://issues.apache.org/jira/browse/HIVE-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Drome updated HIVE-13756:
-------------------------------
    Target Version/s: 2.0.0, 1.2.1  (was: 1.2.1, 2.0.0)
              Status: Patch Available  (was: Open)

> Map failure attempts to delete reducer _temporary directory on multi-query pig query
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-13756
>                 URL: https://issues.apache.org/jira/browse/HIVE-13756
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 2.0.0, 1.2.1
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: HIVE-13756-branch-1.patch, HIVE-13756.1-branch-1.patch, HIVE-13756.1.patch, HIVE-13756.patch
>
> A pig script, executed with multi-query enabled, that reads the source data and writes it as-is into TABLE_A, while also performing a group-by operation whose result is written into TABLE_B, can produce erroneous results if any map fails. Multi-query produces a single MR job that writes the map output to a scratch directory relative to TABLE_A and the reducer output to a scratch directory relative to TABLE_B.
>
> If one or more maps fail, the cleanup deletes the failed attempt's data relative to TABLE_A, but it also deletes the _temporary directory relative to TABLE_B. This has the unintended side effect of preventing subsequent maps from committing their data: any map that completed successfully before the first failure will have its data committed as expected, while the others will not, resulting in an incomplete result set.
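The HIVE-13756 failure mode can be illustrated with a small in-memory simulation (the paths and method names below are hypothetical, not HCatalog's actual committer code): the failed attempt's cleanup removes the shared _temporary directory of the second output, so data that other tasks still needed in order to commit is lost:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// In-memory simulation of the over-eager cleanup: aborting one failed map
// attempt wipes the *shared* _temporary directory of the second output,
// not just the failed attempt's own data.
public class MultiOutputCleanup {
    static Set<String> fs = new TreeSet<>(Arrays.asList(
        "/warehouse/TABLE_A/_temporary/attempt_0",
        "/warehouse/TABLE_A/_temporary/attempt_1",
        "/warehouse/TABLE_B/_temporary/attempt_r0"));

    // Buggy abort: removes the failed attempt's data under TABLE_A (correct),
    // but also deletes all of TABLE_B's _temporary directory (the bug).
    static void abortAttempt(String failed) {
        fs.removeIf(p -> p.startsWith("/warehouse/TABLE_A/_temporary/" + failed));
        fs.removeIf(p -> p.startsWith("/warehouse/TABLE_B/_temporary"));
    }

    static boolean canCommit(String attempt) {
        return fs.contains("/warehouse/TABLE_A/_temporary/" + attempt);
    }

    public static void main(String[] args) {
        abortAttempt("attempt_0");
        // attempt_1 can still commit its own map output under TABLE_A...
        System.out.println("map commit ok: " + canCommit("attempt_1"));
        // ...but the pending reducer data for TABLE_B is already gone.
        System.out.println("reducer data present: "
            + fs.contains("/warehouse/TABLE_B/_temporary/attempt_r0"));
    }
}
```

The fix direction implied by the description is for abort to scope its deletions to the failed attempt's own output directories only.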
[jira] [Updated] (HIVE-13756) Map failure attempts to delete reducer _temporary directory on multi-query pig query
[ https://issues.apache.org/jira/browse/HIVE-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Drome updated HIVE-13756:
-------------------------------
    Target Version/s: 2.0.0, 1.2.1  (was: 1.2.1, 2.0.0)
              Status: Open  (was: Patch Available)

> Map failure attempts to delete reducer _temporary directory on multi-query pig query
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-13756
>                 URL: https://issues.apache.org/jira/browse/HIVE-13756
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 2.0.0, 1.2.1
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: HIVE-13756-branch-1.patch, HIVE-13756.1-branch-1.patch, HIVE-13756.1.patch, HIVE-13756.patch
>
[jira] [Commented] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345045#comment-15345045 ]

Pengcheng Xiong commented on HIVE-14070:
----------------------------------------

I also want to remove PerfLogger.DRIVER_RUN and use the start time of PerfLogger.COMPILE instead.

> hive.tez.exec.print.summary=true returns wrong results on HS2
> -------------------------------------------------------------
>
>                 Key: HIVE-14070
>                 URL: https://issues.apache.org/jira/browse/HIVE-14070
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-14070.01.patch, HIVE-14070.02.patch
>
> On master, we have
> {code}
> Query Execution Summary
> ----------------------------------------------
>   OPERATION                            DURATION
> ----------------------------------------------
>   Compile Query                 -1466208820.74s
>   Prepare Plan                            0.00s
>   Submit Plan                    1466208825.50s
>   Start DAG                               0.26s
>   Run DAG                                 4.39s
> ----------------------------------------------
>
> Task Execution Summary
> ----------------------------------------------------------------------------------
>   VERTICES   DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
>   Map 1           1014.00         1,534           11          1,500               1
>   Reducer 2         96.00            54           10              1               0
> ----------------------------------------------------------------------------------
> {code}
> Sounds like a real issue.
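Durations on the order of 1466208825 seconds are the Unix epoch timestamp itself, which is the signature of subtracting against a timestamp that was never recorded, e.g. because begin and end were logged on different thread-local PerfLogger instances under HS2. A hypothetical sketch (not Hive's actual PerfLogger code) of how a missing timestamp defaulting to 0 produces both the huge negative and the huge positive values seen in the summary above:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of how epoch-scale durations arise: if a begin or end timestamp
// is missing (recorded on a different logger instance, or never recorded),
// the lookup silently defaults to 0 and the difference is on the order of
// the epoch time itself.
public class PerfLoggerSketch {
    Map<String, Long> starts = new HashMap<>();
    Map<String, Long> ends = new HashMap<>();

    long duration(String op) {
        // Missing keys become 0 -- the root of the bogus values.
        return ends.getOrDefault(op, 0L) - starts.getOrDefault(op, 0L);
    }

    public static void main(String[] args) {
        PerfLoggerSketch log = new PerfLoggerSketch();
        long now = 1466208825_500L;       // ~epoch millis at report time
        log.starts.put("compile", now);   // end never recorded -> huge negative
        log.ends.put("submitPlan", now);  // start never recorded -> huge positive
        System.out.println("compile:    " + log.duration("compile"));
        System.out.println("submitPlan: " + log.duration("submitPlan"));
    }
}
```

Anchoring both ends of each interval to the same logger instance (e.g. using the start time of PerfLogger.COMPILE, as the comment suggests) removes the asymmetry.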
[jira] [Commented] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345024#comment-15345024 ]

Hive QA commented on HIVE-14068:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12812346/HIVE-14068.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10254 tests executed

Failed tests:
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/222/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/222/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-222/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12812346 - PreCommit-HIVE-MASTER-Build

> make more effort to find hive-site.xml
> --------------------------------------
>
>                 Key: HIVE-14068
>                 URL: https://issues.apache.org/jira/browse/HIVE-14068
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14068.01.patch, HIVE-14068.patch
>
> It pretty much doesn't make sense to run Hive without the config, so we should make more effort to find one if it's missing from the classpath, or if the classloader does not return it for some reason (e.g. the classloader ignores some permission issues; explicitly looking for the file may surface them more clearly).
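The "more effort" described in HIVE-14068 can be sketched as a two-stage lookup (a hypothetical sketch only; the class, method names, and candidate directories below are illustrative, not Hive's actual configuration code): ask the classloader first, then probe well-known locations on disk explicitly so that a permission problem surfaces as a clear error rather than a silently missing config:

```java
import java.io.File;
import java.net.URL;

// Sketch of a more persistent hive-site.xml lookup: classloader first,
// then explicit filesystem probes that distinguish "absent" from
// "present but unreadable".
public class ConfLocator {
    static String findHiveSite(ClassLoader cl, String... candidateDirs) {
        URL fromClasspath = cl.getResource("hive-site.xml");
        if (fromClasspath != null) {
            return fromClasspath.toString();
        }
        for (String dir : candidateDirs) {
            File f = new File(dir, "hive-site.xml");
            if (f.exists()) {
                if (!f.canRead()) {
                    // The explicit probe exposes the permission issue
                    // that the classloader swallowed.
                    throw new IllegalStateException(
                        "hive-site.xml found but not readable: " + f);
                }
                return f.getAbsolutePath();
            }
        }
        return null; // genuinely absent
    }

    public static void main(String[] args) {
        String conf = findHiveSite(ConfLocator.class.getClassLoader(),
            System.getenv().getOrDefault("HIVE_CONF_DIR", "/etc/hive/conf"));
        System.out.println(conf == null ? "not found" : conf);
    }
}
```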
[jira] [Commented] (HIVE-13380) Decimal should have lower precedence than double in type hierarchy
[ https://issues.apache.org/jira/browse/HIVE-13380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345027#comment-15345027 ]

Sergey Shelukhin commented on HIVE-13380:
-----------------------------------------

Created HIVE-14077 to track this.

> Decimal should have lower precedence than double in type hierarchy
> ------------------------------------------------------------------
>
>                 Key: HIVE-13380
>                 URL: https://issues.apache.org/jira/browse/HIVE-13380
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>              Labels: TODOC2.2
>             Fix For: 2.2.0
>
>         Attachments: HIVE-13380.2.patch, HIVE-13380.4.patch, HIVE-13380.5.patch, HIVE-13380.patch, decimal_filter.q
>
> Currently it's the other way round. Also, decimal should be lower than float.
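The ordering change requested in HIVE-13380 can be modeled as a simple precedence ladder in which decimal sits below float and double, so the common type of a decimal/double expression resolves to double. The ladder and method below are illustrative only, not Hive's actual type-resolution code:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the corrected numeric precedence implied by the issue:
// decimal ranks below float and double, so decimal + double -> double.
public class TypePrecedence {
    static final List<String> LADDER = Arrays.asList(
        "tinyint", "smallint", "int", "bigint", "decimal", "float", "double");

    // The common type of two numeric types is the higher of the two
    // on the ladder.
    static String commonType(String a, String b) {
        return LADDER.indexOf(a) >= LADDER.indexOf(b) ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(commonType("decimal", "double")); // double
        System.out.println(commonType("int", "decimal"));    // decimal
    }
}
```

Before the fix, the ladder placed decimal above double, so mixed decimal/double comparisons coerced toward decimal.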
[jira] [Commented] (HIVE-14055) directSql - getting the number of partitions is broken
[ https://issues.apache.org/jira/browse/HIVE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345051#comment-15345051 ]

Sergio Peña commented on HIVE-14055:
------------------------------------

Agreed, let's fix this in another jira. I took a look at the patch, and it looks good. +1. Let's wait for HiveQA to verify the patch.

> directSql - getting the number of partitions is broken
> ------------------------------------------------------
>
>                 Key: HIVE-14055
>                 URL: https://issues.apache.org/jira/browse/HIVE-14055
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14055.01.patch, HIVE-14055.02.patch, HIVE-14055.patch
>
> Noticed while looking at something else: if the filter cannot be pushed down, it just returns 0.
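The symptom ("if the filter cannot be pushed down it just returns 0") suggests the fast path should fall back to the ORM path rather than report zero matches. A hypothetical sketch of that control flow (interface and method names are illustrative, not Hive's actual MetaStoreDirectSql API):

```java
// Sketch of the fix direction: when a partition filter cannot be
// translated to direct SQL, fall back to the slower ORM path instead
// of returning 0 as if no partitions matched.
public class PartitionCounter {
    interface DirectSql { Integer countPartitions(String filter); } // null = can't push down
    interface OrmPath   { int countPartitions(String filter); }

    static int getNumPartitions(String filter, DirectSql sql, OrmPath orm) {
        Integer fast = sql.countPartitions(filter);
        // Distinguish "pushdown failed" (null) from "zero partitions" (0).
        return (fast != null) ? fast : orm.countPartitions(filter);
    }

    public static void main(String[] args) {
        DirectSql sql = f -> null; // pushdown fails for this filter
        OrmPath orm = f -> 42;     // ORM still finds the partitions
        System.out.println(getNumPartitions("ds > '2016'", sql, orm)); // prints 42
    }
}
```

The key design point is that the fast path's failure signal must be distinguishable from a legitimate zero-partition result.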