[jira] [Created] (HIVE-27059) Wrong object inspector will be created when use collect_list and disable map-side aggregation
Genmao Yu created HIVE-27059:

Summary: Wrong object inspector will be created when use collect_list and disable map-side aggregation
Key: HIVE-27059
URL: https://issues.apache.org/jira/browse/HIVE-27059
Project: Hive
Issue Type: Bug
Components: Query Planning
Affects Versions: 4.0.0-alpha-2, 3.1.3, 2.3.8
Reporter: Genmao Yu
Assignee: Genmao Yu

A query fails when it uses collect_list (or collect_set) and map-side aggregation is disabled:

{code:java}
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryMap cannot be cast to java.util.Map
  at org.apache.hadoop.hive.serde2.objectinspector.StandardMapObjectInspector.getMap(StandardMapObjectInspector.java:85)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:437)
  at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:362)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.putIntoCollection(GenericUDAFMkCollectionEvaluator.java:154)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFMkCollectionEvaluator.iterate(GenericUDAFMkCollectionEvaluator.java:120)
  at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:192)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:638)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:877)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:721)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
{code}

To reproduce this issue:

{code:sql}
create table tb1 (a int, b string, c string);
insert into tb1 values (1, "100", "101");
insert into tb1 values (1, "102", "103");
insert into tb1 values (2, "200", "201");
set hive.map.aggr=false;
select a, collect_list(map("b",b,"c",c)) as col1 from tb1 group by a;
select a, collect_set(array(b, c)) as col1 from tb1 group by a;
{code}

To work around this issue:

{code:sql}
set hive.map.aggr=true;
{code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
Re: [EXTERNAL] Re: Branch-3 backports and build stability
+1, thanks Stamatis and Laszlo for helping with the test case fixes so far.

Team, I need help fixing the following tests in Hive. I have tried different approaches but have had no luck so far:

org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver

Issue:
PREHOOK: Input: default@src
PREHOOK: Output: default@src
Failed to monitor Job[-1] with exception 'java.lang.IllegalStateException(Connection to remote Spark driver was lost)' Last known state = SENT
Failed to execute spark task, with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. RPC channel is closed.

History: Initially the tests failed with errors that I fixed in the following task: https://issues.apache.org/jira/browse/HIVE-26940

Does anyone know what the issue is here? There are 6-7 failures because of this test case.

Link to the failed test cases for the stack trace: http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3949/2/tests/

Thanks,
Aman.

From: László Bodor
Sent: Tuesday, February 7, 2023 4:46 PM
To: dev@hive.apache.org
Subject: [EXTERNAL] Re: Branch-3 backports and build stability

+1
also, if I merged something that I thought was for test stability (but instead it was a feature), excuse me :)
for reference, the whole green test initiative is tracked under this umbrella: https://issues.apache.org/jira/browse/HIVE-26836

Stamatis Zampetakis wrote (on Tue, Feb 7, 2023, 12:09):

> Hi all,
>
> The build in branch-3 is not yet green; there are ~25 test failures. It is
> a common practice that we shouldn't push changes on top of a broken build
> unless they are addressing test failures.
>
> Some people (mainly Aman Raj, Chris Nauroth, and Laszlo Bodor) have been
> working hard to stabilize the build for quite some time now. If you want to
> help out then start by reviewing, merging, and fixing things around test
> failures.
>
> It's not yet the time to bring new features, upgrades, bug fixes, etc., into
> branch-3. I would encourage committers to not approve such changes till we
> get back to a stable branch.
>
> Best,
> Stamatis
[jira] [Created] (HIVE-27058) Backport of HIVE-24316: Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
Diksha created HIVE-27058:

Summary: Backport of HIVE-24316: Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
Key: HIVE-27058
URL: https://issues.apache.org/jira/browse/HIVE-27058
Project: Hive
Issue Type: Sub-task
Reporter: Diksha
Assignee: Diksha
Re: [ANNOUNCE] New committer for Apache Hive: Laszlo Vegh
Congratulations Laszlo Vegh,
Great work on the compaction stuff!!

Thanks,
Sai.

On Tue, Feb 7, 2023 at 4:24 AM Naveen Gangam wrote:

> The Project Management Committee (PMC) for Apache Hive has invited Laszlo
> Vegh (veghlaci05) to become a committer and we are pleased
> to announce that he has accepted.
>
> Contributions from Laszlo:
>
> He has authored 25 patches. Significant contributions to stabilization of
> ACID compaction. Helped review other patches as well.
>
> https://github.com/apache/hive/pulls?q=is%3Amerged+is%3Apr+author%3Aveghlaci05
>
> Being a committer enables easier contribution to the project since there
> is no need to go via the patch submission process. This should enable
> better productivity. A PMC member helps manage and guide the direction of
> the project.
>
> Congratulations
> Hive PMC
[jira] [Created] (HIVE-27057) Revert "HIVE-19166: TestMiniLlapLocalCliDriver sysdb failure"
Aman Raj created HIVE-27057:

Summary: Revert "HIVE-19166: TestMiniLlapLocalCliDriver sysdb failure"
Key: HIVE-27057
URL: https://issues.apache.org/jira/browse/HIVE-27057
Project: Hive
Issue Type: Sub-task
Reporter: Aman Raj
Assignee: Aman Raj

The sysdb test started failing after this commit was added to branch-3, so this task reverts it. Before this commit, sysdb worked fine on branch-3.
RE: [EXTERNAL] Re: [ANNOUNCE] New committer for Apache Hive: Laszlo Vegh
Congrats Laszlo!

-Sankar

-----Original Message-----
From: Krisztian Kasa
Sent: Tuesday, February 7, 2023 6:37 PM
To: u...@hive.apache.org
Cc: dev
Subject: [EXTERNAL] Re: [ANNOUNCE] New committer for Apache Hive: Laszlo Vegh

Congratulations, Laszlo!

On Tue, Feb 7, 2023 at 2:06 PM Alessandro Solimando <alessandro.solima...@gmail.com> wrote:

> Congrats, Laszlo!
>
> Best regards,
> Alessandro
>
> On Tue, 7 Feb 2023 at 13:24, Naveen Gangam wrote:
>
>> The Project Management Committee (PMC) for Apache Hive has invited
>> Laszlo Vegh (veghlaci05) to become a committer and we are pleased to
>> announce that he has accepted.
>>
>> Contributions from Laszlo:
>>
>> He has authored 25 patches. Significant contributions to
>> stabilization of ACID compaction. Helped review other patches as well.
>>
>> https://github.com/apache/hive/pulls?q=is%3Amerged+is%3Apr+author%3Aveghlaci05
>>
>> Being a committer enables easier contribution to the project since
>> there is no need to go via the patch submission process. This should
>> enable better productivity. A PMC member helps manage and guide the
>> direction of the project.
>>
>> Congratulations
>> Hive PMC
Re: [ANNOUNCE] New committer for Apache Hive: Laszlo Vegh
Congratulations, Laszlo!

On Tue, Feb 7, 2023 at 2:06 PM Alessandro Solimando <alessandro.solima...@gmail.com> wrote:

> Congrats, Laszlo!
>
> Best regards,
> Alessandro
>
> On Tue, 7 Feb 2023 at 13:24, Naveen Gangam wrote:
>
>> The Project Management Committee (PMC) for Apache Hive has invited Laszlo
>> Vegh (veghlaci05) to become a committer and we are pleased
>> to announce that he has accepted.
>>
>> Contributions from Laszlo:
>>
>> He has authored 25 patches. Significant contributions to stabilization of
>> ACID compaction. Helped review other patches as well.
>>
>> https://github.com/apache/hive/pulls?q=is%3Amerged+is%3Apr+author%3Aveghlaci05
>>
>> Being a committer enables easier contribution to the project since there
>> is no need to go via the patch submission process. This should enable
>> better productivity. A PMC member helps manage and guide the direction of
>> the project.
>>
>> Congratulations
>> Hive PMC
Re: [ANNOUNCE] New committer for Apache Hive: Laszlo Vegh
Congrats, Laszlo!

Best regards,
Alessandro

On Tue, 7 Feb 2023 at 13:24, Naveen Gangam wrote:

> The Project Management Committee (PMC) for Apache Hive has invited Laszlo
> Vegh (veghlaci05) to become a committer and we are pleased
> to announce that he has accepted.
>
> Contributions from Laszlo:
>
> He has authored 25 patches. Significant contributions to stabilization of
> ACID compaction. Helped review other patches as well.
>
> https://github.com/apache/hive/pulls?q=is%3Amerged+is%3Apr+author%3Aveghlaci05
>
> Being a committer enables easier contribution to the project since there
> is no need to go via the patch submission process. This should enable
> better productivity. A PMC member helps manage and guide the direction of
> the project.
>
> Congratulations
> Hive PMC
[ANNOUNCE] New committer for Apache Hive: Laszlo Vegh
The Project Management Committee (PMC) for Apache Hive has invited Laszlo Vegh (veghlaci05) to become a committer and we are pleased to announce that he has accepted.

Contributions from Laszlo:

He has authored 25 patches. Significant contributions to stabilization of ACID compaction. Helped review other patches as well.

https://github.com/apache/hive/pulls?q=is%3Amerged+is%3Apr+author%3Aveghlaci05

Being a committer enables easier contribution to the project since there is no need to go via the patch submission process. This should enable better productivity. A PMC member helps manage and guide the direction of the project.

Congratulations
Hive PMC
[jira] [Created] (HIVE-27056) Ensure that MR distcp goes to the same Yarn queue as other parts of the query
László Bodor created HIVE-27056:

Summary: Ensure that MR distcp goes to the same Yarn queue as other parts of the query
Key: HIVE-27056
URL: https://issues.apache.org/jira/browse/HIVE-27056
Project: Hive
Issue Type: Improvement
Reporter: László Bodor
Re: Branch-3 backports and build stability
+1
also, if I merged something that I thought was for test stability (but instead it was a feature), excuse me :)
for reference, the whole green test initiative is tracked under this umbrella: https://issues.apache.org/jira/browse/HIVE-26836

Stamatis Zampetakis wrote (on Tue, Feb 7, 2023, 12:09):

> Hi all,
>
> The build in branch-3 is not yet green; there are ~25 test failures. It is
> a common practice that we shouldn't push changes on top of a broken build
> unless they are addressing test failures.
>
> Some people (mainly Aman Raj, Chris Nauroth, and Laszlo Bodor) have been
> working hard to stabilize the build for quite some time now. If you want to
> help out then start by reviewing, merging, and fixing things around test
> failures.
>
> It's not yet the time to bring new features, upgrades, bug fixes, etc., into
> branch-3. I would encourage committers to not approve such changes till we
> get back to a stable branch.
>
> Best,
> Stamatis
Branch-3 backports and build stability
Hi all,

The build in branch-3 is not yet green; there are ~25 test failures. It is a common practice that we shouldn't push changes on top of a broken build unless they are addressing test failures.

Some people (mainly Aman Raj, Chris Nauroth, and Laszlo Bodor) have been working hard to stabilize the build for quite some time now. If you want to help out then start by reviewing, merging, and fixing things around test failures.

It's not yet the time to bring new features, upgrades, bug fixes, etc., into branch-3. I would encourage committers to not approve such changes till we get back to a stable branch.

Best,
Stamatis
[jira] [Created] (HIVE-27055) hive-exec typos part 3
Michal Lorek created HIVE-27055:

Summary: hive-exec typos part 3
Key: HIVE-27055
URL: https://issues.apache.org/jira/browse/HIVE-27055
Project: Hive
Issue Type: Improvement
Components: Query Planning, Query Processor
Affects Versions: 4.0.0
Reporter: Michal Lorek

Multiple typos and grammar errors in the hive-exec module code and comments.
[jira] [Created] (HIVE-27054) Backport HIVE-16812 VectorizedOrcAcidRowBatchReader doesn't filter delete events
Indhumathi Muthumurugesh created HIVE-27054:

Summary: Backport HIVE-16812 VectorizedOrcAcidRowBatchReader doesn't filter delete events
Key: HIVE-27054
URL: https://issues.apache.org/jira/browse/HIVE-27054
Project: Hive
Issue Type: Bug
Reporter: Indhumathi Muthumurugesh
Fix For: 4.0.0-alpha-1
[jira] [Created] (HIVE-27053) Backport HIVE-20294 Vectorization: Fix NULL / Wrong Results issues in COALESCE / ELT
Indhumathi Muthumurugesh created HIVE-27053:

Summary: Backport HIVE-20294 Vectorization: Fix NULL / Wrong Results issues in COALESCE / ELT
Key: HIVE-27053
URL: https://issues.apache.org/jira/browse/HIVE-27053
Project: Hive
Issue Type: Bug
Reporter: Indhumathi Muthumurugesh
Fix For: 4.0.0-alpha-1
[jira] [Created] (HIVE-27052) Backport HIVE-26080: Upgrade accumulo-core to 1.10.1 in branch-3
Raghav Aggarwal created HIVE-27052:

Summary: Backport HIVE-26080: Upgrade accumulo-core to 1.10.1 in branch-3
Key: HIVE-27052
URL: https://issues.apache.org/jira/browse/HIVE-27052
Project: Hive
Issue Type: Improvement
Reporter: Raghav Aggarwal
Assignee: Raghav Aggarwal
[jira] [Created] (HIVE-27051) Backport HIVE-22132: Upgrade commons-lang3 version to 3.9
Raghav Aggarwal created HIVE-27051:

Summary: Backport HIVE-22132: Upgrade commons-lang3 version to 3.9
Key: HIVE-27051
URL: https://issues.apache.org/jira/browse/HIVE-27051
Project: Hive
Issue Type: Improvement
Reporter: Raghav Aggarwal
Assignee: Raghav Aggarwal
[jira] [Created] (HIVE-27050) Iceberg: MOR: Restrict reducer extrapolation to contain number of small files being created
Rajesh Balamohan created HIVE-27050:

Summary: Iceberg: MOR: Restrict reducer extrapolation to contain number of small files being created
Key: HIVE-27050
URL: https://issues.apache.org/jira/browse/HIVE-27050
Project: Hive
Issue Type: Improvement
Components: Iceberg integration
Reporter: Rajesh Balamohan

Scenario:
# Create a simple table in Iceberg (MOR mode), e.g. store_sales_delete_1
# Insert some data into it.
# Run an update statement as follows
## "update store_sales_delete_1 set ss_sold_time_sk=699060 where ss_sold_time_sk=69906"

Hive estimates the number of reducers as "1". But because of "hive.tez.max.partition.factor", which defaults to "2.0", it doubles the number of reducers.

To put this in perspective, it creates very small positional delete files spread across different reducers. This causes problems during reading, because all of these files have to be opened.

# When Iceberg MOR tables are involved in update/delete/merge statements, disable "hive.tez.max.partition.factor", or set it to "1.0" irrespective of the user setting.
# Have explicit logs for easier debugging; the user shouldn't be confused about why the setting is not taking effect.
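Until Hive applies such a restriction automatically, a session-level override might serve as a stopgap. The following is only a sketch: it assumes that setting "hive.tez.max.partition.factor" to "1.0" by hand leaves the estimated reducer count (1 in the scenario above) unscaled for the statement that follows, which the report does not confirm. The table and the update statement are the ones from the scenario.

{code:sql}
-- Assumption: with a factor of 1.0 the reducer estimate is not multiplied,
-- so the update should not spread its positional deletes over extra reducers.
set hive.tez.max.partition.factor=1.0;

update store_sales_delete_1 set ss_sold_time_sk=699060 where ss_sold_time_sk=69906;
{code}

The ticket's proposal goes further than this manual step: Hive itself would apply the "1.0" factor for Iceberg MOR writes regardless of the user setting, and log that it did so.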
[jira] [Created] (HIVE-27049) Iceberg: Provide current snapshot version in show-create-table
Rajesh Balamohan created HIVE-27049:

Summary: Iceberg: Provide current snapshot version in show-create-table
Key: HIVE-27049
URL: https://issues.apache.org/jira/browse/HIVE-27049
Project: Hive
Issue Type: Improvement
Components: Iceberg integration
Reporter: Rajesh Balamohan

It would be helpful to show the "current snapshot" id in the output of the "show create table" statement. This would make debugging easier. Otherwise, the user has to explicitly query the metadata or read the JSON file to get this info.
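For context, the manual metadata lookup that this improvement would make unnecessary might look like the sketch below. It rests on two assumptions that the report does not state: that the Hive build in use exposes Iceberg metadata tables through the <db>.<table>.history naming, and that the most recent history entry identifies the current snapshot. The table name default.store_sales_delete_1 is purely illustrative.

{code:sql}
-- Assumption: Iceberg metadata tables are queryable as <db>.<table>.<metadata_table>.
-- The newest row of the history table is taken to be the current snapshot.
select snapshot_id, made_current_at
from default.store_sales_delete_1.history
order by made_current_at desc
limit 1;
{code}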