[jira] [Created] (HIVE-27952) Hive fails to create SslContextFactory when KeyStore has multiple certificates
Seonggon Namgung created HIVE-27952: --- Summary: Hive fails to create SslContextFactory when KeyStore has multiple certificates Key: HIVE-27952 URL: https://issues.apache.org/jira/browse/HIVE-27952 Project: Hive Issue Type: Bug Reporter: Seonggon Namgung Assignee: Seonggon Namgung With Jetty 9.4.40, we should call new SslContextFactory.Server() instead of new SslContextFactory() to create the SslContextFactory. Otherwise we get the following error when using a KeyStore with multiple certificates in it. {code:java} Caused by: java.lang.IllegalStateException: KeyStores with multiple certificates are not supported on the base class org.eclipse.jetty.util.ssl.SslContextFactory. (Use org.eclipse.jetty.util.ssl.SslContextFactory$Server or org.eclipse.jetty.util.ssl.SslContextFactory$Client instead) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
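A sketch of the fix described above (assuming the Jetty 9.4.x API; keystore path and password arguments are hypothetical): the server-specific subclass must be constructed instead of the base class, because only the `Server`/`Client` subclasses support KeyStores holding multiple certificates.

```java
import org.eclipse.jetty.util.ssl.SslContextFactory;

public class SslSetup {
    // Sketch only (Jetty 9.4.x API assumed). The base SslContextFactory
    // rejects KeyStores holding multiple certificates; the Server subclass
    // must be used on the server side instead.
    static SslContextFactory.Server createSslContextFactory(String keyStorePath, String password) {
        // new SslContextFactory() would fail here with a multi-cert KeyStore
        SslContextFactory.Server factory = new SslContextFactory.Server();
        factory.setKeyStorePath(keyStorePath);     // hypothetical arguments
        factory.setKeyStorePassword(password);
        return factory;
    }
}
```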
[jira] [Updated] (HIVE-27950) STACK UDTF returns wrong results when # of argument is not a multiple of N
[ https://issues.apache.org/jira/browse/HIVE-27950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] okumin updated HIVE-27950: -- Status: Patch Available (was: Open) > STACK UDTF returns wrong results when # of argument is not a multiple of N > -- > > Key: HIVE-27950 > URL: https://issues.apache.org/jira/browse/HIVE-27950 > Project: Hive > Issue Type: Bug >Reporter: okumin >Assignee: okumin >Priority: Major > Labels: pull-request-available > > GenericUDTFStack nullifies the wrong cell when the number of values is > not a multiple of N. In the following case, the `col2` column of the last row should > be `NULL`, but `col1` is NULL instead. > {code:java} > 0: jdbc:hive2://hive-hiveserver2:1/defaul> select stack(2, 'a', 'b', 'c', > 'd', 'e'); > +---+---+---+ > | col0 | col1 | col2 | > +---+---+---+ > | a | b | c | > | d | NULL | c | > +---+---+---+{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
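The intended padding behavior can be sketched independently of Hive (a simplified model of stack(), not the GenericUDTFStack code itself): k values are laid out row-major into n rows of ceil(k/n) columns, and only the trailing cells past the k-th value become NULL.

```java
import java.util.ArrayList;
import java.util.List;

public class StackDemo {
    // Distributes the values into n rows of ceil(k/n) columns,
    // padding only the tail of the last row with nulls.
    static List<List<String>> stack(int n, String... values) {
        int cols = (values.length + n - 1) / n;
        List<List<String>> rows = new ArrayList<>();
        for (int r = 0; r < n; r++) {
            List<String> row = new ArrayList<>();
            for (int c = 0; c < cols; c++) {
                int i = r * cols + c;
                row.add(i < values.length ? values[i] : null); // pad missing cells with NULL
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // stack(2, 'a','b','c','d','e') should yield [a, b, c] and [d, e, null]
        System.out.println(stack(2, "a", "b", "c", "d", "e"));
    }
}
```

Under this model the last row is `d, e, NULL`; the bug report shows the NULL landing in `col1` instead.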
[jira] [Updated] (HIVE-21213) Acid table bootstrap replication needs to handle directory created by compaction with txn id
[ https://issues.apache.org/jira/browse/HIVE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HIVE-21213: Release Note: (was: Merged. Thanks.) > Acid table bootstrap replication needs to handle directory created by > compaction with txn id > > > Key: HIVE-21213 > URL: https://issues.apache.org/jira/browse/HIVE-21213 > Project: Hive > Issue Type: Bug > Components: Hive, HiveServer2, repl >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-21213.01.patch, HIVE-21213.02.patch, > HIVE-21213.03.patch, HIVE-21213.04.patch, HIVE-21213.05.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > The current implementation of compaction uses the txn id in the directory > name. This is used to isolate queries from reading the directory until > compaction has finished, and to avoid the compactor marker used earlier. > In case of replication, during bootstrap, directories are copied as-is with > the same name from the source to the destination cluster. But a directory > created by compaction with a txn id cannot be copied like that, because the > txn list at the target may differ from the source: a txn id which is valid > at the source may be an aborted txn at the target. So conversion logic is > required to create a new directory with a valid txn at the target and dump > the data into the newly created directory. -- This message was sent by Atlassian Jira (v8.20.10#820010)
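The conversion step described above can be illustrated with a toy helper (the `base_<writeId>_v<visibilityTxnId>` directory layout is assumed; the helper name and the zero-padding width are hypothetical, not Hive's actual implementation): keep the write-id portion of the directory name, but re-stamp the visibility suffix with a txn id allocated at the target.

```java
public class CompactedDirRename {
    // Hypothetical helper: replace the source visibility txn suffix
    // (e.g. "base_0000005_v0000123") with a txn id valid at the target.
    static String toTargetDir(String srcDir, long targetTxnId) {
        int v = srcDir.lastIndexOf("_v");
        String prefix = (v >= 0) ? srcDir.substring(0, v) : srcDir;
        return String.format("%s_v%07d", prefix, targetTxnId);
    }

    public static void main(String[] args) {
        // Source txn 123 may be aborted at the target; re-stamp with target txn 42.
        System.out.println(toTargetDir("base_0000005_v0000123", 42L));
    }
}
```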
[jira] [Resolved] (HIVE-27446) Exception when rebuild materialized view incrementally in presence of delete operations
[ https://issues.apache.org/jira/browse/HIVE-27446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa resolved HIVE-27446. --- Fix Version/s: 4.0.0 Resolution: Fixed Merged to master. Thanks [~lvegh] for review. > Exception when rebuild materialized view incrementally in presence of delete > operations > --- > > Key: HIVE-27446 > URL: https://issues.apache.org/jira/browse/HIVE-27446 > Project: Hive > Issue Type: Bug > Components: CBO, Materialized views >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > {code} > create table cmv_basetable_n6 (a int, b varchar(256), c decimal(10,2), d int) > stored as orc TBLPROPERTIES ('transactional'='true'); > insert into cmv_basetable_n6 values > (1, 'alfred', 10.30, 2), > (2, 'bob', 3.14, 3), > (2, 'bonnie', 172342.2, 3), > (3, 'calvin', 978.76, 3), > (3, 'charlie', 9.8, 1); > create table cmv_basetable_2_n3 (a int, b varchar(256), c decimal(10,2), d > int) stored as orc TBLPROPERTIES ('transactional'='true'); > insert into cmv_basetable_2_n3 values > (1, 'alfred', 10.30, 2), > (3, 'calvin', 978.76, 3); > CREATE MATERIALIZED VIEW cmv_mat_view_n6 > TBLPROPERTIES ('transactional'='true') AS > SELECT cmv_basetable_n6.a, cmv_basetable_2_n3.c > FROM cmv_basetable_n6 JOIN cmv_basetable_2_n3 ON (cmv_basetable_n6.a = > cmv_basetable_2_n3.a) > WHERE cmv_basetable_2_n3.c > 10.0; > DELETE from cmv_basetable_2_n3 WHERE a=1; > ALTER MATERIALIZED VIEW cmv_mat_view_n6 REBUILD; > DELETE FROM cmv_basetable_n6 WHERE a=1; > ALTER MATERIALIZED VIEW cmv_mat_view_n6 REBUILD; > {code} > The second rebuild fails > {code} > org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, > vertexName=Reducer 3, vertexId=vertex_1686925588164_0001_7_06, > diagnostics=[Task failed, taskId=task_1686925588164_0001_7_06_00, > diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( > failure ) : > 
attempt_1686925588164_0001_7_06_00_0:java.lang.RuntimeException: > java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: > Hive Runtime Error while processing row > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:313) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:293) > ... 
15 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:387) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:303) > ... 17 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.get(WritableIntObjectInspector.java:36) > at >
[jira] [Resolved] (HIVE-27803) Bump org.apache.avro:avro from 1.11.1 to 1.11.3
[ https://issues.apache.org/jira/browse/HIVE-27803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HIVE-27803. - Fix Version/s: 4.0.0 Resolution: Fixed > Bump org.apache.avro:avro from 1.11.1 to 1.11.3 > --- > > Key: HIVE-27803 > URL: https://issues.apache.org/jira/browse/HIVE-27803 > Project: Hive > Issue Type: Improvement >Reporter: Ayush Saxena >Priority: Major > Fix For: 4.0.0 > > > PR from *[dependabot|https://github.com/apps/dependabot]* > https://github.com/apache/hive/pull/4764 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27803) Bump org.apache.avro:avro from 1.11.1 to 1.11.3
[ https://issues.apache.org/jira/browse/HIVE-27803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795887#comment-17795887 ] Ayush Saxena commented on HIVE-27803: - Resolved Thanx Akshat for reminding!!! > Bump org.apache.avro:avro from 1.11.1 to 1.11.3 > --- > > Key: HIVE-27803 > URL: https://issues.apache.org/jira/browse/HIVE-27803 > Project: Hive > Issue Type: Improvement >Reporter: Ayush Saxena >Priority: Major > Fix For: 4.0.0 > > > PR from *[dependabot|https://github.com/apps/dependabot]* > https://github.com/apache/hive/pull/4764 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27672) Iceberg: Truncate partition support
[ https://issues.apache.org/jira/browse/HIVE-27672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HIVE-27672: Issue Type: Improvement (was: New Feature) > Iceberg: Truncate partition support > --- > > Key: HIVE-27672 > URL: https://issues.apache.org/jira/browse/HIVE-27672 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Support the following truncate operations on a partition level - > {code:java} > TRUNCATE TABLE tableName PARTITION (partCol1 = partValue1, partCol2 = > partValue2);{code} > Truncate is not supported for partition transforms. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27627) Iceberg: Insert into/overwrite partition support
[ https://issues.apache.org/jira/browse/HIVE-27627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HIVE-27627: Issue Type: Improvement (was: New Feature) > Iceberg: Insert into/overwrite partition support > > > Key: HIVE-27627 > URL: https://issues.apache.org/jira/browse/HIVE-27627 > Project: Hive > Issue Type: Improvement >Reporter: Sourabh Badhya >Assignee: Sourabh Badhya >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Support inserting data in the following query types - > Inserting data via static partition - > {code:java} > INSERT INTO|OVERWRITE TABLE tableName PARTITION(pCol = pColValue) VALUES > (...); > INSERT INTO|OVERWRITE TABLE tableName PARTITION(pCol = pColValue) SELECT > query;{code} > Inserting data via dynamic partitioning - > {code:java} > INSERT INTO|OVERWRITE TABLE tableName PARTITION(pCol) VALUES (...); > INSERT INTO|OVERWRITE TABLE tableName PARTITION(pCol) SELECT query; {code} > Inserting data via static and dynamic partitioning with static partitioning > coming at the beginning - > {code:java} > INSERT INTO|OVERWRITE TABLE tableName PARTITION(pCol1 = pColValue, pCol2) > VALUES (...); > INSERT INTO|OVERWRITE TABLE tableName PARTITION(pCol1 = pColValue, pCol2) > SELECT query;{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27324) Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value.
[ https://issues.apache.org/jira/browse/HIVE-27324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena updated HIVE-27324: Flags: (was: Important) > Hive query with NOT IN condition is giving incorrect results when the sub > query table contains the null value. > -- > > Key: HIVE-27324 > URL: https://issues.apache.org/jira/browse/HIVE-27324 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.3 >Reporter: Shobika Selvaraj >Assignee: Diksha >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: sql_queries.png > > > Hive query gives empty results when the subquery table contains a null > value. > We encountered two issues here. > 1) The query "select * from t3 where age not in (select distinct(age) age > from t1);" gives empty results when the table t1 contains a null value. > Disabling CBO didn't help here. > 2) Let's consider that table t3 has a null value and table t1 doesn't have > any null values. Now if we run the above query, it returns the other rows > but not the null-valued row from t3. If we disable CBO, then the result > includes the null-valued row. 
> > *REPRO STEPS WITH DETAILED EXPLANATION AS BELOW:* > *FIRST ISSUE:* > Create two tables and insert data with null values as below: > --- > create table t3 (id int,name string, age int); > > insert into t3 > values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45); > > create table t1 (id int,name string, age int); > insert into t1 > values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23); > --- > > Then executed the below query: > -- > select * from t3 > where age not in (select distinct(age) age from t1); > -- > > The result should be as below: > {code:java} > ++--+-+ > | t3.id | t3.name | t3.age | > ++--+-+ > | 9 | ron | 3 | > | 10 | Sam | 22 | > | 11 | nick | 19 | > | 12 | fed | 18 | > | 13 | kong | 13 | > ++--+-+ > 5 rows selected (35.897 seconds) {code} > But when we run the above query it is giving zero records: > {code:java} > 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3 > . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select > distinct(age) age from t1); > INFO : Compiling > command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96): > select * from t3 > where age not in (select distinct(age) age from t1) > INFO : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross > product > .. > . > INFO : Completed executing > command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96); > Time taken: 10.191 seconds > INFO : OK > ++--+-+ > | t3.id | t3.name | t3.age | > ++--+-+ > ++--+-+ > No rows selected (12.17 seconds) {code} > The query works fine when we use nvl function or not null condition. 
> > So as a workaround we can use the nvl function in both the main and the sub > query as below: > {code:java} > select * from t3 where nvl(age,'-') not in (select distinct(nvl(age,'-')) age > from t1); {code} > > *SECOND ISSUE:* > While testing multiple scenarios, I found one more issue. > When the subquery table (t1) doesn't contain any null values, the query > returns results but ignores the null values of the main table (t3). > > For example: created another table t4 and inserted data without any > null values: > create table t4 (id int,name string, age int); > insert into t4 > values(1,'Sagar',23),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23); > Now I tested with the below query and it gives 5 records. The count should be > six, but it omitted the null-valued row of table t3: > {code:java} > 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3 > . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select > distinct(age) age from t4); > INFO : Compiling > command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c): > select * from t3 > where age not in (select distinct(age) age from t4) > INFO : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross > product > .. > .. > INFO : Completed executing > command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c); > Time taken: 17.724
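The observed behavior ties into SQL three-valued logic, which also explains why the nvl workaround helps: when the NOT IN list contains a NULL, every non-matching comparison evaluates to UNKNOWN rather than TRUE, and UNKNOWN rows are filtered out; likewise, a NULL on the left side always yields UNKNOWN. A self-contained sketch of that evaluation rule (an illustrative model, not Hive code):

```java
public class NotInDemo {
    // Models SQL's "x NOT IN (set)" under three-valued logic:
    //   FALSE   if x matches an element,
    //   UNKNOWN (returned as Java null) if x is NULL, or if there is no
    //           match but the set contains a NULL,
    //   TRUE    only when there is no match and no NULLs in the set.
    static Boolean notIn(Integer x, Integer[] set) {
        if (x == null) return null;                 // NULL NOT IN (...) is UNKNOWN
        boolean sawNull = false;
        for (Integer v : set) {
            if (v == null) { sawNull = true; continue; }
            if (v.equals(x)) return false;          // definite match
        }
        return sawNull ? null : true;               // UNKNOWN if set had a NULL
    }

    public static void main(String[] args) {
        // age = 3 against t1's ages {23, NULL, 45}: UNKNOWN, so the row is filtered.
        System.out.println(notIn(3, new Integer[]{23, null, 45}));
        // Same row against t4's NULL-free ages {23, 45}: TRUE, so the row is kept.
        System.out.println(notIn(3, new Integer[]{23, 45}));
    }
}
```

Wrapping both sides in nvl removes the NULLs before the comparison, so no UNKNOWN results can occur.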
[jira] [Updated] (HIVE-27800) Metastore: Make database installation tests running on ARM chipset
[ https://issues.apache.org/jira/browse/HIVE-27800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akshat Mathur updated HIVE-27800: - Labels: Arm64 (was: ) > Metastore: Make database installation tests running on ARM chipset > -- > > Key: HIVE-27800 > URL: https://issues.apache.org/jira/browse/HIVE-27800 > Project: Hive > Issue Type: Task > Components: Standalone Metastore >Reporter: Zsolt Miskolczi >Priority: Major > Labels: Arm64 > > Those tests run Docker containers on the linux/x86_64 platform, and they > cannot start a Docker image at all on ARM-based processors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27803) Bump org.apache.avro:avro from 1.11.1 to 1.11.3
[ https://issues.apache.org/jira/browse/HIVE-27803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795853#comment-17795853 ] Akshat Mathur commented on HIVE-27803: -- [~ayushtkn] as the PR is merged along with the qtests [https://github.com/apache/hive/pull/4918] should we resolve this ticket? > Bump org.apache.avro:avro from 1.11.1 to 1.11.3 > --- > > Key: HIVE-27803 > URL: https://issues.apache.org/jira/browse/HIVE-27803 > Project: Hive > Issue Type: Improvement >Reporter: Ayush Saxena >Priority: Major > > PR from *[dependabot|https://github.com/apps/dependabot]* > https://github.com/apache/hive/pull/4764 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27937) Clarifying comments and xml configs around tez container size
[ https://issues.apache.org/jira/browse/HIVE-27937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-27937: Summary: Clarifying comments and xml configs around tez container size (was: Clarifying comments around tez container size) > Clarifying comments and xml configs around tez container size > - > > Key: HIVE-27937 > URL: https://issues.apache.org/jira/browse/HIVE-27937 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > the comment in HiveConf about hive.tez.container.size is useless, let's > improve it: > https://github.com/apache/hive/blob/1e4f488394d19ea51766e0633a605e078d8558c3/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L2463 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27937) Clarifying comments around tez container size
[ https://issues.apache.org/jira/browse/HIVE-27937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-27937: Fix Version/s: 4.0.0 > Clarifying comments around tez container size > - > > Key: HIVE-27937 > URL: https://issues.apache.org/jira/browse/HIVE-27937 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > the comment in HiveConf about hive.tez.container.size is useless, let's > improve it: > https://github.com/apache/hive/blob/1e4f488394d19ea51766e0633a605e078d8558c3/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L2463 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-27937) Clarifying comments around tez container size
[ https://issues.apache.org/jira/browse/HIVE-27937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-27937: Description: the comment in HiveConf about hive.tez.container.size is useless, let's improve it: https://github.com/apache/hive/blob/1e4f488394d19ea51766e0633a605e078d8558c3/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L2463 was: the comment in HiveConf about hive.tez.container.size is totally useless: https://github.com/apache/hive/blob/1e4f488394d19ea51766e0633a605e078d8558c3/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L2463 > Clarifying comments around tez container size > - > > Key: HIVE-27937 > URL: https://issues.apache.org/jira/browse/HIVE-27937 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > > the comment in HiveConf about hive.tez.container.size is useless, let's > improve it: > https://github.com/apache/hive/blob/1e4f488394d19ea51766e0633a605e078d8558c3/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L2463 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27924) Incremental rebuild goes wrong when inserts and deletes overlap between the source tables
[ https://issues.apache.org/jira/browse/HIVE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17795740#comment-17795740 ] Krisztian Kasa commented on HIVE-27924: --- A draft patch has been created to address the issue when the MV definition has an aggregate. I'm working on the part which handles the non-aggregate case. > Incremental rebuild goes wrong when inserts and deletes overlap between the > source tables > - > > Key: HIVE-27924 > URL: https://issues.apache.org/jira/browse/HIVE-27924 > Project: Hive > Issue Type: Bug > Components: Materialized views >Affects Versions: 4.0.0-beta-1 > Environment: * Docker version : 19.03.6 > * Hive version : 4.0.0-beta-1 > * Driver version : Hive JDBC (4.0.0-beta-1) > * Beeline version : 4.0.0-beta-1 >Reporter: Wenhao Li >Assignee: Krisztian Kasa >Priority: Critical > Labels: bug, hive, hive-4.0.0-must, known_issue, > materializedviews, pull-request-available > Attachments: 截图.PNG, 截图1.PNG, 截图2.PNG, 截图3.PNG, 截图4.PNG, 截图5.PNG, > 截图6.PNG, 截图7.PNG, 截图8.PNG, 截图9.PNG > > > h1. Summary > The incremental rebuild plan and execution output are incorrect when one side > of the table join has inserted/deleted join keys that the other side has > deleted/inserted (note the order). > The argument is that tuples that have never been present simultaneously > should not interact with one another, i.e., one's inserts should not join the > other's deletes. > h1. Related Test Case > The bug was discovered while reproducing the test case: > ??hive/ql/src/test/queries/clientpositive/materialized_view_create_rewrite_5.q?? > h1. 
Steps to Reproduce the Issue > # Configurations: > {code:sql} > SET hive.vectorized.execution.enabled=false; > set hive.support.concurrency=true; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > set hive.strict.checks.cartesian.product=false; > set hive.materializedview.rewriting=true;{code} > # > {code:sql} > create table cmv_basetable_n6 (a int, b varchar(256), c decimal(10,2), d int) > stored as orc TBLPROPERTIES ('transactional'='true'); {code} > # > {code:sql} > insert into cmv_basetable_n6 values > (1, 'alfred', 10.30, 2), > (1, 'charlie', 20.30, 2); {code} > # > {code:sql} > create table cmv_basetable_2_n3 (a int, b varchar(256), c decimal(10,2), d > int) stored as orc TBLPROPERTIES ('transactional'='true'); {code} > # > {code:sql} > insert into cmv_basetable_2_n3 values > (1, 'bob', 30.30, 2), > (1, 'bonnie', 40.30, 2);{code} > # > {code:sql} > CREATE MATERIALIZED VIEW cmv_mat_view_n6 TBLPROPERTIES > ('transactional'='true') AS > SELECT cmv_basetable_n6.a, cmv_basetable_2_n3.c > FROM cmv_basetable_n6 JOIN cmv_basetable_2_n3 ON (cmv_basetable_n6.a = > cmv_basetable_2_n3.a) > WHERE cmv_basetable_2_n3.c > 10.0;{code} > # > {code:sql} > show tables; {code} > !截图.PNG! > # Select tuples, including deletion and with VirtualColumn's, from the MV > and source tables. We see that the MV is correctly built upon creation: > {code:sql} > SELECT ROW__IS__DELETED, ROW__ID, * FROM > cmv_mat_view_n6('acid.fetch.deleted.rows'='true');{code} > !截图1.PNG! > # > {code:sql} > SELECT ROW__IS__DELETED, ROW__ID, * FROM > cmv_basetable_n6('acid.fetch.deleted.rows'='true'); {code} > !截图2.PNG! > # > {code:sql} > SELECT ROW__IS__DELETED, ROW__ID, * FROM > cmv_basetable_2_n3('acid.fetch.deleted.rows'='true'); {code} > !截图3.PNG! 
> # Now make an insert to the LHS and a delete to the RHS source table: > {code:sql} > insert into cmv_basetable_n6 values > (1, 'kevin', 50.30, 2); > DELETE FROM cmv_basetable_2_n3 WHERE b = 'bonnie';{code} > # Select again to see what happened: > {code:sql} > SELECT ROW__IS__DELETED, ROW__ID, * FROM > cmv_basetable_n6('acid.fetch.deleted.rows'='true'); {code} > !截图4.PNG! > # > {code:sql} > SELECT ROW__IS__DELETED, ROW__ID, * FROM > cmv_basetable_2_n3('acid.fetch.deleted.rows'='true'); {code} > !截图5.PNG! > # Use {{EXPLAIN CBO}} to produce the incremental rebuild plan for the MV, > which is incorrect already: > {code:sql} > EXPLAIN CBO > ALTER MATERIALIZED VIEW cmv_mat_view_n6 REBUILD; {code} > !截图6.PNG! > # Rebuild MV and see (incorrect) results: > {code:sql} > ALTER MATERIALIZED VIEW cmv_mat_view_n6 REBUILD; > SELECT ROW__IS__DELETED, ROW__ID, * FROM > cmv_mat_view_n6('acid.fetch.deleted.rows'='true');{code} > !截图7.PNG! > # Run MV definition directly, which outputs incorrect results because the MV > is enabled for MV-based query rewrite, i.e., the following query will output > what's in the MV for the time being: > {code:sql} > SELECT cmv_basetable_n6.a, cmv_basetable_2_n3.c > FROM cmv_basetable_n6 JOIN cmv_basetable_2_n3 ON (cmv_basetable_n6.a = > cmv_basetable_2_n3.a) > WHERE cmv_basetable_2_n3.c > 10.0;
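The invariant argued in the summary — tuples that never coexisted must not join — can be sketched with visibility intervals (a toy model; the interval tagging is illustrative, not Hive's internal delta representation): each delta row is tagged with the txn range during which it was visible, and two rows may only participate in the incremental join if those ranges overlap.

```java
public class OverlapJoinDemo {
    // Toy model: a row is visible over [from, to] in txn order
    // (to = Long.MAX_VALUE if the row is still live). Two rows may only
    // join if their visibility intervals overlap, i.e. they coexisted.
    static boolean coexisted(long aFrom, long aTo, long bFrom, long bTo) {
        return aFrom <= bTo && bFrom <= aTo;
    }

    public static void main(String[] args) {
        // LHS row inserted at txn 10; RHS row deleted at txn 5:
        // they never coexisted, so the rebuild must not join them.
        System.out.println(coexisted(10, Long.MAX_VALUE, 1, 5));
        // Two rows live over overlapping ranges: they may join.
        System.out.println(coexisted(1, Long.MAX_VALUE, 3, Long.MAX_VALUE));
    }
}
```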
[jira] [Updated] (HIVE-27924) Incremental rebuild goes wrong when inserts and deletes overlap between the source tables
[ https://issues.apache.org/jira/browse/HIVE-27924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-27924: -- Labels: bug hive hive-4.0.0-must known_issue materializedviews pull-request-available (was: bug hive hive-4.0.0-must known_issue materializedviews) > Incremental rebuild goes wrong when inserts and deletes overlap between the > source tables > - > > Key: HIVE-27924 > URL: https://issues.apache.org/jira/browse/HIVE-27924 > Project: Hive > Issue Type: Bug > Components: Materialized views >Affects Versions: 4.0.0-beta-1 >Reporter: Wenhao Li >Assignee: Krisztian Kasa >Priority: Critical > Labels: bug, hive, hive-4.0.0-must, known_issue, > materializedviews, pull-request-available >
[jira] [Updated] (HIVE-24730) Shims classes override values from hive-site.xml and tez-site.xml silently
[ https://issues.apache.org/jira/browse/HIVE-24730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24730: Fix Version/s: 4.0.0 > Shims classes override values from hive-site.xml and tez-site.xml silently > -- > > Key: HIVE-24730 > URL: https://issues.apache.org/jira/browse/HIVE-24730 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Since HIVE-14887, > [Hadoop23Shims|https://github.com/apache/hive/blob/master/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java] > silently overrides e.g. hive.tez.container.size which is defined in > data/conf/hive/llap/hive-site.xml. This way, the developer will have no idea > about what happened after setting those values in the xml. > My proposal: > 1. don't set those values, unless they contain the default value (e.g.: -1 > for hive.tez.container.size) > 2. put an INFO level log message about the override > OR: > put a comment in hive-site.xml and tez-site.xml files that shims override it > while creating a tez mini cluster -- This message was sent by Atlassian Jira (v8.20.10#820010)
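Proposals 1 and 2 from the description can be sketched together (a hypothetical helper over a plain map, not the actual Hadoop Configuration API): override a setting only when it still holds its default value, and emit an INFO log so the override is no longer silent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

public class ConfOverrideDemo {
    private static final Logger LOG = Logger.getLogger("shims");

    // Hypothetical sketch of the proposal: only override a key that still
    // holds its default value, and log the override instead of doing it
    // silently. Values explicitly set in hive-site.xml/tez-site.xml survive.
    static void overrideIfDefault(Map<String, String> conf, String key,
                                  String defaultValue, String newValue) {
        String current = conf.getOrDefault(key, defaultValue);
        if (defaultValue.equals(current)) {
            LOG.info("Overriding " + key + ": " + current + " -> " + newValue);
            conf.put(key, newValue);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.tez.container.size", "4096"); // explicitly set in hive-site.xml
        // Default for hive.tez.container.size is -1; the explicit 4096 is kept.
        overrideIfDefault(conf, "hive.tez.container.size", "-1", "128");
        System.out.println(conf.get("hive.tez.container.size"));
    }
}
```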
[jira] [Resolved] (HIVE-24730) Shims classes override values from hive-site.xml and tez-site.xml silently
[ https://issues.apache.org/jira/browse/HIVE-24730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor resolved HIVE-24730. - Resolution: Fixed > Shims classes override values from hive-site.xml and tez-site.xml silently > -- > > Key: HIVE-24730 > URL: https://issues.apache.org/jira/browse/HIVE-24730 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Since HIVE-14887, > [Hadoop23Shims|https://github.com/apache/hive/blob/master/shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java] > silently overrides e.g. hive.tez.container.size which is defined in > data/conf/hive/llap/hive-site.xml. This way, the developer will have no idea > about what happened after setting those values in the xml. > My proposal: > 1. don't set those values, unless they contain the default value (e.g.: -1 > for hive.tez.container.size) > 2. put an INFO level log message about the override > OR: > put a comment in hive-site.xml and tez-site.xml files that shims override it > while creating a tez mini cluster -- This message was sent by Atlassian Jira (v8.20.10#820010)