[jira] [Updated] (HIVE-26265) REPL DUMP should filter out OpenXacts and unneeded CommitXact/Abort.
[ https://issues.apache.org/jira/browse/HIVE-26265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26265: -- Labels: pull-request-available (was: ) > REPL DUMP should filter out OpenXacts and unneeded CommitXact/Abort. > > > Key: HIVE-26265 > URL: https://issues.apache.org/jira/browse/HIVE-26265 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 > Reporter: francis pang > Assignee: francis pang > Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > REPL DUMP is replicating all OpenXacts, even when they come from other, > non-replicated databases. This wastes space in the dump and ends up opening > unneeded transactions during REPL LOAD. > > Add a config property for replication that filters out OpenXact events during > REPL DUMP. During REPL LOAD, the txns can be implicitly opened when the > ALLOC_WRITE_ID is processed. For CommitTxn and AbortTxn, dump only if a WRITE > ID was allocated. -- This message was sent by Atlassian Jira (v8.20.7#820007)
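If the property lands as described in the issue above, it could be enabled per dump via the standard Hive replication WITH clause; the property name comes from the issue text, and the database name here is illustrative:

```sql
-- Hypothetical invocation: enable transaction-event filtering for this dump.
-- Per the issue, hive.repl.filter.transactions is expected to default to "false".
REPL DUMP replicated_db WITH ('hive.repl.filter.transactions'='true');
```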
[jira] [Work logged] (HIVE-26265) REPL DUMP should filter out OpenXacts and unneeded CommitXact/Abort.
[ https://issues.apache.org/jira/browse/HIVE-26265?focusedWorklogId=780992&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780992 ] ASF GitHub Bot logged work on HIVE-26265: - Author: ASF GitHub Bot Created on: 14/Jun/22 05:34 Start Date: 14/Jun/22 05:34 Worklog Time Spent: 10m Work Description: cmunkey opened a new pull request, #3365: URL: https://github.com/apache/hive/pull/3365 Purpose: Currently, all Txn events OpenTxn, CommitTxn, and RollbackTxn are included in the REPL DUMP, even when the transaction does not involve the database being dumped (replicated). These events are unnecessary and result in excessive space being required for the dump, as well as extra work when these events are replayed during REPL LOAD. Solution proposed: To reduce this unnecessary space and work, this change adds the hive.repl.filter.transactions configuration property. When set to "true", extra Txn events are filtered out as follows: CommitTxn and RollbackTxn are included in the REPL DUMP only if the transaction referenced had a corresponding ALLOCATE_WRITE_ID event that was dumped. OpenTxn is never dumped; the transaction will be implicitly opened when REPL LOAD processes the ALLOC_WRITE_ID event, since ALLOC_WRITE_ID contains the open transaction ids. The default setting is "false". ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 780992) Remaining Estimate: 0h Time Spent: 10m > REPL DUMP should filter out OpenXacts and unneeded CommitXact/Abort. 
> > > Key: HIVE-26265 > URL: https://issues.apache.org/jira/browse/HIVE-26265 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 > Reporter: francis pang > Assignee: francis pang > Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > REPL DUMP is replicating all OpenXacts, even when they come from other, > non-replicated databases. This wastes space in the dump and ends up opening > unneeded transactions during REPL LOAD. > > Add a config property for replication that filters out OpenXact events during > REPL DUMP. During REPL LOAD, the txns can be implicitly opened when the > ALLOC_WRITE_ID is processed. For CommitTxn and AbortTxn, dump only if a WRITE > ID was allocated. -- This message was sent by Atlassian Jira (v8.20.7#820007)
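The dump-side rule described in the pull request above can be sketched as follows. The class and method names are illustrative only, not Hive's actual replication code:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the REPL DUMP transaction-event filter proposed in
// HIVE-26265; these are NOT Hive's real classes or event names.
class TxnEventFilter {
    // Txn ids for which an ALLOC_WRITE_ID event was dumped for the replicated db.
    private final Set<Long> txnsWithWriteId = new HashSet<>();

    void onAllocWriteId(long txnId) {
        txnsWithWriteId.add(txnId);
    }

    // True if the event should be included in the dump.
    boolean shouldDump(String eventType, long txnId) {
        switch (eventType) {
            case "OPEN_TXN":
                // Never dumped: REPL LOAD re-opens txns from ALLOC_WRITE_ID,
                // which carries the open transaction ids.
                return false;
            case "COMMIT_TXN":
            case "ABORT_TXN":
                // Dumped only if a write id was allocated for this txn.
                return txnsWithWriteId.contains(txnId);
            default:
                // Non-transactional events are unaffected by the filter.
                return true;
        }
    }
}
```

A commit or abort for a transaction that never touched the replicated database has no ALLOC_WRITE_ID on record, so it is skipped along with its open event.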
[jira] [Work logged] (HIVE-26323) Expose real exception information to the client in JdbcSerDe.java
[ https://issues.apache.org/jira/browse/HIVE-26323?focusedWorklogId=780974&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780974 ] ASF GitHub Bot logged work on HIVE-26323: - Author: ASF GitHub Bot Created on: 14/Jun/22 03:17 Start Date: 14/Jun/22 03:17 Worklog Time Spent: 10m Work Description: zhangbutao opened a new pull request, #3364: URL: https://github.com/apache/hive/pull/3364 ### What changes were proposed in this pull request? Throw the real exception message to the client in method _initialize_ of JdbcSerDe.java ### Why are the changes needed? Display a friendly exception message to the client user. Please see the details in: https://issues.apache.org/jira/browse/HIVE-26323 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Local test Issue Time Tracking --- Worklog Id: (was: 780974) Remaining Estimate: 0h Time Spent: 10m > Expose real exception information to the client in JdbcSerDe.java > - > > Key: HIVE-26323 > URL: https://issues.apache.org/jira/browse/HIVE-26323 > Project: Hive > Issue Type: Bug > Components: JDBC storage handler > Affects Versions: 4.0.0-alpha-2 > Reporter: zhangbutao > Assignee: zhangbutao > Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Method *_initialize_* in JdbcSerDe.java always returns the same exception > message to the client no matter what problem happens. > {code:java} > } catch (Exception e) { > throw new SerDeException("Caught exception while initializing the > SqlSerDe", e); > } {code} > We should expose the real exception message to the client. > This is a regression from HIVE-24560. > > Step to repro: > 1. 
create a jdbc table using an incorrect mysql passwd or an incorrect mysql > host: > {code:java} > CREATE EXTERNAL TABLE jdbc_testtbl > ( > id bigint > ) > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' > TBLPROPERTIES ( > "hive.sql.database.type" = "MYSQL", > "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver", > "hive.sql.jdbc.url" = "jdbc:mysql://localhost:3306/testdb", > "hive.sql.dbcp.username" = "root", > "hive.sql.dbcp.password" = "password", > "hive.sql.table" = "mysqltbl", > "hive.sql.dbcp.maxActive" = "1" > ); {code} > 2. the beeline client always displays the same exception message whether the > mysql passwd or the mysql host is incorrect: > {code:java} > INFO : Starting task [Stage-0:DDL] in serial mode > ERROR : Failed > org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: > MetaException(message:org.apache.hadoop.hive.serde2.SerDeException Caught > exception while initializing the SqlSerDe) > at > org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1343) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1348) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableNonReplaceMode(CreateTableOperation.java:141) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:99) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:84) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > 
~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:354) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:327) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:244) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:105) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:343) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:205) > ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] >
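The fix direction is to stop hiding the underlying failure behind a fixed wrapper string. A minimal sketch of the idea follows; the actual change in PR #3364 may differ, and SerDeException is stubbed locally here to keep the example self-contained:

```java
// Sketch of the message-visibility fix discussed in HIVE-26323. The real
// SerDeException lives in org.apache.hadoop.hive.serde2; it is stubbed here.
class SqlSerDeInitExample {
    static class SerDeException extends Exception {
        SerDeException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Before (the regression): the fixed wrapper message is all the client
    // sees, because the metastore path flattens the chain to a single string.
    // After (one possible fix): append the cause's own message so the real
    // failure (bad password, unreachable host, ...) survives the flattening.
    static SerDeException wrap(Exception e) {
        return new SerDeException(
            "Caught exception while initializing the SqlSerDe: " + e.getMessage(), e);
    }
}
```

With this shape, a wrong password and an unreachable host produce distinguishable client-side messages even when only the top-level message string is propagated.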
[jira] [Updated] (HIVE-26323) Expose real exception information to the client in JdbcSerDe.java
[ https://issues.apache.org/jira/browse/HIVE-26323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26323: -- Labels: pull-request-available (was: ) > Expose real exception information to the client in JdbcSerDe.java > - > > Key: HIVE-26323 > URL: https://issues.apache.org/jira/browse/HIVE-26323 > Project: Hive > Issue Type: Bug > Components: JDBC storage handler > Affects Versions: 4.0.0-alpha-2 > Reporter: zhangbutao > Assignee: zhangbutao > Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26323) Expose real exception information to the client in JdbcSerDe.java
[ https://issues.apache.org/jira/browse/HIVE-26323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangbutao updated HIVE-26323: -- Description: Method *_initialize_* in JdbcSerDe.java always returns the same exception message to the client no matter what problem happens. {code:java} } catch (Exception e) { throw new SerDeException("Caught exception while initializing the SqlSerDe", e); } {code} We should expose the real exception message to the client. This is a regression from HIVE-24560. Step to repro: 1. create a jdbc table using an incorrect mysql passwd or an incorrect mysql host: {code:java} CREATE EXTERNAL TABLE jdbc_testtbl ( id bigint ) STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' TBLPROPERTIES ( "hive.sql.database.type" = "MYSQL", "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver", "hive.sql.jdbc.url" = "jdbc:mysql://localhost:3306/testdb", "hive.sql.dbcp.username" = "root", "hive.sql.dbcp.password" = "password", "hive.sql.table" = "mysqltbl", "hive.sql.dbcp.maxActive" = "1" ); {code} 2. 
the beeline client always displays the same exception message whether the mysql passwd or the mysql host is incorrect: {code:java} INFO : Starting task [Stage-0:DDL] in serial mode ERROR : Failed org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException Caught exception while initializing the SqlSerDe) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1343) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1348) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableNonReplaceMode(CreateTableOperation.java:141) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:99) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:84) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:354) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:327) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:244) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:105) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:343) 
~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:205) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:154) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:185) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:233) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.access$500(SQLOperation.java:88) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) ~[hadoop-common-3.1.0-bc3.2.0.jar:?] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:356) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_112] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112] at
[jira] [Assigned] (HIVE-26323) Expose real exception information to the client in JdbcSerDe.java
[ https://issues.apache.org/jira/browse/HIVE-26323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangbutao reassigned HIVE-26323: - Assignee: zhangbutao > Expose real exception information to the client in JdbcSerDe.java > - > > Key: HIVE-26323 > URL: https://issues.apache.org/jira/browse/HIVE-26323 > Project: Hive > Issue Type: Bug > Components: JDBC storage handler > Affects Versions: 4.0.0-alpha-2 > Reporter: zhangbutao > Assignee: zhangbutao > Priority: Major -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26323) Expose real exception information to the client in JdbcSerDe.java
[ https://issues.apache.org/jira/browse/HIVE-26323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangbutao updated HIVE-26323: -- Description: Method *_initialize_* in JdbcSerDe.java, always return the same exception massage to the client no matter what problems happen. {code:java} } catch (Exception e) { throw new SerDeException("Caught exception while initializing the SqlSerDe", e); } {code} We should expose real execption massage to the client. This is a regression from HIVE-24560. Step to repro: 1. create a jdbc table using incorrect mysql passwd or using incorrect mysql host: {code:java} CREATE EXTERNAL TABLE jdbc_testtbl ( id bigint ) STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' TBLPROPERTIES ( "hive.sql.database.type" = "MYSQL", "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver", "hive.sql.jdbc.url" = "jdbc:mysql://localhost:3306/testdb", "hive.sql.dbcp.username" = "root", "hive.sql.dbcp.password" = "password", "hive.sql.table" = "mysqltbl", "hive.sql.dbcp.maxActive" = "1" ); {code} 2. 
The beeline client always displays the same exception message, no matter whether the MySQL password or the host is incorrect: {code:java} INFO : Starting task [Stage-0:DDL] in serial mode ERROR : Failed org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException Caught exception while initializing the SqlSerDe) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1343) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:1348) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableNonReplaceMode(CreateTableOperation.java:141) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:99) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:84) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:354) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:327) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:244) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:105) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:343) 
~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:205) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:154) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:185) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:233) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.access$500(SQLOperation.java:88) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) ~[hadoop-common-3.1.0-bc3.2.0.jar:?] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:356) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_112] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112] at
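The fix the reporter asks for amounts to surfacing the underlying message instead of the fixed string. A minimal, self-contained sketch of the idea (the `SerDeException` here is stubbed and the `wrap*` helpers are hypothetical, not Hive's actual code):

```java
public class ExceptionMessageDemo {
    static class SerDeException extends Exception {
        SerDeException(String message, Throwable cause) { super(message, cause); }
    }

    // Before: the generic message hides the root cause from the client.
    static SerDeException wrapGeneric(Exception e) {
        return new SerDeException("Caught exception while initializing the SqlSerDe", e);
    }

    // After: append the cause's message so beeline shows the real failure.
    static SerDeException wrapWithCause(Exception e) {
        return new SerDeException(
            "Caught exception while initializing the SqlSerDe: " + e.getMessage(), e);
    }

    public static void main(String[] args) {
        Exception root = new RuntimeException("Access denied for user 'root'@'localhost'");
        System.out.println(wrapGeneric(root).getMessage());
        System.out.println(wrapWithCause(root).getMessage());
    }
}
```

With a wrapper like `wrapWithCause`, a wrong password and an unreachable host produce visibly different client-side messages while the full cause chain is still preserved for the server log.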
[jira] [Work logged] (HIVE-26269) Class cast exception when vectorization is enabled for certain case when cases
[ https://issues.apache.org/jira/browse/HIVE-26269?focusedWorklogId=780970=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780970 ] ASF GitHub Bot logged work on HIVE-26269: - Author: ASF GitHub Bot Created on: 14/Jun/22 02:47 Start Date: 14/Jun/22 02:47 Worklog Time Spent: 10m Work Description: ramesh0201 commented on code in PR #3329: URL: https://github.com/apache/hive/pull/3329#discussion_r896323787 ## ql/src/test/queries/clientpositive/vector_case_when_3.q: ## @@ -0,0 +1,7 @@ +set hive.explain.user=false; +set hive.fetch.task.conversion=none; +set hive.vectorized.execution.enabled=true; +create external table test_decimal(rattag string, newclt_all decimal(15,2)) stored as orc; +insert into test_decimal values('a', '10.20'); +select sum(case when rattag='a' then newclt_all*0.3 else newclt_all end) from test_decimal; +select sum(case when rattag='Y' then newclt_all*0.3 else newclt_all end) from test_decimal; Review Comment: Yes that will be useful to see what is happening. I added. Issue Time Tracking --- Worklog Id: (was: 780970) Time Spent: 40m (was: 0.5h) > Class cast exception when vectorization is enabled for certain case when cases > -- > > Key: HIVE-26269 > URL: https://issues.apache.org/jira/browse/HIVE-26269 > Project: Hive > Issue Type: Bug >Reporter: Ramesh Kumar Thangarajan >Assignee: Ramesh Kumar Thangarajan >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Class cast exception when vectorization is enabled for certain case when cases -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-26184) COLLECT_SET with GROUP BY is very slow when some keys are highly skewed
[ https://issues.apache.org/jira/browse/HIVE-26184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553854#comment-17553854 ] okumin commented on HIVE-26184: --- Thanks a lot! > COLLECT_SET with GROUP BY is very slow when some keys are highly skewed > --- > > Key: HIVE-26184 > URL: https://issues.apache.org/jira/browse/HIVE-26184 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.8, 3.1.3 >Reporter: okumin >Assignee: okumin >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-alpha-2 > > Time Spent: 1.5h > Remaining Estimate: 0h > > I observed some reducers spend 98% of CPU time in invoking > `java.util.HashMap#clear`. > Looking the detail, I found COLLECT_SET reuses a LinkedHashSet and its > `clear` can be quite heavy when a relation has a small number of highly > skewed keys. > > To reproduce the issue, first, we will create rows with a skewed key. > {code:java} > INSERT INTO test_collect_set > SELECT '----' AS key, CAST(UUID() AS VARCHAR) > AS value > FROM table_with_many_rows > LIMIT 10;{code} > Then, we will create many non-skewed rows. > {code:java} > INSERT INTO test_collect_set > SELECT UUID() AS key, UUID() AS value > FROM table_with_many_rows > LIMIT 500;{code} > We can observe the issue when we aggregate values by `key`. > {code:java} > SELECT key, COLLECT_SET(value) FROM group_by_skew GROUP BY key{code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
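The reported hotspot can be reproduced outside Hive: `java.util.HashMap#clear` nulls the entire backing table whenever the map is non-empty, and the table never shrinks, so once one skewed key has inflated the reused `LinkedHashSet`'s capacity, every later tiny group pays a sweep of that capacity. A rough standalone illustration (timings are indicative only; this is not Hive code):

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ClearVsFresh {
    // Returns {nanos spent reusing a capacity-inflated set with clear(),
    //          nanos spent allocating a fresh set per group}.
    static long[] compare(int inflatedSize, int rounds) {
        Set<String> reused = new LinkedHashSet<>();
        for (int i = 0; i < inflatedSize; i++) {
            reused.add("v" + i); // one skewed key group grows the backing table
        }
        reused.clear();

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            reused.add("x");  // a tiny group...
            reused.clear();   // ...still sweeps the whole inflated table
        }
        long clearNs = System.nanoTime() - t0;

        long t1 = System.nanoTime();
        Set<String> fresh = null;
        for (int i = 0; i < rounds; i++) {
            fresh = new LinkedHashSet<>(); // small constant cost per group
            fresh.add("x");
        }
        long freshNs = System.nanoTime() - t1;
        if (fresh.isEmpty()) throw new IllegalStateException(); // keep 'fresh' live
        return new long[] {clearNs, freshNs};
    }

    public static void main(String[] args) {
        long[] r = compare(1_000_000, 200);
        System.out.println("reused + clear() ns: " + r[0]);
        System.out.println("fresh set per group ns: " + r[1]);
    }
}
```

The gap grows with the skew: the cost of `clear()` tracks the largest group seen so far, while a fresh allocation tracks only the current group.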
[jira] [Commented] (HIVE-24375) Cannot drop partitions when using metastore standalone
[ https://issues.apache.org/jira/browse/HIVE-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553853#comment-17553853 ] Wang Jiangkun commented on HIVE-24375: -- I have this problem too; can anyone solve it? My current workaround is to package hive-exec into the standalone metastore and set hive.metastore.expression.proxy=org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore in the standalone metastore's configuration to avoid this problem. [ashutoshc|https://github.com/ashutoshc] [sershe-apache|https://github.com/sershe-apache] > Cannot drop partitions when using metastore standalone > - > > Key: HIVE-24375 > URL: https://issues.apache.org/jira/browse/HIVE-24375 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.2 >Reporter: Alex Simenduev >Priority: Major > > I can successfully connect to metastore using beeline client. > {code:java} > beeline --fastConnect=true -u jdbc:hive2:// > {code} > Then I run the partition drop command for a table > {code:java} > ALTER TABLE work_v3 DROP PARTITION (user=992, `date`=20200806);{code} > I get an error that the partition is not found, but the partition does exist according > to show partitions. > Here is the full stacktrace. 
> > {code} > 0: jdbc:hive2://> alter table work_v3 drop partition (user=992, > `date`=20200806);0: jdbc:hive2://> alter table work_v3 drop partition > (user=992, `date`=20200806);20/11/12 09:55:00 [main]: INFO conf.HiveConf: > Using the default value passed in for log id: > 366b6abb-87d4-4758-9cfb-fa36903f69de20/11/12 09:55:00 [main]: INFO > session.SessionState: Updating thread name to > 366b6abb-87d4-4758-9cfb-fa36903f69de main20/11/12 09:55:00 > [366b6abb-87d4-4758-9cfb-fa36903f69de main]: INFO operation.OperationManager: > Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, > getHandleIdentifier()=897f778a-37c6-444d-b244-3f701268eec7]20/11/12 09:55:00 > [366b6abb-87d4-4758-9cfb-fa36903f69de main]: INFO ql.Driver: Compiling > command(queryId=root_20201112095500_e4ca0f5b-0b00-43ae-9efb-2d0a24ceed15): > alter table work_v3 drop partition (user=992, `date`=20200806)20/11/12 > 09:55:00 [366b6abb-87d4-4758-9cfb-fa36903f69de main]: INFO ql.Driver: > Concurrency mode is disabled, not creating a lock managerFAILED: > SemanticException [Error 10006]: Partition not found ((user = 992) and (date > = 20200806))20/11/12 09:55:00 [366b6abb-87d4-4758-9cfb-fa36903f69de main]: > ERROR ql.Driver: FAILED: SemanticException [Error 10006]: Partition not found > ((user = 992) and (date = > 20200806))org.apache.hadoop.hive.ql.parse.SemanticException: Partition not > found ((user = 992) and (date = 20200806)) at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.addTableDropPartsOutputs(DDLSemanticAnalyzer.java:4028) > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableDropParts(DDLSemanticAnalyzer.java:3376) > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:327) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659) at > org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826) at > 
org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773) at > org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768) at > org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197) > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:260) > at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247) > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541) > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527) > at > org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:312) > at > org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1585) > at com.sun.proxy.$Proxy28.ExecuteStatement(Unknown Source) at > org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:323) > at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:265)
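The workaround described in the comment above can be expressed as a hive-site.xml fragment for the standalone metastore. This is a sketch of the commenter's approach, not an officially recommended fix; it assumes the property name quoted in the comment and requires hive-exec on the metastore's classpath:

```xml
<!-- Workaround sketch from the comment above: point the standalone metastore
     at the ql-based partition expression proxy. Requires hive-exec (and its
     dependencies) on the metastore classpath. -->
<property>
  <name>hive.metastore.expression.proxy</name>
  <value>org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore</value>
</property>
```

The default standalone-metastore proxy cannot evaluate the partition predicate built for `DROP PARTITION`, which is why the statement fails with "Partition not found" even though `SHOW PARTITIONS` lists the partition.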
[jira] [Updated] (HIVE-26322) Upgrade gson to 2.9.0 due to CVE
[ https://issues.apache.org/jira/browse/HIVE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26322: -- Labels: pull-request-available (was: ) > Upgrade gson to 2.9.0 due to CVE > > > Key: HIVE-26322 > URL: https://issues.apache.org/jira/browse/HIVE-26322 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26322) Upgrade gson to 2.9.0 due to CVE
[ https://issues.apache.org/jira/browse/HIVE-26322?focusedWorklogId=780961=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780961 ] ASF GitHub Bot logged work on HIVE-26322: - Author: ASF GitHub Bot Created on: 14/Jun/22 01:33 Start Date: 14/Jun/22 01:33 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request, #3363: URL: https://github.com/apache/hive/pull/3363 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Issue Time Tracking --- Worklog Id: (was: 780961) Remaining Estimate: 0h Time Spent: 10m > Upgrade gson to 2.9.0 due to CVE > > > Key: HIVE-26322 > URL: https://issues.apache.org/jira/browse/HIVE-26322 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26142) Extend the hidden conf list with webui keystore pwd
[ https://issues.apache.org/jira/browse/HIVE-26142?focusedWorklogId=780954=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780954 ] ASF GitHub Bot logged work on HIVE-26142: - Author: ASF GitHub Bot Created on: 14/Jun/22 00:23 Start Date: 14/Jun/22 00:23 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on PR #3212: URL: https://github.com/apache/hive/pull/3212#issuecomment-1154576740 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. Issue Time Tracking --- Worklog Id: (was: 780954) Time Spent: 20m (was: 10m) > Extend the hidden conf list with webui keystore pwd > > > Key: HIVE-26142 > URL: https://issues.apache.org/jira/browse/HIVE-26142 > Project: Hive > Issue Type: Bug > Components: Security >Reporter: Janos Kovacs >Assignee: Janos Kovacs >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The SSL keystore configuration is separated for HS2 itself and the WebUI. The > hidden configuration list only contains server2.keystore.password but should > also contain server2.webui.keystore.password. -- This message was sent by Atlassian Jira (v8.20.7#820007)
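In HiveConf the hidden list is itself a configuration value (`hive.conf.hidden.list`), so the fix amounts to adding the WebUI keystore password key to that list. A hedged sketch of an equivalent site-level override — the exact default value of the list varies by Hive version, so treat the value below as illustrative:

```xml
<!-- Illustrative only: extend the hidden list so the WebUI keystore password
     is masked alongside the HS2 one. Check your version's default list first. -->
<property>
  <name>hive.conf.hidden.list</name>
  <value>javax.jdo.option.ConnectionPassword,hive.server2.keystore.password,hive.server2.webui.keystore.password</value>
</property>
```

Keys on this list are redacted when configuration is echoed back to clients (for example via `set` or the WebUI conf page).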
[jira] [Work logged] (HIVE-25787) Prevent duplicate paths in the fileList while adding an entry to NotifcationLog
[ https://issues.apache.org/jira/browse/HIVE-25787?focusedWorklogId=780955=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780955 ] ASF GitHub Bot logged work on HIVE-25787: - Author: ASF GitHub Bot Created on: 14/Jun/22 00:23 Start Date: 14/Jun/22 00:23 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on PR #3170: URL: https://github.com/apache/hive/pull/3170#issuecomment-1154576834 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. Issue Time Tracking --- Worklog Id: (was: 780955) Time Spent: 40m (was: 0.5h) > Prevent duplicate paths in the fileList while adding an entry to > NotifcationLog > --- > > Key: HIVE-25787 > URL: https://issues.apache.org/jira/browse/HIVE-25787 > Project: Hive > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > As of now, while adding entries to notification logs, in case of retries, > sometimes the same path gets added to the notification log entry, which > during replication leads to failures during copy. > Avoid having same path more than once for single transaction. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26320) Incorrect case evaluation for Parquet based table
[ https://issues.apache.org/jira/browse/HIVE-26320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chiran Ravani updated HIVE-26320: - Issue Type: Bug (was: Improvement) > Incorrect case evaluation for Parquet based table > - > > Key: HIVE-26320 > URL: https://issues.apache.org/jira/browse/HIVE-26320 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Query Planning >Affects Versions: 4.0.0-alpha-1 >Reporter: Chiran Ravani >Priority: Major > > Query involving case statement with two or more conditions leads to incorrect > result for tables with parquet format, The problem is not observed with ORC > or TextFile. > *Steps to reproduce*: > {code:java} > create external table case_test_parquet(kob varchar(2),enhanced_type_code > int) stored as parquet; > insert into case_test_parquet values('BB',18),('BC',18),('AB',18); > select case when ( >(kob='BB' and enhanced_type_code='18') >or (kob='BC' and enhanced_type_code='18') > ) > then 1 > else 0 > end as logic_check > from case_test_parquet; > {code} > Result: > {code} > 0 > 0 > 0 > {code} > Expected result: > {code} > 1 > 1 > 0 > {code} > The problem does not appear when setting hive.optimize.point.lookup=false. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (HIVE-26321) Upgrade commons-io to 2.11.0
[ https://issues.apache.org/jira/browse/HIVE-26321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam reassigned HIVE-26321: > Upgrade commons-io to 2.11.0 > > > Key: HIVE-26321 > URL: https://issues.apache.org/jira/browse/HIVE-26321 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Major > > Upgrade commons-io to 2.11.0 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-25879) MetaStoreDirectSql test query should not query the whole DBS table
[ https://issues.apache.org/jira/browse/HIVE-25879?focusedWorklogId=780869=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780869 ] ASF GitHub Bot logged work on HIVE-25879: - Author: ASF GitHub Bot Created on: 13/Jun/22 15:30 Start Date: 13/Jun/22 15:30 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on code in PR #3348: URL: https://github.com/apache/hive/pull/3348#discussion_r895859650 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java: ## @@ -316,17 +316,20 @@ private boolean ensureDbInit() { } private boolean runTestQuery() { +boolean doTrace = LOG.isDebugEnabled(); Transaction tx = pm.currentTransaction(); boolean doCommit = false; if (!tx.isActive()) { tx.begin(); doCommit = true; } // Run a self-test query. If it doesn't work, we will self-disable. What a PITA... -String selfTestQuery = "select \"DB_ID\" from " + DBS + ""; +String selfTestQuery = "select \"DB_ID\" from " + DBS + " WHERE \"DB_ID\"=1"; Review Comment: `where 1=0` sounds good to me...or `DB_ID=-1`; but this will be ok as well Issue Time Tracking --- Worklog Id: (was: 780869) Time Spent: 1.5h (was: 1h 20m) > MetaStoreDirectSql test query should not query the whole DBS table > -- > > Key: HIVE-25879 > URL: https://issues.apache.org/jira/browse/HIVE-25879 > Project: Hive > Issue Type: Bug >Reporter: Miklos Szurap >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > The runTestQuery() in the > org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java is using a test query > {code:java} > select "DB_ID" from "DBS"{code} > to determine whether the direct SQL can be used. > With larger deployments with many (10k+) Hive databases it would be more > efficienct to query a small table instead, for example the "VERSION" table > should always have a single row only. -- This message was sent by Atlassian Jira (v8.20.7#820007)
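The review thread converges on a predicate that can never match, so the probe still validates direct-SQL connectivity and identifier quoting without scanning any rows of DBS. A standalone sketch of the query construction (the helper name is hypothetical; the quoting mirrors the diff above):

```java
public class DirectSqlProbe {
    // Hypothetical helper mirroring the reviewer's suggestion: an always-false
    // WHERE clause means the self-test touches no DBS rows, which matters for
    // deployments with 10k+ databases.
    static String selfTestQuery(String dbsTable) {
        return "select \"DB_ID\" from " + dbsTable + " where 1=0";
    }

    public static void main(String[] args) {
        System.out.println(selfTestQuery("\"DBS\""));
    }
}
```

Either `where 1=0` or a primary-key equality like `"DB_ID"=1` keeps the probe O(1); the unfiltered `select "DB_ID" from "DBS"` is the variant the issue reports as costly.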
[jira] [Work logged] (HIVE-26184) COLLECT_SET with GROUP BY is very slow when some keys are highly skewed
[ https://issues.apache.org/jira/browse/HIVE-26184?focusedWorklogId=780865=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780865 ] ASF GitHub Bot logged work on HIVE-26184: - Author: ASF GitHub Bot Created on: 13/Jun/22 15:26 Start Date: 13/Jun/22 15:26 Worklog Time Spent: 10m Work Description: kgyrtkirk merged PR #3253: URL: https://github.com/apache/hive/pull/3253 Issue Time Tracking --- Worklog Id: (was: 780865) Time Spent: 1.5h (was: 1h 20m) > COLLECT_SET with GROUP BY is very slow when some keys are highly skewed > --- > > Key: HIVE-26184 > URL: https://issues.apache.org/jira/browse/HIVE-26184 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.8, 3.1.3 >Reporter: okumin >Assignee: okumin >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > I observed some reducers spend 98% of CPU time in invoking > `java.util.HashMap#clear`. > Looking the detail, I found COLLECT_SET reuses a LinkedHashSet and its > `clear` can be quite heavy when a relation has a small number of highly > skewed keys. > > To reproduce the issue, first, we will create rows with a skewed key. > {code:java} > INSERT INTO test_collect_set > SELECT '----' AS key, CAST(UUID() AS VARCHAR) > AS value > FROM table_with_many_rows > LIMIT 10;{code} > Then, we will create many non-skewed rows. > {code:java} > INSERT INTO test_collect_set > SELECT UUID() AS key, UUID() AS value > FROM table_with_many_rows > LIMIT 500;{code} > We can observe the issue when we aggregate values by `key`. > {code:java} > SELECT key, COLLECT_SET(value) FROM group_by_skew GROUP BY key{code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (HIVE-26184) COLLECT_SET with GROUP BY is very slow when some keys are highly skewed
[ https://issues.apache.org/jira/browse/HIVE-26184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-26184. - Fix Version/s: 4.0.0-alpha-2 Resolution: Fixed merged into master. Thank you [~okumin] ! > COLLECT_SET with GROUP BY is very slow when some keys are highly skewed > --- > > Key: HIVE-26184 > URL: https://issues.apache.org/jira/browse/HIVE-26184 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.8, 3.1.3 >Reporter: okumin >Assignee: okumin >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0-alpha-2 > > Time Spent: 1.5h > Remaining Estimate: 0h > > I observed some reducers spend 98% of CPU time in invoking > `java.util.HashMap#clear`. > Looking the detail, I found COLLECT_SET reuses a LinkedHashSet and its > `clear` can be quite heavy when a relation has a small number of highly > skewed keys. > > To reproduce the issue, first, we will create rows with a skewed key. > {code:java} > INSERT INTO test_collect_set > SELECT '----' AS key, CAST(UUID() AS VARCHAR) > AS value > FROM table_with_many_rows > LIMIT 10;{code} > Then, we will create many non-skewed rows. > {code:java} > INSERT INTO test_collect_set > SELECT UUID() AS key, UUID() AS value > FROM table_with_many_rows > LIMIT 500;{code} > We can observe the issue when we aggregate values by `key`. > {code:java} > SELECT key, COLLECT_SET(value) FROM group_by_skew GROUP BY key{code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26307) Avoid FS init in FileIO::newInputFile in vectorized Iceberg reads
[ https://issues.apache.org/jira/browse/HIVE-26307?focusedWorklogId=780861=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780861 ] ASF GitHub Bot logged work on HIVE-26307: - Author: ASF GitHub Bot Created on: 13/Jun/22 15:20 Start Date: 13/Jun/22 15:20 Worklog Time Spent: 10m Work Description: pvary merged PR #3354: URL: https://github.com/apache/hive/pull/3354 Issue Time Tracking --- Worklog Id: (was: 780861) Time Spent: 40m (was: 0.5h) > Avoid FS init in FileIO::newInputFile in vectorized Iceberg reads > - > > Key: HIVE-26307 > URL: https://issues.apache.org/jira/browse/HIVE-26307 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > With vectorized Iceberg reads we are creating {{HadoopInputFile}} objects > just to store the location of the files. If we can avoid this, then we can > improve the performance, since the {{path.getFileSystem(conf)}} calls can > become costly, especially for S3 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (HIVE-26307) Avoid FS init in FileIO::newInputFile in vectorized Iceberg reads
[ https://issues.apache.org/jira/browse/HIVE-26307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary resolved HIVE-26307. --- Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master. Thanks for the review [~szita]! > Avoid FS init in FileIO::newInputFile in vectorized Iceberg reads > - > > Key: HIVE-26307 > URL: https://issues.apache.org/jira/browse/HIVE-26307 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 40m > Remaining Estimate: 0h > > With vectorized Iceberg reads we are creating {{HadoopInputFile}} objects > just to store the location of the files. If we can avoid this, then we can > improve the performance, since the {{path.getFileSystem(conf)}} calls can > become costly, especially for S3 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26307) Avoid FS init in FileIO::newInputFile in vectorized Iceberg reads
[ https://issues.apache.org/jira/browse/HIVE-26307?focusedWorklogId=780857=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780857 ] ASF GitHub Bot logged work on HIVE-26307: - Author: ASF GitHub Bot Created on: 13/Jun/22 15:07 Start Date: 13/Jun/22 15:07 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3354: URL: https://github.com/apache/hive/pull/3354#discussion_r895836024 ## iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java: ## @@ -381,100 +400,67 @@ private CloseableIterable newAvroIterable( Avro.ReadBuilder avroReadBuilder = Avro.read(inputFile) .project(readSchema) .split(task.start(), task.length()); + Review Comment: The first check in `openVectorized` is this: ``` Preconditions.checkArgument(!task.file().format().equals(FileFormat.AVRO), "Vectorized execution is not yet supported for Iceberg avro tables. " + "Please turn off vectorization and retry the query."); ``` Issue Time Tracking --- Worklog Id: (was: 780857) Time Spent: 0.5h (was: 20m) > Avoid FS init in FileIO::newInputFile in vectorized Iceberg reads > - > > Key: HIVE-26307 > URL: https://issues.apache.org/jira/browse/HIVE-26307 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > With vectorized Iceberg reads we are creating {{HadoopInputFile}} objects > just to store the location of the files. If we can avoid this, then we can > improve the performance, since the {{path.getFileSystem(conf)}} calls can > become costly, especially for S3 -- This message was sent by Atlassian Jira (v8.20.7#820007)
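The optimization described above can be sketched generically: record the file's location eagerly but defer the expensive `path.getFileSystem(conf)` work until the bytes are actually read, since vectorized reads often only need the location string. This is an illustrative pattern under assumed names, not the Iceberg `FileIO` API or the actual patch:

```java
import java.util.function.Supplier;

// Illustrative lazy-initialization pattern (names are hypothetical): the handle
// knows its location cheaply; filesystem setup happens only on open().
public class LazyInputFile {
    private final String location;
    private final Supplier<byte[]> opener; // stands in for getFileSystem(conf) + open

    LazyInputFile(String location, Supplier<byte[]> opener) {
        this.location = location;
        this.opener = opener;
    }

    String location() { return location; }  // cheap: no filesystem call

    byte[] open() { return opener.get(); }  // costly initialization deferred to here

    public static void main(String[] args) {
        final boolean[] initialized = {false};
        LazyInputFile f = new LazyInputFile("s3://bucket/data/file.parquet", () -> {
            initialized[0] = true; // pretend this is the S3 FileSystem init
            return new byte[0];
        });
        System.out.println(f.location() + " initialized=" + initialized[0]);
        f.open();
        System.out.println("after open: initialized=" + initialized[0]);
    }
}
```

For object stores like S3, skipping the filesystem initialization on the location-only path is exactly where the saving comes from.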
[jira] [Work logged] (HIVE-25733) Add check-spelling CI action
[ https://issues.apache.org/jira/browse/HIVE-25733?focusedWorklogId=780855=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780855 ] ASF GitHub Bot logged work on HIVE-25733: - Author: ASF GitHub Bot Created on: 13/Jun/22 15:06 Start Date: 13/Jun/22 15:06 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on PR #2809: URL: https://github.com/apache/hive/pull/2809#issuecomment-1154041239 only the mysql/metastore test have failed; for which to pass - we would have to merge the master branch and run the tests once more...this PR had a clean run before - just wanted to be sure that its still good. Issue Time Tracking --- Worklog Id: (was: 780855) Time Spent: 1h 20m (was: 1h 10m) > Add check-spelling CI action > > > Key: HIVE-25733 > URL: https://issues.apache.org/jira/browse/HIVE-25733 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Reporter: Josh Soref >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Add CI to catch spelling errors. See [https://www.check-spelling.dev/] for > information. > Initially this will only check the {{serde}} directory, but the intention is > to expand its coverage as spelling errors in other directories are fixed. > Note that for this to work the action should be made a required check, > otherwise when a typo is added forks from that commit will get complaints. > If a typo is intentional, the action will provide information about how to > add it to {{expect.txt}} such that it will be accepted as an expected item > (i.e. not a typo). > To skip a file/directory entirely, add a matching entry to > {{{}excludes.txt{}}}. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-25733) Add check-spelling CI action
[ https://issues.apache.org/jira/browse/HIVE-25733?focusedWorklogId=780854=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780854 ] ASF GitHub Bot logged work on HIVE-25733: - Author: ASF GitHub Bot Created on: 13/Jun/22 15:05 Start Date: 13/Jun/22 15:05 Worklog Time Spent: 10m Work Description: kgyrtkirk merged PR #2809: URL: https://github.com/apache/hive/pull/2809 Issue Time Tracking --- Worklog Id: (was: 780854) Time Spent: 1h 10m (was: 1h) > Add check-spelling CI action > > > Key: HIVE-25733 > URL: https://issues.apache.org/jira/browse/HIVE-25733 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Reporter: Josh Soref >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Add CI to catch spelling errors. See [https://www.check-spelling.dev/] for > information. > Initially this will only check the {{serde}} directory, but the intention is > to expand its coverage as spelling errors in other directories are fixed. > Note that for this to work the action should be made a required check, > otherwise when a typo is added forks from that commit will get complaints. > If a typo is intentional, the action will provide information about how to > add it to {{expect.txt}} such that it will be accepted as an expected item > (i.e. not a typo). > To skip a file/directory entirely, add a matching entry to > {{{}excludes.txt{}}}. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-25733) Add check-spelling CI action
[ https://issues.apache.org/jira/browse/HIVE-25733?focusedWorklogId=780853=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780853 ] ASF GitHub Bot logged work on HIVE-25733: - Author: ASF GitHub Bot Created on: 13/Jun/22 15:04 Start Date: 13/Jun/22 15:04 Worklog Time Spent: 10m Work Description: jsoref commented on PR #2809: URL: https://github.com/apache/hive/pull/2809#issuecomment-1154038518 Sigh Issue Time Tracking --- Worklog Id: (was: 780853) Time Spent: 1h (was: 50m) > Add check-spelling CI action > > > Key: HIVE-25733 > URL: https://issues.apache.org/jira/browse/HIVE-25733 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Reporter: Josh Soref >Priority: Minor > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Add CI to catch spelling errors. See [https://www.check-spelling.dev/] for > information. > Initially this will only check the {{serde}} directory, but the intention is > to expand its coverage as spelling errors in other directories are fixed. > Note that for this to work the action should be made a required check, > otherwise when a typo is added forks from that commit will get complaints. > If a typo is intentional, the action will provide information about how to > add it to {{expect.txt}} such that it will be accepted as an expected item > (i.e. not a typo). > To skip a file/directory entirely, add a matching entry to > {{{}excludes.txt{}}}. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26320) Incorrect case evaluation for Parquet based table
[ https://issues.apache.org/jira/browse/HIVE-26320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chiran Ravani updated HIVE-26320: - Summary: Incorrect case evaluation for Parquet based table (was: Incorrect case evaluation for Parquet based table.) > Incorrect case evaluation for Parquet based table > - > > Key: HIVE-26320 > URL: https://issues.apache.org/jira/browse/HIVE-26320 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Query Planning >Affects Versions: 4.0.0-alpha-1 >Reporter: Chiran Ravani >Priority: Major > > Query involving case statement with two or more conditions leads to incorrect > result for tables with parquet format, The problem is not observed with ORC > or TextFile. > *Steps to reproduce*: > {code:java} > create external table case_test_parquet(kob varchar(2),enhanced_type_code > int) stored as parquet; > insert into case_test_parquet values('BB',18),('BC',18),('AB',18); > select case when ( >(kob='BB' and enhanced_type_code='18') >or (kob='BC' and enhanced_type_code='18') > ) > then 1 > else 0 > end as logic_check > from case_test_parquet; > {code} > Result: > {code} > 0 > 0 > 0 > {code} > Expected result: > {code} > 1 > 1 > 0 > {code} > The problem does not appear when setting hive.optimize.point.lookup=false. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.3
[ https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=780830=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780830 ] ASF GitHub Bot logged work on HIVE-24484: - Author: ASF GitHub Bot Created on: 13/Jun/22 13:51 Start Date: 13/Jun/22 13:51 Worklog Time Spent: 10m Work Description: ayushtkn commented on code in PR #3279: URL: https://github.com/apache/hive/pull/3279#discussion_r895742321 ## ql/src/test/results/clientpositive/llap/acid_table_directories_test.q.out: ## @@ -163,13 +170,6 @@ POSTHOOK: Input: default@acidparttbl@p=200 ### ACID DELTA DIR ### ### ACID DELTA DIR ### ### ACID DELTA DIR ### - A masked pattern was here Review Comment: [HADOOP-12502](https://issues.apache.org/jira/browse/HADOOP-12502) is the culprit: the listing used to be sorted and now it isn't, so the `ls -R` output will change intermittently. We can't keep this check, hence it is disabled. Issue Time Tracking --- Worklog Id: (was: 780830) Time Spent: 12h 23m (was: 12h 13m) > Upgrade Hadoop to 3.3.3 > --- > > Key: HIVE-24484 > URL: https://issues.apache.org/jira/browse/HIVE-24484 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 12h 23m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
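Per the review comment above, the test was disabled because directory listings are no longer returned in sorted order. A minimal sketch of the alternative (sorting the listing explicitly before comparing against expected output) — illustrative only, using plain strings rather than Hadoop's `FileStatus`:

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortedListing {
    // Stand-in for a FileSystem.listStatus result: since the listing order is
    // no longer guaranteed, a test that diffs the output must sort it first.
    static String[] deterministicListing(String[] rawListing) {
        String[] sorted = rawListing.clone();
        Arrays.sort(sorted, Comparator.naturalOrder());
        return sorted;
    }

    public static void main(String[] args) {
        String[] listing = {"delta_2", "delta_1", "base_1"};
        // prints base_1,delta_1,delta_2
        System.out.println(String.join(",", deterministicListing(listing)));
    }
}
```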
[jira] [Work logged] (HIVE-26307) Avoid FS init in FileIO::newInputFile in vectorized Iceberg reads
[ https://issues.apache.org/jira/browse/HIVE-26307?focusedWorklogId=780822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780822 ] ASF GitHub Bot logged work on HIVE-26307: - Author: ASF GitHub Bot Created on: 13/Jun/22 13:37 Start Date: 13/Jun/22 13:37 Worklog Time Spent: 10m Work Description: szlta commented on code in PR #3354: URL: https://github.com/apache/hive/pull/3354#discussion_r895723865 ## iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java: ## @@ -381,100 +400,67 @@ private CloseableIterable newAvroIterable( Avro.ReadBuilder avroReadBuilder = Avro.read(inputFile) .project(readSchema) .split(task.start(), task.length()); + Review Comment: nit: Should we keep the original exception ("Vectorized execution is not yet supported for Iceberg avro...") here in case the inMemoryDataModel==HIVE ? In theory HiveIcebergStorageHandler should prevent such combination, just wanted to know if this was a conscious decision. Issue Time Tracking --- Worklog Id: (was: 780822) Time Spent: 20m (was: 10m) > Avoid FS init in FileIO::newInputFile in vectorized Iceberg reads > - > > Key: HIVE-26307 > URL: https://issues.apache.org/jira/browse/HIVE-26307 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > With vectorized Iceberg reads we are creating {{HadoopInputFile}} objects > just to store the location of the files. If we can avoid this, then we can > improve the performance, since the {{path.getFileSystem(conf)}} calls can > become costly, especially for S3 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553584#comment-17553584 ] Chiran Ravani commented on HIVE-25980: -- [~ste...@apache.org] Thank you for sharing the idea. I can open another Jira to track this improvement, the goal for this one was to reduce unnecessary FS calls which we were making as part of HiveMetaStoreChecker.checkTable. > Reduce fs calls in HiveMetaStoreChecker.checkTable > -- > > Key: HIVE-25980 > URL: https://issues.apache.org/jira/browse/HIVE-25980 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Major > Labels: pull-request-available > Time Spent: 8h 20m > Remaining Estimate: 0h > > MSCK Repair table for high partition table can perform slow on Cloud Storage > such as S3, one of the case we found where slowness was observed in > HiveMetaStoreChecker.checkTable. > {code:java} > "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 > tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at > sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464) > at > sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) > at > sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341) > at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) > at > sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > 
com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) > at > com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) > at > com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) > at > com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) > at > com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) > at > com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) > at > com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) > at > com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) > at > com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1331) > at > 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686) > at >
[jira] [Commented] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553577#comment-17553577 ] Steve Loughran commented on HIVE-25980: --- ok. I'd still recommend the method {{listStatusIterator}} for paginated retrieval of wide listings from hdfs, s3a and abfs, especially if you can do any useful work during that iteration > Reduce fs calls in HiveMetaStoreChecker.checkTable > -- > > Key: HIVE-25980 > URL: https://issues.apache.org/jira/browse/HIVE-25980 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Major > Labels: pull-request-available > Time Spent: 8h 20m > Remaining Estimate: 0h > > MSCK Repair table for high partition table can perform slow on Cloud Storage > such as S3, one of the case we found where slowness was observed in > HiveMetaStoreChecker.checkTable. > {code:java} > "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 > tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at > sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464) > at > sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) > at > sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341) > at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) > at > sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > 
com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) > at > com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) > at > com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) > at > com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) > at > com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) > at > com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) > at > com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) > at > com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) > at > com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1331) > at > 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686) > at > com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
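Steve's recommendation above refers to Hadoop's `FileSystem.listStatusIterator(Path)`, which returns a `RemoteIterator<FileStatus>` and retrieves wide listings page by page on stores like S3A and ABFS, letting the caller do useful work while the next page is fetched. A hedged sketch of the pattern, using a plain `Iterator<String>` as a stand-in so it runs without Hadoop on the classpath:

```java
import java.util.Iterator;
import java.util.List;

public class PagedListing {
    // Process entries during iteration instead of materializing the whole
    // directory listing up front (as a bulk listStatus(path) call would).
    static int countPartitionDirs(Iterator<String> listing) {
        int dirs = 0;
        while (listing.hasNext()) {
            String entry = listing.next();   // lazily paged on S3A/ABFS
            if (entry.contains("=")) {       // e.g. "p=200" partition dirs
                dirs++;
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        Iterator<String> it = List.of("p=100", "p=200", "_SUCCESS").iterator();
        System.out.println(countPartitionDirs(it)); // prints 2
    }
}
```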
[jira] [Commented] (HIVE-26311) Incorrect content of array when IN operator is in the filter
[ https://issues.apache.org/jira/browse/HIVE-26311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553572#comment-17553572 ] Peter Vary commented on HIVE-26311: --- [~gaborkaszab]: Would HIVE-26250 help? > Incorrect content of array when IN operator is in the filter > > > Key: HIVE-26311 > URL: https://issues.apache.org/jira/browse/HIVE-26311 > Project: Hive > Issue Type: Bug >Reporter: Gabor Kaszab >Priority: Major > Labels: correctness > Attachments: arrays.parq > > > select id, arr1, arr2 from functional_parquet.complextypes_arrays where id % > 2 = 1 and id = 5 > {code:java} > +-+---+---+ > | id | arr1 | arr2 | > +-+---+---+ > | 5 | [10,null,12] | ["ten","eleven","twelve","thirteen"] | > +-+---+---+{code} > select id, arr1, arr2 from functional_parquet.complextypes_arrays where id % > 2 = 1 *and id in (select id from functional_parquet.alltypestiny)* and id = 5; > {code:java} > +-+-+---+ > | id | arr1 | arr2 | > +-+-+---+ > | 5 | [10,10,12] | ["ten","eleven","twelve","thirteen"] | > +-+-+---+ {code} > Note, the first (and correct) example returns 10, null and 12 as the items of > an array, while the second query for some reason shows 10 instead of the null > value. The only difference between the 2 examples is that in the second I > added an extra filter (which in fact doesn't filter out anything, as > functional_parquet.alltypestiny's ID contains numbers from zero to ten). > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26319) Iceberg integration: Perform update split early
[ https://issues.apache.org/jira/browse/HIVE-26319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-26319: -- Component/s: File Formats > Iceberg integration: Perform update split early > --- > > Key: HIVE-26319 > URL: https://issues.apache.org/jira/browse/HIVE-26319 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Extend update split early to iceberg tables like in HIVE-21160 for native > acid tables -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26319) Iceberg integration: Perform update split early
[ https://issues.apache.org/jira/browse/HIVE-26319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-26319: -- Fix Version/s: 4.0.0 > Iceberg integration: Perform update split early > --- > > Key: HIVE-26319 > URL: https://issues.apache.org/jira/browse/HIVE-26319 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Extend update split early to iceberg tables like in HIVE-21160 for native > acid tables -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26319) Iceberg integration: Perform update split early
[ https://issues.apache.org/jira/browse/HIVE-26319?focusedWorklogId=780788=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780788 ] ASF GitHub Bot logged work on HIVE-26319: - Author: ASF GitHub Bot Created on: 13/Jun/22 11:46 Start Date: 13/Jun/22 11:46 Worklog Time Spent: 10m Work Description: kasakrisz opened a new pull request, #3362: URL: https://github.com/apache/hive/pull/3362 ### What changes were proposed in this pull request? Rewrite update statements on iceberg tables to a multi-insert statement, as is done for native acid tables. When generating the rewritten statement: * Get the virtual columns from the table's storage handler in case of non-native acid tables * Include the old values in the select clause of the delete branch of the multi-insert statement. When executing the multi-insert: * Two iceberg writers are used, which produce a data delta file and a delete delta file. The results of these writers should be merged into one `FilesForCommit` if both writers run in the same task. * In case of more complex statements (e.g. partitioned and/or bucketed) more than one Tez task produces commit info, so this patch enables storing all of them. * Every `FileSinkOperator` creates its own jobConf instance, because the iceberg write operation is stored in it and differs between the instances. ### Why are the changes needed? See #2855 + Preparation for iceberg Merge implementation. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? 
``` mvn test -Dtest.output.overwrite -DskipSparkTests -Dtest=TestIcebergLlapLocalCliDriver -Dqfile=update_iceberg_partitioned_orc2.q -pl itests/qtest-iceberg -Piceberg -Pitests ``` Issue Time Tracking --- Worklog Id: (was: 780788) Remaining Estimate: 0h Time Spent: 10m > Iceberg integration: Perform update split early > --- > > Key: HIVE-26319 > URL: https://issues.apache.org/jira/browse/HIVE-26319 > Project: Hive > Issue Type: Improvement >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Extend update split early to iceberg tables like in HIVE-21160 for native > acid tables -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26319) Iceberg integration: Perform update split early
[ https://issues.apache.org/jira/browse/HIVE-26319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26319: -- Labels: pull-request-available (was: ) > Iceberg integration: Perform update split early > --- > > Key: HIVE-26319 > URL: https://issues.apache.org/jira/browse/HIVE-26319 > Project: Hive > Issue Type: Improvement >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Extend update split early to iceberg tables like in HIVE-21160 for native > acid tables -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (HIVE-26319) Iceberg integration: Perform update split early
[ https://issues.apache.org/jira/browse/HIVE-26319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa reassigned HIVE-26319: - > Iceberg integration: Perform update split early > --- > > Key: HIVE-26319 > URL: https://issues.apache.org/jira/browse/HIVE-26319 > Project: Hive > Issue Type: Improvement >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > Extend update split early to iceberg tables like in HIVE-21160 for native > acid tables -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=780770=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780770 ] ASF GitHub Bot logged work on HIVE-25980: - Author: ASF GitHub Bot Created on: 13/Jun/22 09:56 Start Date: 13/Jun/22 09:56 Worklog Time Spent: 10m Work Description: shameersss1 commented on code in PR #3053: URL: https://github.com/apache/hive/pull/3053#discussion_r895538935 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java: ## @@ -326,10 +356,26 @@ void checkTable(Table table, PartitionIterable parts, byte[] filterExp, CheckRes CheckResult.PartitionResult prFromMetastore = new CheckResult.PartitionResult(); prFromMetastore.setPartitionName(getPartitionName(table, partition)); prFromMetastore.setTableName(partition.getTableName()); - if (!fs.exists(partPath)) { -result.getPartitionsNotOnFs().add(prFromMetastore); - } else { + if (allPartDirs.contains(partPath)) { result.getCorrectPartitions().add(prFromMetastore); +allPartDirs.remove(partPath); + } else { +// There can be edge case where user can define partition directory outside of table directory +// to avoid eviction of such partitions +// we check existence of partition path which are not in table directory +// and add to result for getPartitionsNotOnFs. +if (!partPath.toString().contains(tablePathStr)) { Review Comment: can't we have a simple check for this? 
like `if (!fs.exists(partPath)) { result.getPartitionsNotOnFs().add(prFromMetastore); } else { result.getCorrectPartitions().add(prFromMetastore); }` Issue Time Tracking --- Worklog Id: (was: 780770) Time Spent: 8h 20m (was: 8h 10m) > Reduce fs calls in HiveMetaStoreChecker.checkTable > -- > > Key: HIVE-25980 > URL: https://issues.apache.org/jira/browse/HIVE-25980 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Major > Labels: pull-request-available > Time Spent: 8h 20m > Remaining Estimate: 0h > > MSCK Repair table for high partition table can perform slow on Cloud Storage > such as S3, one of the case we found where slowness was observed in > HiveMetaStoreChecker.checkTable. > {code:java} > "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 > tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at > sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464) > at > sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) > at > sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341) > at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) > at > sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at > 
com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) > at > com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) > at > com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) > at > com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) > at > com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82) > at >
[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=780769=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780769 ] ASF GitHub Bot logged work on HIVE-25980: - Author: ASF GitHub Bot Created on: 13/Jun/22 09:51 Start Date: 13/Jun/22 09:51 Worklog Time Spent: 10m Work Description: shameersss1 commented on code in PR #3053: URL: https://github.com/apache/hive/pull/3053#discussion_r895533968 ## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java: ## @@ -326,10 +356,26 @@ void checkTable(Table table, PartitionIterable parts, byte[] filterExp, CheckRes CheckResult.PartitionResult prFromMetastore = new CheckResult.PartitionResult(); prFromMetastore.setPartitionName(getPartitionName(table, partition)); prFromMetastore.setTableName(partition.getTableName()); - if (!fs.exists(partPath)) { -result.getPartitionsNotOnFs().add(prFromMetastore); - } else { + if (allPartDirs.contains(partPath)) { Review Comment: allPartDirs.remove(partPath); will return true if the element exists and false otherwise. We can have one statement instead of contains and then remove. Issue Time Tracking --- Worklog Id: (was: 780769) Time Spent: 8h 10m (was: 8h) > Reduce fs calls in HiveMetaStoreChecker.checkTable > -- > > Key: HIVE-25980 > URL: https://issues.apache.org/jira/browse/HIVE-25980 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Chiran Ravani >Assignee: Chiran Ravani >Priority: Major > Labels: pull-request-available > Time Spent: 8h 10m > Remaining Estimate: 0h > > MSCK Repair table for high partition table can perform slow on Cloud Storage > such as S3, one of the case we found where slowness was observed in > HiveMetaStoreChecker.checkTable. 
> {code:java} > "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 > tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at > sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464) > at > sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) > at > sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341) > at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) > at > sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) > at > com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) > at > com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) > at > com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) > at > 
com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) > at > com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) > at > com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) > at >
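The reviewer's point above is that `java.util.Set#remove` already reports whether the element was present, so the `contains(...)` check followed by `remove(...)` can collapse into a single call. A minimal illustration (the partition paths are hypothetical stand-ins, not the actual Hive `Path` objects):

```java
import java.util.HashSet;
import java.util.Set;

public class RemoveAsCheck {
    // remove() returns true iff the element was present, so one call replaces
    // the contains(partPath) + remove(partPath) pair from the review thread.
    static boolean checkAndRemove(Set<String> dirs, String path) {
        return dirs.remove(path);
    }

    public static void main(String[] args) {
        Set<String> allPartDirs = new HashSet<>(Set.of("/t/p=1", "/t/p=2"));
        if (checkAndRemove(allPartDirs, "/t/p=1")) {
            System.out.println("partition found on fs"); // taken: was present
        }
        System.out.println(allPartDirs.contains("/t/p=1")); // prints false
    }
}
```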
[jira] [Updated] (HIVE-26298) Selecting complex types on migrated iceberg table does not work
[ https://issues.apache.org/jira/browse/HIVE-26298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-26298: -- Labels: pull-request-available (was: ) > Selecting complex types on migrated iceberg table does not work > --- > > Key: HIVE-26298 > URL: https://issues.apache.org/jira/browse/HIVE-26298 > Project: Hive > Issue Type: Bug >Reporter: Gergely Fürnstáhl >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Attachments: 1-a5d522f4-a065-44e6-983b-ba66596b4332.metadata.json > > Time Spent: 10m > Remaining Estimate: 0h > > I am working on implementing NameMapping in Impala (mainly replicating Hive's > functionality) and ran into the following issue: > {code:java} > CREATE TABLE array_demo > ( > int_primitive INT, > int_array ARRAY<INT>, > int_array_array ARRAY<ARRAY<INT>>, > int_to_array_array_Map MAP<INT, ARRAY<ARRAY<INT>>> > ) > STORED AS ORC; > INSERT INTO array_demo values (0, array(1), array(array(2), array(3,4)), > map(5,array(array(6),array(7,8)))); > select * from array_demo; > +---+---+-++ > | array_demo.int_primitive | array_demo.int_array | > array_demo.int_array_array | array_demo.int_to_array_array_map | > +---+---+-++ > | 0 | [1] | [[2],[3,4]] > | {5:[[6],[7,8]]} | > +---+---+-++ > {code} > Converting to iceberg > > > {code:java} > ALTER TABLE array_demo SET TBLPROPERTIES > ('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler') > select * from array_demo; > INFO : Compiling > command(queryId=gfurnstahl_20220608102746_54bf3e74-e12b-400b-94a9-4e4c9fe460fe): > select * from array_demo > INFO : No Stats for default@array_demo, Columns: int_primitive, int_array, > int_to_array_array_map, int_array_array > INFO : Semantic Analysis Completed (retrial = false) > INFO : Created Hive schema: > Schema(fieldSchemas:[FieldSchema(name:array_demo.int_primitive, type:int, > comment:null), FieldSchema(name:array_demo.int_array, type:array<int>, > comment:null), FieldSchema(name:array_demo.int_array_array, > type:array<array<int>>, 
comment:null), > FieldSchema(name:array_demo.int_to_array_array_map, > type:map<int,array<array<int>>>, comment:null)], properties:null) > INFO : Completed compiling > command(queryId=gfurnstahl_20220608102746_54bf3e74-e12b-400b-94a9-4e4c9fe460fe); > Time taken: 0.036 seconds > INFO : Executing > command(queryId=gfurnstahl_20220608102746_54bf3e74-e12b-400b-94a9-4e4c9fe460fe): > select * from array_demo > INFO : Completed executing > command(queryId=gfurnstahl_20220608102746_54bf3e74-e12b-400b-94a9-4e4c9fe460fe); > Time taken: 0.0 seconds > INFO : OK > Error: java.io.IOException: java.lang.IllegalArgumentException: Can not > promote MAP type to INTEGER (state=,code=0) > select int_primitive from array_demo; > ++ > | int_primitive | > ++ > | 0 | > ++ > 1 row selected (0.088 seconds) > {code} > Removing schema.name-mapping.default solves it > {code:java} > ALTER TABLE array_demo UNSET TBLPROPERTIES ('schema.name-mapping.default'); > select * from array_demo; > +---+---+-++ > | array_demo.int_primitive | array_demo.int_array | > array_demo.int_array_array | array_demo.int_to_array_array_map | > +---+---+-++ > | 0 | [1] | [[2],[3,4]] > | {5:[[6],[7,8]]} | > +---+---+-++ > {code} > Possible cause: > > The name mapping generated and pushed into schema.name-mapping.default is > different from the name mapping in the schema in the metadata.json (attached > it) -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26298) Selecting complex types on migrated iceberg table does not work
[ https://issues.apache.org/jira/browse/HIVE-26298?focusedWorklogId=780766&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780766 ] ASF GitHub Bot logged work on HIVE-26298: - Author: ASF GitHub Bot Created on: 13/Jun/22 09:41 Start Date: 13/Jun/22 09:41 Worklog Time Spent: 10m Work Description: lcspinter opened a new pull request, #3361: URL: https://github.com/apache/hive/pull/3361
### What changes were proposed in this pull request?
Change the Iceberg field ID generation used when converting a Hive schema to an Iceberg schema.
### Why are the changes needed?
When converting a Hive schema to an Iceberg schema, the field IDs are generated by traversing the schema tree in pre-order. This breaks the conversion of multiple levels of complex types, because Iceberg expects the IDs to be assigned using its own tree-traversal order.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Unit test
Issue Time Tracking --- Worklog Id: (was: 780766) Remaining Estimate: 0h Time Spent: 10m
> Selecting complex types on migrated iceberg table does not work
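The traversal-order mismatch the PR describes can be illustrated with a small self-contained sketch. This is hypothetical code, not the Hive or Iceberg converter: `preOrder` numbers a field and immediately descends into it, while `siblingsFirst` (one plausible reading of the order Iceberg expects) numbers every field of a level before descending. As soon as a complex type has something after it at the same level, the two schemes give nested fields different IDs, which is exactly the kind of divergence that makes a name mapping built under one scheme unreadable under the other.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Hive/Iceberg converter code: it only shows
// why pre-order ID assignment diverges from an order that numbers all
// sibling fields before descending into nested types.
public class FieldIdOrder {

    static class Field {
        final String name;
        final List<Field> children;

        Field(String name, Field... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Pre-order: assign an ID to a field, then immediately recurse into it.
    static int preOrder(List<Field> fields, int next, Map<String, Integer> ids) {
        for (Field f : fields) {
            ids.put(f.name, next++);
            next = preOrder(f.children, next, ids);
        }
        return next;
    }

    // Siblings-first: number every field at this level, then descend.
    static int siblingsFirst(List<Field> fields, int next, Map<String, Integer> ids) {
        for (Field f : fields) {
            ids.put(f.name, next++);
        }
        for (Field f : fields) {
            next = siblingsFirst(f.children, next, ids);
        }
        return next;
    }

    public static void main(String[] args) {
        // Shape loosely mirrors array_demo: a primitive, a list column with a
        // nested element field, and another column after the list.
        List<Field> schema = Arrays.asList(
            new Field("int_primitive"),
            new Field("int_array", new Field("int_array.element")),
            new Field("other_col"));

        Map<String, Integer> pre = new LinkedHashMap<>();
        Map<String, Integer> sib = new LinkedHashMap<>();
        preOrder(schema, 1, pre);
        siblingsFirst(schema, 1, sib);

        // The nested element gets ID 3 under pre-order but ID 4 under
        // siblings-first, so the two mappings disagree on every nested field.
        System.out.println("pre-order:      " + pre);
        System.out.println("siblings-first: " + sib);
    }
}
```

With only flat schemas the two orders coincide, which matches the report: simple columns still read fine after migration, and only the nested complex types break.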
[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=780762&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780762 ] ASF GitHub Bot logged work on HIVE-25980: - Author: ASF GitHub Bot Created on: 13/Jun/22 09:20 Start Date: 13/Jun/22 09:20 Worklog Time Spent: 10m Work Description: pvary commented on PR #3053: URL: https://github.com/apache/hive/pull/3053#issuecomment-1153681364 @cravani: Only a small nit, then we are ready. Thanks! Issue Time Tracking --- Worklog Id: (was: 780762) Time Spent: 8h (was: 7h 50m)
> Reduce fs calls in HiveMetaStoreChecker.checkTable
> --
>
> Key: HIVE-25980
> URL: https://issues.apache.org/jira/browse/HIVE-25980
> Project: Hive
> Issue Type: Improvement
> Components: Standalone Metastore
> Affects Versions: 3.1.2, 4.0.0
> Reporter: Chiran Ravani
> Assignee: Chiran Ravani
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h
> Remaining Estimate: 0h
>
> MSCK REPAIR TABLE for a table with many partitions can be slow on cloud storage such as S3; one case we found showed the slowness in HiveMetaStoreChecker.checkTable.
> {code:java} > "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 > tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000] >java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at > sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464) > at > sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) > at > sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341) > at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) > at > sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at > com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) > at > com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) > at > com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) > at > com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) > at > com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) > at > 
com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82) > at > com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) > at > com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) > at > com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) > at > com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) > at > com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) > at > com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1331) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770) > at > com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744) > at >
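The jstack above shows the checker blocked inside a single S3 HTTPS round-trip; with many partitions, one such round-trip per partition dominates the runtime. The direction of the fix (fewer filesystem calls) can be sketched with hypothetical code: `CountingFs`, `missingNaive`, and `missingBatched` are made-up names standing in for a remote FileSystem and the checkTable logic, not the actual HiveMetaStoreChecker API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the HiveMetaStoreChecker API: it contrasts one
// filesystem round-trip per partition with a single recursive listing that
// is diffed in memory. CountingFs stands in for a slow remote FileSystem.
public class CheckTableSketch {

    static class CountingFs {
        int calls = 0;                 // number of simulated round-trips
        final Set<String> paths;       // directories that "exist" on storage

        CountingFs(Set<String> paths) {
            this.paths = paths;
        }

        boolean exists(String path) {
            calls++;
            return paths.contains(path);
        }

        // One logical round-trip returning all partition dirs under the table.
        List<String> listRecursive(String tableDir) {
            calls++;
            List<String> found = new ArrayList<>();
            for (String p : paths) {
                if (p.startsWith(tableDir + "/")) {
                    found.add(p);
                }
            }
            return found;
        }
    }

    // Naive: one exists() probe per partition known to the metastore.
    static List<String> missingNaive(CountingFs fs, String tableDir, List<String> parts) {
        List<String> missing = new ArrayList<>();
        for (String p : parts) {
            if (!fs.exists(tableDir + "/" + p)) {
                missing.add(p);
            }
        }
        return missing;
    }

    // Batched: list the table directory once, then diff in memory.
    static List<String> missingBatched(CountingFs fs, String tableDir, List<String> parts) {
        Set<String> onDisk = new HashSet<>(fs.listRecursive(tableDir));
        List<String> missing = new ArrayList<>();
        for (String p : parts) {
            if (!onDisk.contains(tableDir + "/" + p)) {
                missing.add(p);
            }
        }
        return missing;
    }
}
```

With 10,000 partitions the naive variant pays 10,000 round-trips, each a full HTTPS exchange like the one in the stack above, while the batched variant pays for the listing once and does the rest in memory.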
[jira] [Assigned] (HIVE-26318) Select on migrated iceberg table fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-26318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Pintér reassigned HIVE-26318:
> Select on migrated iceberg table fails with NPE
> ---
>
> Key: HIVE-26318
> URL: https://issues.apache.org/jira/browse/HIVE-26318
> Project: Hive
> Issue Type: Bug
> Reporter: László Pintér
> Assignee: László Pintér
> Priority: Major
>
> Enable vectorization:
> {code:sql}
> set hive.vectorized.execution.enabled=true;
> {code}
> Create a hive table with the following schema:
> {code:sql}
> CREATE EXTERNAL TABLE tbl_complex (
>   a int,
>   arrayofprimitives array<string>,
>   arrayofarrays array<array<string>>,
>   arrayofmaps array<map<string,string>>,
>   arrayofstructs array<struct<something:string, someone:string, somewhere:string>>,
>   mapofprimitives map<string,string>,
>   mapofarrays map<string, array<string>>,
>   mapofmaps map<string, map<string,string>>,
>   mapofstructs map<string, struct<something:string, someone:string, somewhere:string>>,
>   structofprimitives struct<something:string, somewhere:string>,
>   structofarrays struct<names:array<string>, birthdays:array<string>>,
>   structofmaps struct<map1:map<string,string>, map2:map<string,string>>
> ) STORED AS PARQUET {code}
> Insert some data:
> {code:sql}
> INSERT INTO tbl_complex VALUES (
>   1,
>   array('a','b','c'),
>   array(array('a'), array('b', 'c')),
>   array(map('a','b'), map('e','f')),
>   array(named_struct('something', 'a', 'someone', 'b', 'somewhere', 'c'),
>     named_struct('something', 'e', 'someone', 'f', 'somewhere', 'g')),
>   map('a', 'b'),
>   map('a', array('b','c')),
>   map('a', map('b','c')),
>   map('a', named_struct('something', 'b', 'someone', 'c', 'somewhere', 'd')),
>   named_struct('something', 'a', 'somewhere', 'b'),
>   named_struct('names', array('a', 'b'), 'birthdays', array('c', 'd', 'e')),
>   named_struct('map1', map('a', 'b'), 'map2', map('c', 'd'))
> )
> {code}
> Migrate the table to iceberg:
> {code:sql}
> ALTER TABLE tbl_complex SET TBLPROPERTIES ('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler');
> {code}
> Run a simple query:
> {code:sql}
> SELECT * FROM tbl_complex ORDER BY a;
> {code}
> It will fail with:
> {code:txt}
> TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) :
attempt_1655110825475_0001_3_00_00_1:java.lang.RuntimeException: > java.lang.RuntimeException: java.io.IOException: > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) > at > com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) > at > com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.RuntimeException: java.io.IOException: > java.lang.NullPointerException > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:200) > at > 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.(TezGroupedSplitsInputFormat.java:139) > at > org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:105) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:164) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:83) > at > org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:706) > at > org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:665) > at >
[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable
[ https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=780758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780758 ] ASF GitHub Bot logged work on HIVE-25980: - Author: ASF GitHub Bot Created on: 13/Jun/22 09:10 Start Date: 13/Jun/22 09:10 Worklog Time Spent: 10m Work Description: pvary commented on code in PR #3053: URL: https://github.com/apache/hive/pull/3053#discussion_r895497824
## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java:
## @@ -309,7 +309,37 @@ void checkTable(Table table, PartitionIterable parts, byte[] filterExp, CheckResult result)
 return;
 }
-Set<Path> partPaths = new HashSet<>();
+// now check the table folder and see if we find anything
+// that isn't in the metastore
+Set<Path> allPartDirs = new HashSet<>();
+List<FieldSchema> partColumns = table.getPartitionKeys();
+checkPartitionDirs(tablePath, allPartDirs, Collections.unmodifiableList(getPartColNames(table)));
+String tablePathStr = tablePath.toString();
+int tablePathLength = tablePathStr.length();
+
+if (filterExp != null) {
+  PartitionExpressionProxy expressionProxy = createExpressionProxy(conf);
+  List<String> partitions = new ArrayList<>();
+  Set<Path> partDirs = new HashSet<>();
+  boolean tablePathStrEndsWith = tablePathStr.endsWith("/");
+  allPartDirs.stream().forEach(path -> {
+    if (tablePathStrEndsWith) {
Review Comment: We can move this check out of the loop:
```
int tablePathStrLen = tablePathStr.endsWith("/") ? tablePathStr.length() : tablePathStr.length() + 1;
allPartDirs.stream().forEach(path -> partitions.add(path.toString().substring(tablePathStrLen)));
```
Issue Time Tracking --- Worklog Id: (was: 780758) Time Spent: 7h 50m (was: 7h 40m)
> Reduce fs calls in HiveMetaStoreChecker.checkTable
> --
>
> Key: HIVE-25980
> URL: https://issues.apache.org/jira/browse/HIVE-25980
> Project: Hive
> Issue Type: Improvement
> Components: Standalone Metastore
> Affects Versions: 3.1.2, 4.0.0
> Reporter: Chiran Ravani
> Assignee: Chiran Ravani
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 50m
> Remaining Estimate: 0h
>
> MSCK REPAIR TABLE for a table with many partitions can be slow on cloud storage such as S3; one case we found showed the slowness in HiveMetaStoreChecker.checkTable.
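The review suggestion above is a classic loop-invariant hoist: `tablePathStr.endsWith("/")` cannot change between iterations, so it can be folded into a prefix length computed once. The standalone sketch below illustrates the before/after shape with made-up method names (`relativizePerPath`, `relativizeHoisted`), not the actual HiveMetaStoreChecker code.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration of the review suggestion: the endsWith("/") check
// is loop-invariant, so it can be computed once before the loop.
public class PrefixTrim {

    // Before: re-evaluates endsWith("/") for every path.
    static List<String> relativizePerPath(String tableDir, List<String> paths) {
        List<String> out = new ArrayList<>();
        for (String p : paths) {
            if (tableDir.endsWith("/")) {
                out.add(p.substring(tableDir.length()));
            } else {
                out.add(p.substring(tableDir.length() + 1));
            }
        }
        return out;
    }

    // After: one ternary up front, then a plain substring per path.
    static List<String> relativizeHoisted(String tableDir, List<String> paths) {
        int prefixLen = tableDir.endsWith("/") ? tableDir.length() : tableDir.length() + 1;
        List<String> out = new ArrayList<>();
        for (String p : paths) {
            out.add(p.substring(prefixLen));
        }
        return out;
    }
}
```

Both variants produce identical relative partition paths whether or not the table directory carries a trailing slash; the hoisted form is simply tighter inside a loop that may run once per partition directory.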
[jira] [Assigned] (HIVE-26317) Select on iceberg table stored as parquet and vectorization enabled fails with Runtime Exception
[ https://issues.apache.org/jira/browse/HIVE-26317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Pintér reassigned HIVE-26317:
> Select on iceberg table stored as parquet and vectorization enabled fails with Runtime Exception
>
> Key: HIVE-26317
> URL: https://issues.apache.org/jira/browse/HIVE-26317
> Project: Hive
> Issue Type: Bug
> Reporter: László Pintér
> Assignee: László Pintér
> Priority: Major
>
> Create an iceberg table having the following schema:
> {code:sql}
> CREATE EXTERNAL TABLE tbl_complex (a int, arrayofarrays array<array<string>>) STORED BY ICEBERG STORED AS PARQUET
> {code}
> Insert some data into it:
> {code:sql}
> INSERT INTO tbl_complex VALUES (1, array(array('a'), array('b', 'c')))
> {code}
> Turn on vectorization and run a simple query:
> {code:sql}
> set hive.vectorized.execution.enabled=true;
> SELECT * FROM tbl_complex;
> {code}
> The query will fail with
> {code:java}
> Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1655109552551_0001_2_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:367)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:246)
> at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:270)
> at org.apache.hive.service.cli.operation.Operation.run(Operation.java:281)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:545)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:513)
> at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:271)
> at org.apache.iceberg.mr.hive.TestHiveShell.executeStatement(TestHiveShell.java:142)
> ...
16 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, > vertexName=Map 1, vertexId=vertex_1655109552551_0001_2_00, diagnostics=[Task > failed, taskId=task_1655109552551_0001_2_00_00, diagnostics=[TaskAttempt > 0 failed, info=[Error: Error while running task ( failure ) : > attempt_1655109552551_0001_2_00_00_0:java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: > java.lang.RuntimeException: Unsupported type used in list:array > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) > at > com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) > at > com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at 
java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.IOException: java.lang.RuntimeException: Unsupported type used in > list:array > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:89) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:414) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:293) > ... 16 more > Caused by: java.io.IOException: java.lang.RuntimeException: Unsupported type > used in list:array > at >
[jira] [Assigned] (HIVE-26316) Handle dangling open txns on both src & tgt in unplanned failover.
[ https://issues.apache.org/jira/browse/HIVE-26316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haymant Mangla reassigned HIVE-26316: - > Handle dangling open txns on both src & tgt in unplanned failover. > -- > > Key: HIVE-26316 > URL: https://issues.apache.org/jira/browse/HIVE-26316 > Project: Hive > Issue Type: Sub-task >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (HIVE-26165) Remove READ locks for ACID tables
[ https://issues.apache.org/jira/browse/HIVE-26165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko resolved HIVE-26165. --- Assignee: Denys Kuzmenko Resolution: Fixed
> Remove READ locks for ACID tables
> -
>
> Key: HIVE-26165
> URL: https://issues.apache.org/jira/browse/HIVE-26165
> Project: Hive
> Issue Type: Task
> Reporter: Denys Kuzmenko
> Assignee: Denys Kuzmenko
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Since the operations that required an EXCLUSIVE lock have been rewritten to be non-blocking for readers, we no longer need READ locks. That should improve ACID concurrency.
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HIVE-26165) Remove READ locks for ACID tables
[ https://issues.apache.org/jira/browse/HIVE-26165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553411#comment-17553411 ] Denys Kuzmenko commented on HIVE-26165: --- Merged to master. [~klcopp], thank you for the review!
> Remove READ locks for ACID tables
> -
>
> Key: HIVE-26165
> URL: https://issues.apache.org/jira/browse/HIVE-26165
> Project: Hive
> Issue Type: Task
> Reporter: Denys Kuzmenko
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Since the operations that required an EXCLUSIVE lock have been rewritten to be non-blocking for readers, we no longer need READ locks. That should improve ACID concurrency.
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (HIVE-26165) Remove READ locks for ACID tables
[ https://issues.apache.org/jira/browse/HIVE-26165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko updated HIVE-26165: -- Summary: Remove READ locks for ACID tables (was: Remove READ locks for ACID tables with SoftDelete enabled)
> Remove READ locks for ACID tables
> -
>
> Key: HIVE-26165
> URL: https://issues.apache.org/jira/browse/HIVE-26165
> Project: Hive
> Issue Type: Task
> Reporter: Denys Kuzmenko
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Since the operations that required an EXCLUSIVE lock have been rewritten to be non-blocking for readers, we no longer need READ locks. That should improve ACID concurrency.
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Work logged] (HIVE-26165) Remove READ locks for ACID tables with SoftDelete enabled
[ https://issues.apache.org/jira/browse/HIVE-26165?focusedWorklogId=780723&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780723 ] ASF GitHub Bot logged work on HIVE-26165: - Author: ASF GitHub Bot Created on: 13/Jun/22 07:14 Start Date: 13/Jun/22 07:14 Worklog Time Spent: 10m Work Description: deniskuzZ merged PR #3235: URL: https://github.com/apache/hive/pull/3235 Issue Time Tracking --- Worklog Id: (was: 780723) Time Spent: 0.5h (was: 20m)
> Remove READ locks for ACID tables with SoftDelete enabled
> -
>
> Key: HIVE-26165
> URL: https://issues.apache.org/jira/browse/HIVE-26165
> Project: Hive
> Issue Type: Task
> Reporter: Denys Kuzmenko
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Since the operations that required an EXCLUSIVE lock have been rewritten to be non-blocking for readers, we no longer need READ locks. That should improve ACID concurrency.
-- This message was sent by Atlassian Jira (v8.20.7#820007)