[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres
[ https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517302#comment-17517302 ]

mahesh kumar behera commented on HIVE-25540:

[~zabetak] The batch update has been tested at scale only with the MySQL and Postgres backends.

> Enable batch update of column stats only for MySql and Postgres
>
> Key: HIVE-25540
> URL: https://issues.apache.org/jira/browse/HIVE-25540
> Project: Hive
> Issue Type: Sub-task
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> The batch update of partition column stats using direct SQL is tested only
> for MySql and Postgres.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres
[ https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509772#comment-17509772 ]

Stamatis Zampetakis commented on HIVE-25540:

[~pvary] The changes in HIVE-26040 do seem reasonable for the problem I discovered. Many thanks for the quick fix. However, I found this issue just by running one random qtest on a metastore using mssql, so I cannot say with confidence that we now have sufficient test coverage to claim that this feature (HIVE-25181) is production ready in *all* databases. I will let [~maheshk114] answer this question.
[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres
[ https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507813#comment-17507813 ]

Peter Vary commented on HIVE-25540:

[~zabetak], [~maheshk114]: I have a fix for the issue above; see HIVE-26040.
Could you please check whether successfully running the commands below is enough to confirm that the fix lets us close this Jira?

{code}
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=mssql
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=oracle
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=postgres
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=derby
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=mysql
{code}
[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres
[ https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507810#comment-17507810 ]

Peter Vary commented on HIVE-25540:

Created a Jira to fix the issue mentioned above: HIVE-26040
[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres
[ https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17506975#comment-17506975 ]

Stamatis Zampetakis commented on HIVE-25540:

I think it would be good to solve this JIRA before releasing 4.0.0-alpha-1 to avoid failures like the one outlined above.
[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres
[ https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17506973#comment-17506973 ]

Stamatis Zampetakis commented on HIVE-25540:

Today I was running a few tests (over commit https://github.com/apache/hive/commit/d696b34a5765fe950ebe4bfffd36b9ea914dfaab) with various kinds of metastore backends (e.g., Microsoft SQL Server) for another JIRA case, and I ran into an exception with direct SQL while updating statistics, which I think is related to or can be solved by this JIRA.

{code:bash}
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=mssql
{code}

{noformat}
2022-03-15T07:57:17,078 ERROR [2b933b88-6083-4750-b151-2d2c7e04ccce main] metastore.DirectSqlUpdateStat: Unable to getNextCSIdForMPartitionColumnStatistics
com.microsoft.sqlserver.jdbc.SQLServerException: Line 1: FOR UPDATE clause allowed only for DECLARE CURSOR.
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:258) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1535) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:845) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:752) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7151) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2478) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:219) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:199) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQuery(SQLServerStatement.java:654) ~[mssql-jdbc-6.2.1.jre8.jar:?]
	at com.zaxxer.hikari.pool.ProxyStatement.executeQuery(ProxyStatement.java:108) ~[HikariCP-2.6.1.jar:?]
	at com.zaxxer.hikari.pool.HikariProxyStatement.executeQuery(HikariProxyStatement.java) ~[HikariCP-2.6.1.jar:?]
	at org.apache.hadoop.hive.metastore.DirectSqlUpdateStat.getNextCSIdForMPartitionColumnStatistics(DirectSqlUpdateStat.java:676) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.updatePartitionColumnStatisticsBatch(MetaStoreDirectSql.java:2966) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatisticsInBatch(ObjectStore.java:9849) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_261]
	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at com.sun.proxy.$Proxy60.updatePartitionColumnStatisticsInBatch(Unknown Source) [?:?]
	at org.apache.hadoop.hive.metastore.HMSHandler.updatePartitionColStatsForOneBatch(HMSHandler.java:7060) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.HMSHandler.updatePartitionColStatsInBatch(HMSHandler.java:7113) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.HMSHandler.set_aggr_stats_for(HMSHandler.java:9137) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_261]
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:146) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at com.sun.proxy.$Proxy61.set_aggr_st
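
The SQL Server error in the trace above ("FOR UPDATE clause allowed only for DECLARE CURSOR") arises because SQL Server does not accept a trailing FOR UPDATE on a plain SELECT; row locking there is normally requested with a table hint such as WITH (UPDLOCK). The following is a minimal Java sketch of dialect-aware lock syntax, not the change made in HIVE-26040. The class, the enum, the method, and the table and column names are all hypothetical, and the rewrite only handles simple single-table statements.

```java
import java.util.Locale;

/** Illustrative only: not Hive code. All names here are hypothetical. */
final class LockingSelectSketch {

    /** Backends covered by this sketch; Derby is deliberately left out. */
    enum Dialect { MYSQL, POSTGRES, ORACLE, MSSQL }

    /** Returns the SELECT rewritten so that it requests an update lock. */
    static String withRowLock(String select, Dialect dialect) {
        if (dialect != Dialect.MSSQL) {
            // MySQL, Postgres and Oracle accept a trailing FOR UPDATE clause.
            return select + " FOR UPDATE";
        }
        // SQL Server takes a table hint, placed before the WHERE clause.
        String hint = " WITH (UPDLOCK)";
        int where = select.toUpperCase(Locale.ROOT).indexOf(" WHERE ");
        if (where < 0) {
            return select + hint;
        }
        return select.substring(0, where) + hint + select.substring(where);
    }

    public static void main(String[] args) {
        // Hypothetical single-table query; table and column names are made up.
        String select = "SELECT NEXT_VAL FROM SEQ_TABLE WHERE SEQ_NAME = 'stats'";
        for (Dialect d : Dialect.values()) {
            System.out.println(d + ": " + withRowLock(select, d));
        }
    }
}
```

Keeping the dialect decision in one helper means a statement such as the sequence lookup only has to be written once, and a backend that has not been verified can be excluded explicitly rather than failing at run time.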
[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres
[ https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459241#comment-17459241 ]

mahesh kumar behera commented on HIVE-25540:

[~zabetak] The batch update uses direct SQL to reduce the number of backend database calls. Some of the SQL statements it uses are not supported by Oracle, so we need to add a check that goes via DataNucleus (DN) if the backend DB is Oracle. Currently we have tested only on MySQL and Postgres. The batch update feature has not yet shipped.
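
The check described in this comment can be sketched as an allow-list: use the direct SQL batch path only on backends where it has been tested, and fall back to the DataNucleus path everywhere else. This is a minimal Java illustration, not the actual Hive patch. The class and method names are hypothetical, and the input is assumed to be a product name of the kind returned by JDBC's `DatabaseMetaData.getDatabaseProductName()`.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

/** Illustrative only: not Hive code. All names here are hypothetical. */
final class BatchStatsGateSketch {

    enum Backend { MYSQL, POSTGRES, ORACLE, MSSQL, DERBY, OTHER }

    /** Backends on which the thread says batch update has been tested. */
    private static final Set<Backend> BATCH_TESTED =
            EnumSet.of(Backend.MYSQL, Backend.POSTGRES);

    /** Maps a JDBC product name to a backend; unknown names map to OTHER. */
    static Backend detect(String productName) {
        String p = productName == null ? "" : productName.toLowerCase(Locale.ROOT);
        if (p.contains("mysql")) return Backend.MYSQL;
        if (p.contains("postgres")) return Backend.POSTGRES;
        if (p.contains("oracle")) return Backend.ORACLE;
        if (p.contains("sql server")) return Backend.MSSQL;
        if (p.contains("derby")) return Backend.DERBY;
        return Backend.OTHER;
    }

    /** True if the direct SQL batch path may be used; false means use DataNucleus. */
    static boolean useDirectSqlBatch(String productName) {
        return BATCH_TESTED.contains(detect(productName));
    }

    public static void main(String[] args) {
        String[] names = {"MySQL", "PostgreSQL", "Oracle", "Microsoft SQL Server", "Apache Derby"};
        for (String name : names) {
            String path = useDirectSqlBatch(name) ? "direct SQL batch" : "DataNucleus";
            System.out.println(name + " -> " + path);
        }
    }
}
```

An allow-list is the conservative choice here: a backend nobody has verified falls back to the slower path by default instead of being assumed to work.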