[ https://issues.apache.org/jira/browse/IMPALA-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang updated IMPALA-10563:
------------------------------------
    Issue Type: Bug  (was: Test)

> Stress insert tests timeout on GCS
> ----------------------------------
>
>                 Key: IMPALA-10563
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10563
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> The following tests failed with TimeoutError when running on GCS:
> {code:java}
> F stress/test_acid_stress.py::TestConcurrentAcidInserts::()::test_concurrent_inserts[unique_database0]
> stress/test_acid_stress.py:255: in test_concurrent_inserts
>     run_tasks(writers + checkers)
> stress/stress_util.py:43: in run_tasks
>     pool.map_async(Task.run, tasks).get(timeout_seconds)
> ../toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:568: in get
>     raise TimeoutError
> E   TimeoutError
> F stress/test_acid_stress.py::TestFailingAcidInserts::()::test_failing_inserts[unique_database0]
> stress/test_acid_stress.py:333: in test_failing_inserts
>     self._run_test_failing_inserts(unique_database, is_partitioned)
> stress/test_acid_stress.py:324: in _run_test_failing_inserts
>     run_tasks(writers + checkers)
> stress/stress_util.py:43: in run_tasks
>     pool.map_async(Task.run, tasks).get(timeout_seconds)
> ../toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:568: in get
>     raise TimeoutError
> E   TimeoutError
> F stress/test_insert_stress.py::TestInsertStress::()::test_inserts[unique_database0]
> stress/test_insert_stress.py:102: in test_inserts
>     run_tasks(writers + checkers)
> stress/stress_util.py:43: in run_tasks
>     pool.map_async(Task.run, tasks).get(timeout_seconds)
> ../toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/multiprocessing/pool.py:568: in get
>     raise TimeoutError
> E   TimeoutError
> {code}
> It seems the statements spend too much time waiting for HMS, which slows down the whole run. Found some error logs in HMS's log:
> {code}
> 2021-03-08T01:29:10,479 INFO [pool-6-thread-181] txn.TxnHandler: Allocated writeId: 1 for txnId: 3667
> 2021-03-08T01:29:10,492 INFO [pool-6-thread-177] txn.TxnHandler: Added entries to MIN_HISTORY_LEVEL with a single query for current txn: [3669]
> 2021-03-08T01:29:10,493 INFO [pool-6-thread-181] txn.TxnHandler: Allocated write ids for dbName=test_concurrent_inserts_8933345c, tblName=test_concurrent_inserts (txnIds: [3667])
> 2021-03-08T01:29:10,502 INFO [pool-6-thread-195] txn.TxnHandler: Added entries to MIN_HISTORY_LEVEL with a single query for current txn: [3670]
> 2021-03-08T01:29:10,507 ERROR [pool-6-thread-197] txn.TxnHandler: Exception during write ids allocation for request=AllocateTableWriteIdsRequest(dbName:test_concurrent_inserts_8933345c, tableName:test_concurrent_inserts, txnIds:[3668]). Will retry if possible.
> org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "NEXT_WRITE_ID_IDX"
>   Detail: Key ("NWI_DATABASE", "NWI_TABLE")=(test_concurrent_inserts_8933345c, test_concurrent_inserts) already exists.
>         at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532) ~[postgresql-42.2.14.jar:42.2.14]
>         at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267) ~[postgresql-42.2.14.jar:42.2.14]
>         at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312) ~[postgresql-42.2.14.jar:42.2.14]
>         at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448) ~[postgresql-42.2.14.jar:42.2.14]
>         at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369) ~[postgresql-42.2.14.jar:42.2.14]
>         at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153) ~[postgresql-42.2.14.jar:42.2.14]
>         at org.postgresql.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:142) ~[postgresql-42.2.14.jar:42.2.14]
>         at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44) ~[HikariCP-2.6.1.jar:?]
>         at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java) ~[HikariCP-2.6.1.jar:?]
>         at org.apache.hadoop.hive.metastore.txn.TxnHandler.allocateTableWriteIds(TxnHandler.java:1820) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8370) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_282]
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_282]
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_282]
>         at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at com.sun.proxy.$Proxy28.allocate_table_write_ids(Unknown Source) [?:?]
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:20126) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:20110) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_282]
>         at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_282]
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) [hadoop-common-3.1.1.7.2.7.0-44.jar:?]
>         at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.3000.7.2.7.0-44.jar:3.1.3000.7.2.7.0-44]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
> 2021-03-08T01:29:10,513 WARN [pool-6-thread-197] txn.TxnHandler: Retryable error detected in allocateTableWriteIds(AllocateTableWriteIdsRequest(dbName:test_concurrent_inserts_8933345c, tableName:test_concurrent_inserts, txnIds:[3668])). Will wait 2000ms and retry up to 10 times.
> Error: ERROR: duplicate key value violates unique constraint "NEXT_WRITE_ID_IDX"
>   Detail: Key ("NWI_DATABASE", "NWI_TABLE")=(test_concurrent_inserts_8933345c, test_concurrent_inserts) already exists.
> 2021-03-08T01:29:10,554 INFO [pool-6-thread-177] txn.TxnHandler: Allocated writeId: 2 for txnId: 3669
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
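For context on where the TimeoutError comes from: the traceback shows stress_util.run_tasks driving the writers and checkers through pool.map_async(Task.run, tasks).get(timeout_seconds), so multiprocessing raises TimeoutError the moment the workers collectively outlast the deadline; HMS retries of 2000ms each, up to 10 times, can easily consume that budget. A minimal sketch of this timeout mechanism (slow_task is a hypothetical stand-in for a stress worker stuck waiting on HMS; a ThreadPool is used here only to keep the sketch self-contained):

```python
import multiprocessing
import multiprocessing.pool
import time


def slow_task(_):
    # Hypothetical stand-in for a stress writer/checker blocked on HMS.
    time.sleep(2)
    return True


# Same map_async(...).get(timeout) shape as stress_util.run_tasks.
pool = multiprocessing.pool.ThreadPool(2)
try:
    # Deadline is shorter than the tasks, so get() cannot complete in time.
    pool.map_async(slow_task, range(2)).get(0.5)
    timed_out = False
except multiprocessing.TimeoutError:
    # This is the TimeoutError surfaced in the failing tests above.
    timed_out = True
finally:
    pool.terminate()
```

Note that get(timeout) only abandons the wait; the workers themselves keep running, which is why a stalled HMS shows up as a test timeout rather than a test error.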