Hello Bharath Vissapragada, Quanlong Huang, Anurag Mantripragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13932

to look at the new patch set (#5).

Change subject: IMPALA-8661 : Add randomized tests to stress MetastoreEventsProcessor
......................................................................

IMPALA-8661 : Add randomized tests to stress MetastoreEventsProcessor

This change adds a new stress test for MetastoreEventsProcessor. The
test randomly executes Hive queries to generate a large number of
events. The event processor is invoked at random intervals so that a
variable-sized batch of events is processed each time. After each batch
is processed, the test checks the status of the events processor.

By default, on CDH builds the test is configured to run with 8
concurrent Hive clients, each of which runs 100 random Hive queries.
These defaults can be overridden by passing system properties through
the maven command arguments "-DnumClients" and "-DnumQueriesPerClients".

Additionally, the test creates Impala clients which keep issuing
refresh table commands on the test databases to make sure that the
event processor is doing real work rather than invalidating/refreshing
tables which are already incomplete.

This test is added as a JUnit test and uses Hive JDBC to issue the SQL
statements. This is much faster than the end-to-end Python test, which
issues each Hive query in a separate beeline session and re-establishes
the connection every time.

The test already found a bug which occurs when Hive issues an
"alter table add if not exists partition" query and the partition is
not actually added because it already exists. In such a case the
ADD_PARTITION event is still generated, but with an empty list of
partitions. Such events are now ignored.

Notes:
1. Ran the test with defaults. It generates about 2100 events and runs
for close to 15 min. The defaults can be lowered if we see a
significant increase in the test job runtimes.
2. On CDP builds the concurrent Hive queries run very slowly due to
container provisioning time on the minicluster. I have left this as a
TODO to investigate. The test runs in single-threaded mode with an
increased number of queries when running against Hive-3.

Change-Id: I8c85b83efd4f56b5ae0e8d1dc6a2ee2feb6721ce
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A fe/src/test/java/org/apache/impala/testutil/HiveJdbcClientPool.java
M fe/src/test/java/org/apache/impala/testutil/ImpalaJdbcClient.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
A fe/src/test/java/org/apache/impala/util/RandomHiveQueryRunner.java
8 files changed, 1,578 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/13932/5
--
To view, visit http://gerrit.cloudera.org:8080/13932
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c85b83efd4f56b5ae0e8d1dc6a2ee2feb6721ce
Gerrit-Change-Number: 13932
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <anu...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
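
For readers who have not opened the patch, the two behaviours described in the commit message (overriding the client and query counts through system properties, and ignoring ADD_PARTITION events that carry no partitions) might look roughly like the sketch below. This is not the code in the change: the class StressTestSketch and its methods are hypothetical, and partitions are modelled as plain strings purely for illustration. Only the property names "numClients" and "numQueriesPerClients" and the defaults of 8 and 100 come from the message above.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; not part of the Impala patch. */
public class StressTestSketch {

  // Defaults stated in the commit message for CDH builds.
  public static final int DEFAULT_NUM_CLIENTS = 8;
  public static final int DEFAULT_NUM_QUERIES_PER_CLIENT = 100;

  /** Reads an int system property, falling back to the default when unset or malformed. */
  public static int intProperty(String name, int defaultValue) {
    // Integer.getInteger returns defaultValue if the property is missing
    // or does not parse as a number.
    return Integer.getInteger(name, defaultValue);
  }

  /**
   * Returns true when an ADD_PARTITION event has no partitions to add, as happens
   * for "add if not exists partition" on a partition that already exists.
   * Assumption: the partition list is modelled here as a list of names.
   */
  public static boolean shouldIgnoreAddPartition(List<String> partitionNames) {
    return partitionNames == null || partitionNames.isEmpty();
  }

  public static void main(String[] args) {
    int numClients = intProperty("numClients", DEFAULT_NUM_CLIENTS);
    int numQueries = intProperty("numQueriesPerClients", DEFAULT_NUM_QUERIES_PER_CLIENT);
    System.out.println("clients=" + numClients + " queriesPerClient=" + numQueries);
    System.out.println("ignore empty event: "
        + shouldIgnoreAddPartition(Collections.<String>emptyList()));
  }
}
```

Running the class with, for example, `java -DnumClients=4 StressTestSketch` would pick up the overridden client count; whether a `-D` flag on the maven command line reaches the test JVM depends on how the build forwards system properties, which this sketch does not cover.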