Hello Bharath Vissapragada, Quanlong Huang, Anurag Mantripragada, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13932

to look at the new patch set (#5).

Change subject: IMPALA-8661 : Add randomized tests to stress 
MetastoreEventsProcessor
......................................................................

IMPALA-8661 : Add randomized tests to stress MetastoreEventsProcessor

This change adds a new stress test for MetastoreEventsProcessor. The
test randomly executes Hive queries to generate a large number of
events. The event processor is invoked at random intervals so that a
variable-size batch of events is processed every time. After each batch
is processed, the test checks the status of the events processor. By
default, on CDH builds the test is configured to run with 8 concurrent
Hive clients, each of which runs 100 random Hive queries. These defaults
can be overridden by passing system properties using the maven command
arguments "-DnumClients" and "-DnumQueriesPerClients". Additionally, the
test creates Impala clients which keep issuing refresh table commands
on the test databases to make sure that the event processor is doing
some real work rather than invalidating/refreshing tables which are
already incomplete.
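
As an illustration of the configuration described above, here is a minimal
sketch of how the two system properties could be read with their defaults,
and how a random pause between event-processor invocations could be picked.
The class and method names are hypothetical and are not taken from the
patch; only the property names and the default values come from this
commit message.

```java
import java.util.Random;

// Hypothetical sketch: class and method names are illustrative only and are
// not the ones used in EventsProcessorStressTest.java.
class StressTestConfigSketch {
  // Defaults stated in the commit message for CDH builds.
  static final int DEFAULT_NUM_CLIENTS = 8;
  static final int DEFAULT_NUM_QUERIES_PER_CLIENT = 100;

  // Reads -DnumClients; falls back to the default when the property is
  // unset or is not a valid integer.
  static int numClients() {
    return Integer.getInteger("numClients", DEFAULT_NUM_CLIENTS);
  }

  // Reads -DnumQueriesPerClients with the same fallback behavior.
  static int numQueriesPerClient() {
    return Integer.getInteger("numQueriesPerClients",
        DEFAULT_NUM_QUERIES_PER_CLIENT);
  }

  // Picks a pause in [1, maxMillis] so that each invocation of the event
  // processor sees a different number of pending events. maxMillis must
  // be positive.
  static int nextPauseMillis(Random rand, int maxMillis) {
    return 1 + rand.nextInt(maxMillis);
  }
}
```

With this sketch, passing for example "-DnumClients=4" on the command line
would make numClients() return 4 instead of 8.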

This test is added as a JUnit test and uses Hive JDBC to issue the SQL
statements. This is much faster than the end-to-end Python test, which
issues each Hive query in a separate beeline session and re-establishes
the connection every time.
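
The speedup comes from reusing connections instead of reopening one per
query. The idea can be sketched with a small fixed-size pool. This is a
generic illustration under assumed names (SimpleClientPool, borrow,
giveBack); it is not the API of the HiveJdbcClientPool class added by this
patch, and the client type is left generic so the sketch does not depend on
the Hive JDBC driver.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical sketch of a fixed-size client pool. Clients are created once
// up front and then handed out and returned, so no connection is
// re-established between queries.
class SimpleClientPool<T> {
  private final BlockingQueue<T> idle;

  SimpleClientPool(int size, Supplier<T> factory) {
    idle = new ArrayBlockingQueue<>(size);
    for (int i = 0; i < size; i++) {
      idle.add(factory.get());
    }
  }

  // Blocks until a client is free, then hands it to the caller.
  T borrow() {
    try {
      return idle.take();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for a client", e);
    }
  }

  // Returns a previously borrowed client to the pool. Throws
  // IllegalStateException if more clients are returned than were borrowed.
  void giveBack(T client) {
    idle.add(client);
  }

  int idleCount() {
    return idle.size();
  }
}
```

In a real test the factory would open a JDBC connection; here any supplier
works, which keeps the sketch runnable on its own.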

The test already found a bug which is triggered when Hive issues an "alter
table add if not exists partition" query and the partition is not
actually added because it already exists. In such a case the
ADD_PARTITION event is still generated, but with an empty list of
partitions in the event. Such events are now ignored.
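
The guard described above can be sketched as follows. This is only an
illustration of the check, with hypothetical names and with partitions
modeled as plain strings; the actual change lives in MetastoreEvents.java
and is not reproduced here.

```java
import java.util.List;

// Hypothetical sketch of the "ignore empty ADD_PARTITION events" check.
class AddPartitionEventSketch {
  // Returns true only when the event actually carries partitions. An
  // "alter table ... add if not exists partition" on a pre-existing
  // partition yields an event with an empty list, which is skipped.
  static boolean shouldProcess(List<String> addedPartitions) {
    return addedPartitions != null && !addedPartitions.isEmpty();
  }

  // Counts how many events in a batch would be processed rather than
  // ignored.
  static int countProcessed(List<List<String>> batch) {
    int processed = 0;
    for (List<String> partitions : batch) {
      if (shouldProcess(partitions)) {
        processed++;
      }
    }
    return processed;
  }
}
```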

Notes:
1. Ran the test with defaults. It generates about 2100 events
and runs for close to 15 min. The defaults can be lowered if we see
a significant increase in the test job runtimes.
2. On CDP builds the concurrent Hive queries run very slowly due to
container provisioning time on the minicluster. I have left this as a
TODO to investigate. The test runs in single-threaded mode with an
increased number of queries when running against Hive-3.

Change-Id: I8c85b83efd4f56b5ae0e8d1dc6a2ee2feb6721ce
---
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
A fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
A fe/src/test/java/org/apache/impala/testutil/HiveJdbcClientPool.java
M fe/src/test/java/org/apache/impala/testutil/ImpalaJdbcClient.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
A fe/src/test/java/org/apache/impala/util/RandomHiveQueryRunner.java
8 files changed, 1,578 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/13932/5
-- 
To view, visit http://gerrit.cloudera.org:8080/13932
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c85b83efd4f56b5ae0e8d1dc6a2ee2feb6721ce
Gerrit-Change-Number: 13932
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <anu...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
