Thomas Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10559 )


Change subject: IMPALA-7061: Rework HBase splitting and assignment
......................................................................

IMPALA-7061: Rework HBase splitting and assignment

Some frontend PlannerTests rely on HBase tables being
arranged in a deterministic way. Specifically, the
HBase tables need to be split with specific region
boundaries and those regions need to be assigned to
specific HBase region servers.

Currently, the tables are created without splits and
testdata/bin/split-hbase.sh runs Java code in
HBaseTestDataRegionAssignment to split and assign
the tables. This runs during dataload via
testdata/bin/create-load-data.sh and during tests
with bin/run-all-tests.sh. Both parts of this process
have problems. The table splitting is flaky. Also,
significant time can pass between the assignment and
the tests, so HBase rebalancing can move regions and
the assignments are not always stable.

This changes the process so that the HBase tables are
created with the splits already specified via the
HBase shell. The splits remain stable over time.
PlannerTestBase runs the assignment code in
HBaseTestDataRegionAssignment at the start of
the PlannerTests. This makes the assignments
deterministic. No other tests depend on the
exact assignments, so this does not regress anything.
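
As a rough illustration of creating a table with its splits
already specified, the sketch below builds an HBase shell
"create" command that carries a SPLITS list. It is not the code
in generate-schema-statements.py. The function name, the table
name, the column family and the split keys are all assumptions
made up for this example.

```python
def hbase_create_statement(table, column_families, split_keys):
    """Build an HBase shell `create` command that pre-splits a table.

    Hypothetical helper for illustration only; the names and the
    split keys used below are assumptions, not values from the patch.
    """
    families = ", ".join("'{0}'".format(cf) for cf in column_families)
    if not split_keys:
        # No split keys: the table is created as a single region.
        return "create '{0}', {1}".format(table, families)
    # Each split key becomes a region boundary at creation time.
    splits = ", ".join("'{0}'".format(key) for key in split_keys)
    return "create '{0}', {1}, SPLITS => [{2}]".format(table, families, splits)


if __name__ == "__main__":
    # Example table name and split keys are placeholders.
    print(hbase_create_statement("example_table", ["d"], ["1", "3", "5"]))
```

Because the boundaries are part of the create command, no
separate split step has to run after the data is loaded.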

Testing:
 - Local testing
 - Ran gerrit-verify-dryrun-external

The 2.x branch does not have minicluster profiles, so
HBaseTestDataRegionAssignment.java is in the regular
fe/src/test directories.

Change-Id: I62ce82b23a2df0faea36baadfd48b99b8f973f72
---
M bin/run-all-tests.sh
A fe/src/test/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssignment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
D testdata/bin/split-hbase.sh
M testdata/datasets/functional/functional_schema_template.sql
D testdata/src/main/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssigment.java
8 files changed, 165 insertions(+), 390 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/10559/1
--
To view, visit http://gerrit.cloudera.org:8080/10559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I62ce82b23a2df0faea36baadfd48b99b8f973f72
Gerrit-Change-Number: 10559
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall <thomasmarsh...@cmu.edu>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
