Thomas Marshall has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10545 )
Change subject: IMPALA-7061: Rework HBase splitting and assignment
......................................................................

IMPALA-7061: Rework HBase splitting and assignment

Some frontend PlannerTests rely on HBase tables being arranged in a
deterministic way. Specifically, the HBase tables need to be split with
specific region boundaries, and those regions need to be assigned to
specific HBase region servers.

Currently, the tables are created without splits, and
testdata/bin/split-hbase.sh runs Java code in
HBaseTestDataRegionAssignment to split and assign the tables. This runs
during dataload via testdata/bin/create-load-data.sh and during tests
with bin/run-all-tests.sh.

There are problems with both parts of this process. The table splitting
is flaky. Since significant time can pass between the assignments and
the tests, rebalancing means the assignments are not always stable.

This changes the process so that the HBase tables are created with the
splits already specified via the HBase shell. The splits remain stable
over time. PlannerTestBase runs the assignment code in
HBaseTestDataRegionAssignment at the start of the PlannerTests. This
makes the assignments deterministic. No other tests depend on the exact
assignments, so this does not regress anything.

Testing:
- Local testing
- Ran gerrit-verify-dryrun-external

2.x does not have minicluster profiles, so
HBaseTestDataRegionAssignment.java is in the regular fe/src/test
directories.
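For readers unfamiliar with pre-splitting, the following is a minimal sketch of how a schema generator might emit an HBase shell `create` statement with the splits already specified. It is not the code from this change: the function name `hbase_create_statement`, the table name, the column family, and the split keys are all hypothetical and chosen only for illustration. The `SPLITS => [...]` form is the one shown in the HBase shell's own help for `create`.

```python
def hbase_create_statement(table, column_families, split_keys):
    """Build an HBase shell `create` command that pre-splits a table.

    Hypothetical helper for illustration only; the names and the exact
    statement layout are assumptions, not taken from the Impala sources.
    """
    # Quote each column family the way the HBase shell expects.
    families = ", ".join("'%s'" % cf for cf in column_families)
    stmt = "create '%s', %s" % (table, families)
    if split_keys:
        # Each split key becomes a region boundary at table creation time,
        # so no separate split step is needed afterwards.
        splits = ", ".join("'%s'" % key for key in split_keys)
        stmt += ", SPLITS => [%s]" % splits
    return stmt


if __name__ == "__main__":
    # Illustrative values only.
    print(hbase_create_statement("example_table", ["d"], ["1", "3", "5"]))
```

Because the region boundaries are fixed when the table is created, only the region-to-server assignment is left to be done at test time.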

Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897
Reviewed-on: http://gerrit.cloudera.org:8080/10447
Reviewed-by: Philip Zeyliger <phi...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
(cherry picked from commit 9a5410570e25431813b96e00f7b91db44f672f38)
Reviewed-on: http://gerrit.cloudera.org:8080/10545
Reviewed-by: Joe McDonnell <joemcdonn...@cloudera.com>
Tested-by: Thomas Marshall <thomasmarsh...@cmu.edu>
---
M bin/run-all-tests.sh
A fe/src/test/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssignment.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
D testdata/bin/split-hbase.sh
M testdata/datasets/functional/functional_schema_template.sql
D testdata/src/main/java/org/apache/impala/datagenerator/HBaseTestDataRegionAssigment.java
8 files changed, 165 insertions(+), 390 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Thomas Marshall: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10545
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I3d639128a856254a6ccb93d6750f531974b5f897
Gerrit-Change-Number: 10545
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <thomasmarsh...@cmu.edu>