Matthew Jacobs has uploaded a new change for review. http://gerrit.cloudera.org:8080/4175
Change subject: IMPALA-3718: Support subset of functional-query for Kudu ...................................................................... IMPALA-3718: Support subset of functional-query for Kudu Adds initial support for the functional-query test workload for Kudu tables. There are a few issues that make loading the functional schema difficult on Kudu: 1) Kudu tables must have one or more columns that together constitute a unique primary key. a) Primary key columns must currently be the first columns in the table definition (KUDU-1271). b) Primary key columns cannot be nullable (KUDU-1570). 2) Kudu tables must be specified with distribution parameters. (1) limits the tables that can be loaded without ugly workarounds. This patch only includes important tables that are used for relevant tests, most notably the alltypes* family. In particular, alltypesagg is important but it does not have a set of columns that are non-nullable and form a unique primary key. As a result, that table is created in Kudu with a different name and an additional BIGINT column for a PK that is a unique index and is generated at data loading time using the ROW_NUMBER analytic function. A view is then wrapped around the underlying table that matches the alltypesagg schema exactly. When KUDU-1570 is resolved, this can be simplified. (2) requires some additional considerations and custom syntax. As a result, the DDL to create the tables is explicitly specified in CREATE_KUDU sections in the functional_schema_constraints.csv, and an additional DEPENDENT_LOAD_KUDU section was added to specify custom data loading DML that differs from the existing DEPENDENT_LOAD. TODO: IMPALA-4005: generate_schema_statements.py needs refactoring Tests that are not relevant or not yet supported have been marked with xfail and a message where appropriate. TODO: Support remaining functional tables/tests when possible. Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc --- M bin/impala-config.sh M testdata/bin/generate-schema-statements.py M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-query/functional-query_core.csv M testdata/workloads/functional-query/functional-query_exhaustive.csv M testdata/workloads/functional-query/functional-query_pairwise.csv M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/create_kudu.test M tests/common/skip.py M tests/common/test_result_verifier.py M tests/query_test/test_aggregation.py M tests/query_test/test_exprs.py M tests/query_test/test_queries.py M tests/query_test/test_scanners.py M tests/query_test/test_tpcds_queries.py 16 files changed, 281 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/4175/1 -- To view, visit http://gerrit.cloudera.org:8080/4175 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs <m...@cloudera.com>