Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5162

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
......................................................................

IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

This patch adds support to the random query generator infrastructure to
model and write SQL INSERTs. It does not actually randomly generate
INSERTs at this time (tracked in IMPALA-4353 and umbrella task
IMPALA-3740) but does provide necessary building blocks to do so.

First, it's necessary to model the INSERTs as part of our data model.
This was done by taking the current notion of a Query and making it a
SelectQuery. We also then create an abstract Query containing some of
the more common methods and attributes. We then model an INSERT query,
INSERT clause, and VALUES clause (IMPALA-4343).
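
For illustration only, the shape of that split could be sketched as
follows. The class names follow the description above, but every
attribute, constructor argument, and the statement_type property are
assumptions made for this sketch, not the actual contents of
tests/comparison/query.py. Allowing an INSERT to be fed by a SELECT is
likewise an assumption.

```python
from abc import ABC, abstractmethod


class Query(ABC):
    """Abstract statement holding only what every statement type shares."""

    def __init__(self, with_clause=None):
        self.with_clause = with_clause

    @property
    @abstractmethod
    def statement_type(self):
        """Return the SQL keyword that starts the statement."""


class SelectQuery(Query):
    """What used to be called Query: a plain SELECT."""

    def __init__(self, select_items=(), from_table=None, with_clause=None):
        super().__init__(with_clause)
        self.select_items = list(select_items)
        self.from_table = from_table

    @property
    def statement_type(self):
        return "SELECT"


class InsertClause:
    """Target table plus an optional explicit column list."""

    def __init__(self, table, column_names=()):
        self.table = table
        self.column_names = list(column_names)


class ValuesClause:
    """Literal rows to insert."""

    def __init__(self, rows):
        self.rows = [tuple(row) for row in rows]


class InsertQuery(Query):
    """INSERT fed by either a VALUES clause or a SELECT (assumed)."""

    def __init__(self, insert_clause, values_clause=None, select_query=None,
                 with_clause=None):
        super().__init__(with_clause)
        # Exactly one row source must be supplied.
        if (values_clause is None) == (select_query is None):
            raise ValueError(
                "exactly one of values_clause or select_query is required")
        self.insert_clause = insert_clause
        self.values_clause = values_clause
        self.select_query = select_query

    @property
    def statement_type(self):
        return "INSERT"
```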

Second, it's necessary to test the basics of this data model. It made
sense to go ahead and implement the necessary SqlWriter methods to write
the SQL for these clauses (IMPALA-4354).

I could then use this writer with some existing and new tests that take
a query written into our data model and write the SQL, verifying that
the generated SQL is correct.

For INSERT into Kudu tables, the equivalent PostgreSQL queries need to
use "ON CONFLICT DO NOTHING", so all existing and new query tests verify
they can be written as PostgreSQL as well.
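
A minimal sketch of the dialect difference, assuming a writer class per
dialect. The method names, the is_kudu_table flag, and the
_insert_suffix hook are hypothetical and only stand in for the real
SqlWriter interface:

```python
class SqlWriter:
    """Hypothetical base writer that emits Impala-style SQL."""

    def write_insert(self, table, columns, rows, is_kudu_table=False):
        column_list = ", ".join(columns)
        values = ", ".join(
            "({})".format(", ".join(self.write_literal(v) for v in row))
            for row in rows)
        return "INSERT INTO {} ({}) VALUES {}{}".format(
            table, column_list, values, self._insert_suffix(is_kudu_table))

    def write_literal(self, value):
        # Only NULL, strings, and numbers are handled in this sketch.
        if value is None:
            return "NULL"
        if isinstance(value, str):
            return "'{}'".format(value.replace("'", "''"))
        return str(value)

    def _insert_suffix(self, is_kudu_table):
        # The base dialect adds nothing after the VALUES list.
        return ""


class PostgresqlSqlWriter(SqlWriter):
    """Hypothetical PostgreSQL writer for the reference database."""

    def _insert_suffix(self, is_kudu_table):
        # The equivalent of an INSERT into a Kudu table needs this suffix.
        return " ON CONFLICT DO NOTHING" if is_kudu_table else ""
```

A test in this style builds one statement, writes it with both writers,
and compares each result against an expected string.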

When doing a final end-to-end test of the changes, I encountered a
problem I hadn't anticipated (but should have): the Leopard framework no
longer had the ability to unpickle query objects, because the name had
changed. I found a solution on the Python wiki here

https://wiki.python.org/moin/UsingPickle/RenamingModules

and adapted it to my needs.
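
The general idea, sketched here with the find_class hook of Python 3's
pickle.Unpickler, is to translate old names to new ones while loading.
This is an illustration and may differ from the mechanism in
custom_pickle.py. The rename table entry and the load/loads helper
names are assumptions:

```python
import io
import pickle

# Hypothetical rename table: (old module, old name) -> (new module, new name).
RENAMED_CLASSES = {
    ("tests.comparison.query", "Query"):
        ("tests.comparison.query", "SelectQuery"),
}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that resolves renamed classes to their current names."""

    def find_class(self, module, name):
        # Names not in the table pass through unchanged.
        module, name = RENAMED_CLASSES.get((module, name), (module, name))
        return super().find_class(module, name)


def load(file_obj):
    """Drop-in replacement for pickle.load() that applies the renames."""
    return RenamingUnpickler(file_obj).load()


def loads(data):
    """Drop-in replacement for pickle.loads() that applies the renames."""
    return load(io.BytesIO(data))
```

Callers that previously used pickle.load() would call load() from this
module instead, so older pickled reports keep working after the rename.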

Last, I made some flake8 adjustments in a few files where I also changed
the pickling.

Testing:
- all the query generator tests pass
- I can run Leopard front_end.py and load older query generator reports,
  browse them, and re-run failed queries
- I can run Leopard controller.py to actually do a query generator
  run
- discrepancy_searcher.py --explain-only ran for hundreds of queries.
  There were no problems writing the SELECT queries

Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
---
M tests/comparison/leopard/controller.py
A tests/comparison/leopard/custom_pickle.py
M tests/comparison/leopard/front_end.py
M tests/comparison/leopard/job.py
M tests/comparison/leopard/report.py
M tests/comparison/leopard/schedule_item.py
M tests/comparison/model_translator.py
M tests/comparison/query.py
M tests/comparison/query_flattener.py
M tests/comparison/query_generator.py
M tests/comparison/tests/fake_query.py
M tests/comparison/tests/query_object_testdata.py
M tests/comparison/tests/test_query_objects.py
13 files changed, 711 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/5162/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
