Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-3739: Enable stress tests on Kudu
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/4327/1//COMMIT_MSG
Commit Message:

PS1, Line 13: D
> ds
Done


http://gerrit.cloudera.org:8080/#/c/4327/1/testdata/bin/load-tpc-kudu.py
File testdata/bin/load-tpc-kudu.py:

PS1, Line 50: with
> IIRC this syntax breaks on py 2.4, which we shouldn't be using for these te
Hm, I've seen other scripts (e.g. load_nested.py) already using the same
syntax. Maybe Michael has a recommendation here.
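
For context, the `with` statement first appeared in Python 2.5 (behind `from __future__ import with_statement`) and is on by default from 2.6, so it does not parse on 2.4. If 2.4 compatibility ever mattered, a try/finally block is the usual substitute. A minimal sketch, assuming the `with` block only needs to close a resource on exit; `FakeCursor` and `run_statements` are hypothetical names for illustration, not code from the patch:

```python
class FakeCursor(object):
    """Hypothetical stand-in for whatever resource the script opens."""

    def __init__(self):
        self.closed = False
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)

    def close(self):
        self.closed = True


def run_statements(cursor, statements):
    # try/finally closes the cursor even if execute() raises,
    # and it parses on interpreters that predate the `with` statement.
    try:
        for sql in statements:
            cursor.execute(sql)
    finally:
        cursor.close()


cursor = FakeCursor()
run_statements(cursor, ["CREATE DATABASE demo", "USE demo"])
```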


PS1, Line 96: 'tpch', 'tpcds', 'TPCDS', 'TPCH'
> are both cases necessary?
I just added it for usability in case someone decides to specify the workload 
in upper case. Removed.
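
If upper-case input should still be accepted without listing both spellings, argparse can normalize the value before checking it against the choices. A minimal sketch, assuming the quoted list is the `choices` of a workload option; the `-w`/`--workload` flag name and the default are assumptions, not taken from the patch:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag name. argparse applies `type` before the `choices`
    # check, so "TPCDS" is lowered to "tpcds" and then validated.
    parser.add_argument("-w", "--workload", type=str.lower,
                        choices=["tpch", "tpcds"], default="tpch")
    return parser


args = build_parser().parse_args(["--workload", "TPCDS"])
```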


PS1, Line 100:   parser.add_argument("-b", "--buckets", default="9",
             :       help="Number of buckets to partition Kudu tables (only for hash-based).")
> Seems fine for now, but maybe we could have #buckets as a multiple of the #
Left a TODO for now, so we can revisit later depending on how we want to test
this.


http://gerrit.cloudera.org:8080/#/c/4327/1/testdata/datasets/tpcds/tpcds_kudu_template.sql
File testdata/datasets/tpcds/tpcds_kudu_template.sql:

Line 1: ---- Template SQL statements to create and load TPCDS tables in
> can you explain a bit about how you picked the PKs? While we probably need 
Good points. In general, I followed the spec in setting the PK columns. Added a
TODO to have two different variables for buckets: one for fact tables and one
for dimension tables.


PS1, Line 2: KUDU.
> prev line
Done


http://gerrit.cloudera.org:8080/#/c/4327/1/testdata/datasets/tpch/tpch_kudu_template.sql
File testdata/datasets/tpch/tpch_kudu_template.sql:

Line 1: ---- Template SQL statements to create and load TPCH tables in
> remove the tpch tables in tpch_schema_template.sql?
Added a TODO to do this in a follow up patch.


PS1, Line 2: KUDU
> prev line
Done


http://gerrit.cloudera.org:8080/#/c/4327/1/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

PS1, Line 900: engine=''
> I wasn't sure what engine meant until I looked at the usage. I'm wondering 
Yeah, I over-generalized this one. Changed it to something more explicit. Done


PS1, Line 1382:   if not args.tpcds_db and not args.tpch_db and not args.random_db \
              :       and not args.tpch_nested_db and not args.tpch_kudu_db \
              :       and not args.tpcds_kudu_db and not args.query_file_path:
              :     raise Exception("At least one of --tpcds-db, --tpch-db, --tpch-kudu-db,"
              :         "--tpcds-kudu-db, --tpch-nested-db, --random-db, --query-file-path is required")
> Hmm cumbersome... Maybe someone with more python experience knows a better 
Hm, maybe Michael has a suggestion here.
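
One possible way to shorten the check is to keep the flag names in a single list and test them with `any()`, which also lets the error message be built from that list. A minimal sketch, assuming each flag maps to the argparse attribute of the same name with dashes replaced by underscores; `SOURCE_FLAGS`, `flag_to_dest` and `check_query_source` are hypothetical names, not part of concurrent_select.py:

```python
import argparse

# Flags copied from the quoted check; each names a database or a query file.
SOURCE_FLAGS = [
    "--tpcds-db", "--tpch-db", "--tpch-kudu-db", "--tpcds-kudu-db",
    "--tpch-nested-db", "--random-db", "--query-file-path",
]


def flag_to_dest(flag):
    # "--tpch-kudu-db" -> "tpch_kudu_db", matching argparse's default dest.
    return flag.lstrip("-").replace("-", "_")


def check_query_source(args):
    # Raise unless at least one of the source flags was given a value.
    if not any(getattr(args, flag_to_dest(f), None) for f in SOURCE_FLAGS):
        raise Exception(
            "At least one of %s is required" % ", ".join(SOURCE_FLAGS))


parser = argparse.ArgumentParser()
for flag in SOURCE_FLAGS:
    parser.add_argument(flag)

args = parser.parse_args(["--tpch-kudu-db", "tpch_kudu"])
check_query_source(args)  # passes: one source was supplied
```

With this shape, adding another database flag only touches the list, and the separators in the message stay uniform.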


-- 
To view, visit http://gerrit.cloudera.org:8080/4327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-HasComments: Yes
