Alex Behm has posted comments on this change. Change subject: Preview: IMPALA-4467: Add support for CRUD operations in stress test ......................................................................
Patch Set 1: (16 comments) Looks like it's going in the right direction. Most comments are questions. Might be good to outline the flow of a new Kudu stress run so it's easier to follow for reviewers. Maybe in a top-level comment inside concurrent_select.py http://gerrit.cloudera.org:8080/#/c/5093/1/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: Line 680: # Query to run before worst_case_sql to get the worst case performance What's worst_case_sql? Can you give a little more flavor to the comment? I'm not really sure why this setup query is needed. Also consider that the setup query itself may fail/crash which might complicate things. Line 688: self.options = {} needs a comment Line 691: self.modifies_table = False let's introduce a stmt_type enum that can have values like SELECT/INSERT/UPSERT/DELETE/COMPUTE_STATS etc. Line 694: self.MIN_INT = 10 var names are not very telling, what are these numbers? Line 832: raise seems like we intentionally did not raise before? Line 942: LOG.debug("Result set empty for query with id %s", Another possible use of stmt_type Line 1239: try: we can use the stmt_type here to only run explain for those stmts that support it (i.e. anything except compute stats). Line 1356: def clean_up_database(cursor): reset_database? "clean up" sounds like DROP DATABASE CASCADE Line 1357: LOG.info("Cleaning up {0} database".format(cursor.db_name)) needs comment Line 1360: if not table.name.endswith("_original"): seems simpler to move the modified tables into a new database, then you don't need to check "_original" everywhere Line 1371: if len(table.primary_keys) < 1: add comment: only run for Kudu tables which must have a primary key Line 1373: # For now we will not handle the case in a special way where several columns are a Why? Seems easy enough to do. Line 1376: if primary_key.exact_type not in (TinyInt, SmallInt, BigInt): Why this restriction? Line 1383: if table.name + "_original" in set(table.name for table in tables): What's the motivation for this? We might be able to achieve the same thing in a different way. Having a query represent two queries seems somewhat abstraction breaking. Also, for UPSERT it's ok of the keys already exist. I think wen to explicitly also test the path where PKs already exist. Line 1433: drop_query.sql = "DROP STATS {0}".format(table.name) why drop? we're not checking the results anyway Line 1438: compute_query.name = "compute_stats_{0}".format(table.name) can you add the mt_dop option as part of the name? -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <tbobrovyt...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com> Gerrit-Reviewer: Michael Brown <mi...@cloudera.com> Gerrit-HasComments: Yes