Taras Bobrovytsky has posted comments on this change. Change subject: Preview: IMPALA-4467: Add support for CRUD operations in stress test ......................................................................
Patch Set 1: (23 comments) http://gerrit.cloudera.org:8080/#/c/5093/1/tests/stress/concurrent_select.py File tests/stress/concurrent_select.py: Line 680: # Query to run before worst_case_sql to get the worst case performance > What's worst_case_sql? Can you give a little more flavor to the comment? I' I updated the patch significantly (based on our discussion), this is no longer here. Line 688: self.options = {} > needs a comment Done Line 691: self.modifies_table = False > let's introduce a stmt_type enum that can have values like SELECT/INSERT/UP Done Line 694: self.MIN_INT = 10 > var names are not very telling, what are these numbers? This not relevant any more. Removed. PS1, Line 695: self.MIN_INT = 20 > You've defined self.MIN_INT twice. Yeah, like I said it was a rough patch. These numbers are not needed any more. PS1, Line 698: def sql(self): > This whole set of methods could use docstrings to explain to someone roughl This method was removed. PS1, Line 701: return str(randint(10, 20)) > I would like for us to make our randomized tests more repeatable. To that e After talking to Alex, I removed any randomness from the Query class. Now 1 query object always represents a single sql statement. PS1, Line 713: result = re.sub(pattern, "10", self._sql) > Was the hard coding of 10 here meant to actually be self.MIN_INT? Yes exactly, that was the intention. However, this is no longer relevant. Line 832: raise > seems like we intentionally did not raise before? this was a mistake, removed. Line 942: LOG.debug("Result set empty for query with id %s", > Another possible use of stmt_type No longer relevant. Line 1239: try: > we can use the stmt_type here to only run explain for those stmts that supp Done Line 1356: def clean_up_database(cursor): > reset_database? Done Line 1357: LOG.info("Cleaning up {0} database".format(cursor.db_name)) > needs comment Added method docstring. Line 1360: if not table.name.endswith("_original"): > seems simpler to move the modified tables into a new database, then you don I am not sure if it's simpler. It seems to me it's simpler to have everything in a single database. We can maybe discuss this offline. Line 1371: if len(table.primary_keys) < 1: > add comment: only run for Kudu tables which must have a primary key Done Line 1373: # For now we will not handle the case in a special way where several columns are a > Why? Seems easy enough to do. I am not really sure if it's necessary to handle this in some special way. Do you have any suggestions? Line 1376: if primary_key.exact_type not in (TinyInt, SmallInt, BigInt): > Why this restriction? We want to be able to apply the "modulo operation". Added comment. Added comment. PS1, Line 1379: query.modifies_table = False > Why is this False? It's relative to the original state. For example, if we run a query If we have an unmodified lineitem item table, then we run an upsert query on it, it should remain unchanged. Added comment in class. Line 1383: if table.name + "_original" in set(table.name for table in tables): > What's the motivation for this? We might be able to achieve the same thing As discussed in person, I agree that 1 Query object should represent 1 query. Fixed. PS1, Line 1392: def generate_delete_queries(cursor): > Lots of duplicated logic here and in generate_upsert_queries(). I think add Rewrote to eliminate code duplication. Line 1433: drop_query.sql = "DROP STATS {0}".format(table.name) > why drop? we're not checking the results anyway Done Line 1438: compute_query.name = "compute_stats_{0}".format(table.name) > can you add the mt_dop option as part of the name? Done PS1, Line 1664: if query._sql in queries_with_runtime_info_by_db_and_sql[query.db_name]: > It doesn't seem right to be examining a private attribute from the outside. Yeah, I agree, this was a rough patch. Fixed now. -- To view, visit http://gerrit.cloudera.org:8080/5093 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <tbobrovyt...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com> Gerrit-Reviewer: Michael Brown <mi...@cloudera.com> Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com> Gerrit-HasComments: Yes