Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/10116 )
Change subject: IMPALA-6131: Track time of last statistics update in metadata ...................................................................... Patch Set 10: (8 comments) http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1235 PS10, Line 1235: * TODO: the following paragraph seems at least partially obsolate > Why not clean it up then? Point (1) is obsolete but point (2) is still vali Done http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java File fe/src/main/java/org/apache/impala/catalog/KuduTable.java: http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@244 PS10, Line 244: Long.toString(System.currentTimeMillis() / 1000)); > Doesn't the HMS set this in alter_table()? It sets it if the property does not exist, but this would not work well for Kudu tables: after calling alter_table here, the table will not be loaded from HMS again (unlike in HDFS tables, where this is done to ensure HMS-Impala consistency). This means that if I would remove "transient_lastDdlTime" here, then HMS would calculate a valid value, but Impala catalogd would not know about this value. http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@785 PS10, Line 785: msTbl.putToParameters("impala.lastComputeStatsTime", > Create a constant for the table property in Table similar to what we have i I have put the constant to HdfsTable, because all the other property keys reside their, even those that are used by Kudu tables too. http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868 PS10, Line 2868: // HMS updates transient_lastDdlTime if it is removed. > I understand what this comment says, but I don't understand it's relevance This comment remained here by mistake. http://gerrit.cloudera.org:8080/#/c/10116/10/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2870 PS10, Line 2870: Long.toString(System.currentTimeMillis() / 1000)); > We're doing "System.currentTimeMillis() / 1000" in quite a few places, I su I have created a bit different helper function. http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py File tests/metadata/test_last_ddl_time_update.py: http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@74 PS10, Line 74: def run_test(self, query, expect_changed_ddl_time, expect_changed_stats_time): > Passing a ";" delimited string is really weird. Why not pass a list of quer The "invalidate metadata %(TBL)s; describe %(TBL)s" combo was used the check that in case of Kudu tables, lastDdlTime is not increased by reloading the table (it used to increase it), so an extra query is needed after INVALIDATE METADATA to load the table. The two can be separated (like for other init queries) but I felt that they belong together. I replaced the string splitting with variable argument list - I can replace this with two separate calls if needed. http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@93 PS10, Line 93: # Hive uses a seconds granularity on the last ddl time. > Not just Hive - we do too. Just say "All times are stored in seconds" or so Isn't it an HMS convention in this case? Or is there a reason behind not using higher precision timestamp? http://gerrit.cloudera.org:8080/#/c/10116/10/tests/metadata/test_last_ddl_time_update.py@155 PS10, Line 155: h.expect_no_time_change("drop incremental stats %(TBL)s partition (j=1, s='2012')") > use "drop stats" instead to wipe everything (including incremental stats) Are you sure about this? I use DROP INCREMENTAL STATS on purpose to check +1 statement's effect in the test suite. -- To view, visit http://gerrit.cloudera.org:8080/10116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5 Gerrit-Change-Number: 10116 Gerrit-PatchSet: 10 Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Lars Volker <l...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Fri, 11 May 2018 17:26:43 +0000 Gerrit-HasComments: Yes