Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9697 )
Change subject: IMPALA-6670: refresh lib-cache entries from plan (WIP) ...................................................................... Patch Set 5: (8 comments) I'm not very familiar with the BE libcache, still looking at at. http://gerrit.cloudera.org:8080/#/c/9697/5/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/9697/5/be/src/codegen/llvm-codegen.h@640 PS5, Line 640: /// ensure that the cached hdfs_file that's used is the most recent. We should document the behavior of special arg values like -1 or LONG_MAX http://gerrit.cloudera.org:8080/#/c/9697/5/be/src/codegen/llvm-codegen.h@642 PS5, Line 642: const std::string& hdfs_file, const time_t last_modified_time); Why not time_t mtime? I think that's equally expressive since mtime is a well-known term. http://gerrit.cloudera.org:8080/#/c/9697/5/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/9697/5/be/src/codegen/llvm-codegen.cc@801 PS5, Line 801: time_t mtime = Is it valid for the last_modified_time to not be set here? http://gerrit.cloudera.org:8080/#/c/9697/5/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/9697/5/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@606 PS5, Line 606: fn_.UpdateLastModifiedTime(); Would be good to have a FE unit test: * built-in agg/analytic/scalar functions should not have this set * user-defined agg and scalar functions should have this set Why are we doing this in analysis and not in toThrift()? An expr may be analyzed many times. http://gerrit.cloudera.org:8080/#/c/9697/5/fe/src/main/java/org/apache/impala/catalog/Function.java File fe/src/main/java/org/apache/impala/catalog/Function.java: http://gerrit.cloudera.org:8080/#/c/9697/5/fe/src/main/java/org/apache/impala/catalog/Function.java@178 PS5, Line 178: protected void setLastModifiedTime(long t) { lastModifiedTime_ = t; } I don't think we should modify CatalogObjects during query analysis in impalads because those are shared among queries. We can have several queries analyzing the same expression, so getting/setting the mtime will be racy. Instead, I think we should get and set the mtime in Expr.treeToThriftHelper() There's still a chance of intra-query inconsistency, but at least there can't be a race between different queries. http://gerrit.cloudera.org:8080/#/c/9697/5/fe/src/main/java/org/apache/impala/catalog/Function.java@397 PS5, Line 397: public final void UpdateLastModifiedTime() { confusing collection of function names that need cleanup: getLastModifiedTime() setLastModifiedTime() UpdateLastModifiedTime() GetLastModifiedTime() http://gerrit.cloudera.org:8080/#/c/9697/5/tests/custom_cluster/test_coordinators.py File tests/custom_cluster/test_coordinators.py: http://gerrit.cloudera.org:8080/#/c/9697/5/tests/custom_cluster/test_coordinators.py@133 PS5, Line 133: db_name = 'TEST_EXEC_ONLY_CACHE' use unique_database to have it cleaned up automatically http://gerrit.cloudera.org:8080/#/c/9697/5/tests/custom_cluster/test_coordinators.py@203 PS5, Line 203: self.execute_query_expect_success(client, might never happen if the test fails, use unique_database -- To view, visit http://gerrit.cloudera.org:8080/9697 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icf740ea8c6a47e671427d30b4d139cb8507b7ff6 Gerrit-Change-Number: 9697 Gerrit-PatchSet: 5 Gerrit-Owner: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Comment-Date: Tue, 20 Mar 2018 05:10:32 +0000 Gerrit-HasComments: Yes