Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12987 )

Change subject: IMPALA-8341: Data cache for remote reads
......................................................................


Patch Set 5:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache-test.cc@273
PS5, Line 273:   FLAGS_data_cache_file_max_size = 1024 * 1024;
I just found out we have ScopedFlagSetter in scoped-flag-setter.h; I think it fits here and in the other tests.


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.h@215
PS5, Line 215: too_many_files
'start_reclaim'?


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.h@337
PS5, Line 337:   std::unique_ptr<ThreadPool<int>> file_deleter_pool_;
Can you mention in the comment that the pool has only 1 thread and why you're using a pool? I think it's because the pool makes handling the thread's lifetime easier, but I'm not sure that's correct.


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.h@341
PS5, Line 341:   void CloseOldFiles(uint32_t thread_id, int partition_idx);
Some functions around deleting files are called "Close...". We should point out in the comments somewhere that closing now also deletes. We could also rename the thread pool to file_closing_pool or rename the methods to "DeleteOldFiles" for consistency. I think I prefer the latter, since deletion implies closing, but the converse is not obvious.


http://gerrit.cloudera.org:8080/#/c/12987/4/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/12987/4/be/src/runtime/io/data-cache.cc@72
PS4, Line 72:     "(Advanced) Enable checksumming for the cached buffer.");
> This is actually a static class member of DataCache.
Sorry for missing that.
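The ScopedFlagSetter suggestion for data-cache-test.cc line 273 above amounts to overriding a flag for the duration of one test and restoring it afterwards. A minimal sketch of that pattern follows. It is not Impala's ScopedFlagSetter: the class name ScopedFlagOverride, the stand-in flag variable, and its initial value are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a gflags-defined flag. The initial value is arbitrary and is
// not claimed to be Impala's default.
static int64_t FLAGS_data_cache_file_max_size = 1LL << 40;

// Hypothetical RAII helper: overrides a flag on construction and restores the
// previous value on destruction, even if the test returns early.
template <typename T>
class ScopedFlagOverride {
 public:
  ScopedFlagOverride(T* flag, T new_value) : flag_(flag), old_value_(*flag) {
    *flag_ = new_value;
  }
  ~ScopedFlagOverride() { *flag_ = old_value_; }
  ScopedFlagOverride(const ScopedFlagOverride&) = delete;
  ScopedFlagOverride& operator=(const ScopedFlagOverride&) = delete;

 private:
  T* flag_;      // Flag being overridden; not owned.
  T old_value_;  // Value to restore when the guard goes out of scope.
};

// Mimics a test body: the override is visible only inside this function.
int64_t FileMaxSizeInsideTest() {
  ScopedFlagOverride<int64_t> guard(&FLAGS_data_cache_file_max_size, 1024 * 1024);
  assert(FLAGS_data_cache_file_max_size == 1024 * 1024);
  return FLAGS_data_cache_file_max_size;
}
```

Compared with assigning to the flag directly, a guard like this keeps one test's setting from leaking into tests that run later in the same process.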


http://gerrit.cloudera.org:8080/#/c/12987/4/be/src/runtime/io/data-cache.cc@187
PS4, Line 187:   inline
> Not sure which one you are referring to ? Isn't it in #include "common/name
Yeah, I think we commonly omit the explicit include for vector


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@95
PS5, Line 95: file deleter thread
switch to single thread, or mention pool here


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@112
PS5, Line 112: RetireFile
Can we call this DeleteFile? Otherwise there's a third thing to keep track of (Close, Delete, Retire) and the differences are subtle. I feel it's clear enough that DeleteFile would make sure it's closed.


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@125
PS5, Line 125: percpu_rwlock
It's not obvious to me why we only need a percpu_rwlock here. Can you add a comment?


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@208
PS5, Line 208: holes
nit: singular


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@335
PS5, Line 335: CloseAndVerifyFileSizes
Similar to other comments, I'd call this "VerifySizeAndDeleteFiles", I think that captures well what's going on and the caller can expect the files to get closed. I don't feel strongly about that one though.


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@395
PS5, Line 395:   meta_cache_->Erase(key);
Will this handle hole punching through the eviction logic?


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@436
PS5, Line 436:   VLOG(2) << Substitute("Storing file $0 offset $1 len $2 checksum $3 ",
nit: only append the "checksum $3" part if checksumming is enabled? I don't feel strongly about it though.
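The nit on data-cache.cc line 436 above could be addressed by building the message in two steps. The sketch below only illustrates that idea with the standard library; the function name FormatStoreMessage and its parameters are hypothetical and do not come from the patch, which uses VLOG and Substitute.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: formats the "Storing file" log line and appends the
// checksum only when checksumming is enabled.
std::string FormatStoreMessage(const std::string& filename, int64_t offset,
                               int64_t len, bool checksum_enabled,
                               uint64_t checksum) {
  std::ostringstream ss;
  ss << "Storing file " << filename << " offset " << offset << " len " << len;
  if (checksum_enabled) ss << " checksum " << checksum;
  return ss.str();
}

// Exercises both variants of the message.
void CheckStoreMessages() {
  assert(FormatStoreMessage("f", 0, 10, false, 7) ==
         "Storing file f offset 0 len 10");
  assert(FormatStoreMessage("f", 0, 10, true, 7) ==
         "Storing file f offset 0 len 10 checksum 7");
}
```

Whether the extra branch is worth it is a judgment call, which matches the reviewer's note that he does not feel strongly about it.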


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@457
PS5, Line 457: too_many_files
start_reclaim?


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/data-cache.cc@633
PS5, Line 633: too_many_files
start_reclaim?


http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/hdfs-file-reader.cc
File be/src/runtime/io/hdfs-file-reader.cc:

http://gerrit.cloudera.org:8080/#/c/12987/5/be/src/runtime/io/hdfs-file-reader.cc@37
PS5, Line 37:
nit: trailing space


http://gerrit.cloudera.org:8080/#/c/12987/5/tests/custom_cluster/test_data_cache.py
File tests/custom_cluster/test_data_cache.py:

http://gerrit.cloudera.org:8080/#/c/12987/5/tests/custom_cluster/test_data_cache.py@23
PS5, Line 23: cache hit and miss counts
            : in the runtime profile are as expected.
It actually seems to check the metrics, not the profile counters.



--
To view, visit http://gerrit.cloudera.org:8080/12987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I734803c1c1787c858dc3ffa0a2c0e33e77b12edc
Gerrit-Change-Number: 12987
Gerrit-PatchSet: 5
Gerrit-Owner: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Mon, 29 Apr 2019 21:39:59 +0000
Gerrit-HasComments: Yes