Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18191 )
Change subject: IMPALA-9433: Improved caching of HdfsFileHandles ...................................................................... Patch Set 21: (3 comments) http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.h File be/src/runtime/io/handle-cache.h: http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.h@118 PS21, Line 118: return file_handle.mtime() == mtime; The semantics of mtime is unclear to me - why do we handle it separately and not as the part of the key? My understanding is that we always do lookups with exact file name and mtime and they remain constant for the lifetime of the handle. http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h File be/src/util/lru-multi-cache.h: http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h@78 PS21, Line 78: /// this is a intrusive (as known as not owning) list, used for eviction nit: an http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h File be/src/util/lru-multi-cache.inline.h: http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@197 PS21, Line 197: p_value_internal->timestamp_seconds = MonotonicSeconds() I think that we didn't refresh timestamp_second in the old version, so sooner or later we closed handles, even if it was used again and again. This can influence what happens if the file was deleted - keeping a file handle always open means the replica on the given data node cannot be deleted. https://www.quora.com/What-happens-if-you-attempt-to-delete-a-file-in-HDFS-while-its-being-used-for-a-job -- To view, visit http://gerrit.cloudera.org:8080/18191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a Gerrit-Change-Number: 18191 Gerrit-PatchSet: 21 Gerrit-Owner: Gergely Fürnstáhl <gfurnst...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Thu, 17 Feb 2022 19:03:12 +0000 Gerrit-HasComments: Yes