Gergely Fürnstáhl has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18191 )

Change subject: IMPALA-9433: Improved caching of HdfsFileHandles
......................................................................


Patch Set 21:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.inline.h@167
PS21, Line 167:   // Opening a file handle requires talking to the NameNode so 
it can take some time.
              :   RETURN_IF_ERROR(accessor_tmp.Get()->Init(hdfs_monitor_));
> Ok, great!
Done


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/runtime/io/handle-cache.inline.h@167
PS21, Line 167:   // Opening a file handle requires talking to the NameNode so 
it can take some time.
              :   RETURN_IF_ERROR(accessor_tmp.Get()->Init(hdfs_monitor_));
> Ok, great!
Done


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h
File be/src/util/lru-multi-cache.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h@170
PS21, Line 170:     Container_internal& container;
              :     typename Container_internal::iterator it;
> I was thinking about whether there is a way to avoid having the element kee
I've considered moving this to the accessor, but decided to keep it here to 
spare constructions every time an accessor is created.

I like the idea of having iterator instead of pointer in the accessor, it would 
make it clear it's not something to delete. However, I can't see a clear 
solution for empty Accessor yet:
  - if we want to default initialize the iterator, we would need access to the 
owning container too
  - creating a dummy empty Container in the cache is possible, but feels more 
hacky than this way

In an earlier patchset, I used a unique_ptr with custom deleter 
(cache->Release()). That's still a possibility, making the default destructor 
good enough, however FileHandleCache::Accessor calls explicit release/destroy 
anyways to handle metrics, so I decided to remove it.


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.h@252
PS21, Line 252: ValueType_internal* p_value_internal_;
> Consider making this an iterator to the std::list element
I did, explained it on line 171


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h
File be/src/util/lru-multi-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@147
PS21, Line 147:       // Move the object to the back of the owning list as it 
is no longer available
              :       container.splice(container.end(), container, 
container_it);
> Ok, that makes sense.
Done


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@175
PS21, Line 175: container.emplace_back((*this), stored_key, container, 
std::forward<Args>(args)...);
> If you wanted to get an iterator back, the regular emplace() call returns a
Thanks, a bit cleaner this way


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@201
PS21, Line 201:   // Move the object to the front, keep LRU relation in owning 
list too to
              :   // be able to age out unused objects
              :   container.splice(container.begin(), container, 
p_value_internal->it);
> Basically, once you decide to move the in_use=true to the end of the list i
Yes, exactly


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@222
PS21, Line 222:   if (container.size() == 1) {
              :     // Last element, owning list can be removed to prevent aging
              :     hash_table_.erase(p_value_internal->key);
              :   } else {
              :     // Remove from owning list
              :     container.erase(p_value_internal->it);
              :   }
> Ok, so can we add a DCHECK that this is in use? Maybe also add a comment th
Done


http://gerrit.cloudera.org:8080/#/c/18191/21/be/src/util/lru-multi-cache.inline.h@251
PS21, Line 251:   // Remove from LRU cache
              :   value_internal.member_hook.unlink();
> Nit: Can we add a DCHECK that this is not in use?
Done



--
To view, visit http://gerrit.cloudera.org:8080/18191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a
Gerrit-Change-Number: 18191
Gerrit-PatchSet: 21
Gerrit-Owner: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 24 Feb 2022 13:03:02 +0000
Gerrit-HasComments: Yes

Reply via email to