Ádám Szita created HIVE-26432:
---------------------------------
Summary: Improve LlapCacheAwareFs by caching file status information
Key: HIVE-26432
URL: https://issues.apache.org/jira/browse/HIVE-26432
Project: Hive
Issue Type: Improvement
Reporter: Ádám Szita
Assignee: Ádám Szita
LlapCacheAwareFs currently wraps the InputStreams used to read non-ORC file
formats when LLAP caching is enabled. File content is cached by the calculated
file ID and the requested offsets within the file, and later reads are served
from the cache. However, because LlapCacheAwareFs acts as a FileSystem, it
sometimes receives listStatus / getFileStatus calls too, and these are simply
proxied to the original FS. If such an operation is slow on the original FS,
e.g. listing on S3, performance suffers. (This is not the case for ORC, whose
integration with the LLAP cache does not act as a FS.)
I propose that we cache the file status information too, in addition to the
content.
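
To illustrate the proposal, here is a minimal sketch of a status cache placed in
front of a slow lookup. It is not the actual LlapCacheAwareFs code: the class
name FileStatusCache, the String path key, the generic status type S, and the
time-based expiry are all assumptions made for the example. In Hive the delegate
would presumably be the wrapped FileSystem's getFileStatus call, and cached
entries would need invalidation whenever a file can change.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/**
 * Hypothetical sketch: remembers the result of a per-path status lookup so that
 * repeated calls do not reach the (possibly slow) underlying file system.
 */
final class FileStatusCache<S> {

    /** A cached status together with the time it was loaded. */
    private static final class Entry<S> {
        final S status;
        final long loadedAtNanos;

        Entry(S status, long loadedAtNanos) {
            this.status = status;
            this.loadedAtNanos = loadedAtNanos;
        }
    }

    private final Map<String, Entry<S>> entries = new ConcurrentHashMap<>();
    private final Function<String, S> delegate; // stands in for the original FS lookup
    private final long ttlNanos;                // assumed expiry policy, not from the issue

    FileStatusCache(Function<String, S> delegate, long ttlMillis) {
        this.delegate = delegate;
        this.ttlNanos = TimeUnit.MILLISECONDS.toNanos(ttlMillis);
    }

    /** Returns the cached status if it is still fresh, otherwise asks the delegate. */
    S getFileStatus(String path) {
        long now = System.nanoTime();
        Entry<S> cached = entries.get(path);
        if (cached != null && now - cached.loadedAtNanos < ttlNanos) {
            return cached.status;
        }
        // Concurrent callers may both miss and both call the delegate; that is
        // harmless here because the last write simply wins.
        S fresh = delegate.apply(path);
        entries.put(path, new Entry<>(fresh, now));
        return fresh;
    }

    /** Drops a cached entry, e.g. after the file is known to have changed. */
    void invalidate(String path) {
        entries.remove(path);
    }
}
```

With a cache like this, only the first getFileStatus call for a given path pays
the cost of the remote lookup until the entry expires or is invalidated. The
same idea could be applied to listStatus results keyed by directory path.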