Ádám Szita created HIVE-26432:
---------------------------------

             Summary: Improve LlapCacheAwareFs by caching file status information
                 Key: HIVE-26432
                 URL: https://issues.apache.org/jira/browse/HIVE-26432
             Project: Hive
          Issue Type: Improvement
            Reporter: Ádám Szita
            Assignee: Ádám Szita


LlapCacheAwareFs currently wraps the InputStreams used to read non-ORC file 
formats when LLAP caching is enabled.

File content is cached by the calculated file ID and the requested offsets 
within the file, and later reads are served from the cache. However, because 
LlapCacheAwareFs acts as a FileSystem, it sometimes also receives listStatus / 
getFileStatus calls, which are simply proxied to the original FS. If such an 
operation is slow on the original FS, e.g. listing on S3, performance suffers. 
(This is not the case for the ORC integration with the LLAP cache, as that does 
not act as a FS.)

I propose that we cache the file status information as well, in addition to 
the content.
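
To illustrate the idea, here is a minimal sketch of a status cache placed in 
front of a slow lookup. It is not the actual LlapCacheAwareFs code: the class 
name FileStatusCache, the String path keys and the generic status type S are 
all hypothetical, chosen so the example runs without Hadoop on the classpath. 
In a real integration the delegate would presumably be the wrapped 
FileSystem's getFileStatus(Path), and eviction and staleness handling would 
need to be designed separately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical sketch: memoizes per-path status lookups so that repeated
 * getFileStatus-style calls do not reach the (possibly slow) underlying FS.
 */
class FileStatusCache<S> {
    // Path string -> cached status object (assumed immutable for this sketch).
    private final Map<String, S> cache = new ConcurrentHashMap<>();
    // The slow lookup being wrapped, e.g. a call to the original file system.
    private final Function<String, S> delegate;

    FileStatusCache(Function<String, S> delegate) {
        this.delegate = delegate;
    }

    /** Returns the cached status, calling the delegate only on a miss. */
    S getFileStatus(String path) {
        // A null result from the delegate is not cached by computeIfAbsent.
        return cache.computeIfAbsent(path, delegate);
    }

    /** Drops a cached entry, e.g. after the file is known to have changed. */
    void invalidate(String path) {
        cache.remove(path);
    }
}
```

With this shape, a second lookup of the same path is answered from memory, 
and the underlying FS is consulted again only after invalidate(path).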



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
