[ 
https://issues.apache.org/jira/browse/HIVE-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-27884:
--------------------------------
    Description: 
Originally, when the task runner was added in HIVE-10028 ([here|https://github.com/apache/hive/blob/23f40bd88043db3cb4efe3a763cbfd5c01a81d2f/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L202]), the FileSystem.closeAllForUGI call was commented out for some reason; later, in the scope of HIVE-9898, it was simply added back, [here|https://github.com/apache/hive/commit/91c46a44dd9fbb68d01f22e93c4ce0931a4598e0#diff-270dbe6639879ca543ae21c44a239af6145390726d45fee832be809894bfc88eR236].

A FileSystem.close call basically does the [following|https://github.com/apache/hadoop/blob/0c10bab7bb77aa4ea3ca26c899ab28131561e052/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2700-L2710]:
1. deletes all paths that were marked as delete-on-exit
2. removes the instance from the FileSystem cache
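For reference, this is roughly what the linked Hadoop code does (a simplified paraphrase, not the exact source):
{code:java}
// simplified paraphrase of org.apache.hadoop.fs.FileSystem#close
public void close() throws IOException {
  try {
    // 1. delete all paths that were registered via deleteOnExit()
    processDeleteOnExit();
  } finally {
    // 2. evict this instance from the shared FileSystem cache
    CACHE.remove(this.key, this);
  }
}
{code}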

I saw that we 
[call|https://github.com/apache/hive/blob/eb6f0b0c57dd55335927b7dde08cd47f4d00e74d/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java#L302]
{code:java}
FileSystem.closeAllForUGI
{code}
at the end of every task attempt, so we almost completely disable Hadoop's filesystem cache during a long-running LLAP daemon's lifecycle.
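A minimal sketch of the problematic pattern (variable names such as taskUgi are illustrative, not the exact Hive code):
{code:java}
// end of a task attempt in the LLAP daemon (simplified sketch)
try {
  // ... run the task attempt under taskUgi ...
} finally {
  // closes and evicts every cached FileSystem instance created for this UGI,
  // so the next task attempt has to rebuild them from scratch
  FileSystem.closeAllForUGI(taskUgi);
}
{code}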

Some investigations on Azure showed that creating a filesystem can be quite expensive, as it involves the recreation of a whole object hierarchy like:
{code:java}
AzureBlobFileSystem -> AzureBlobFileSystemStore -> AbfsClient -> TokenProvider (MsiTokenProvider)
{code}
which ends up pinging the Azure token auth endpoint, leading to e.g. an HTTP 429 response.
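To illustrate the cost (a hedged sketch, not Hive code; the abfs URI below is a made-up example): once closeAllForUGI has evicted the cached instance, the next FileSystem.get for the same URI and UGI has to rebuild the whole ABFS stack, which with MSI auth means another round trip to the token endpoint.
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class AbfsCacheMissExample {
  public static void main(String[] args) throws Exception {
    // FileSystem.get() consults the shared cache keyed by (scheme, authority, UGI).
    // After closeAllForUGI the entry is gone, so a brand-new AzureBlobFileSystem
    // (with its AzureBlobFileSystemStore, AbfsClient and MsiTokenProvider) is
    // constructed, and the MSI token provider fetches a fresh token again.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(
        new URI("abfs://container@account.dfs.core.windows.net/"), conf);
    System.out.println("Created " + fs.getClass().getSimpleName());
  }
}
{code}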

The other area that's really affected by this patch is the AWS SDK v2, where we discovered a performance degradation (GitHub issues also imply this problem), see:
!Screenshot 2024-07-30 at 10.18.13.png|width=808,height=425!

This screenshot is just for reference; it doesn't prove the perf degradation on its own (that was only visible with wall clock profiling).

Additionally, regarding deleteOnExit, please refer to the notes below.

We need to check whether we can remove this closeAllForUGI call in LLAP, and additionally check and remove all deleteOnExit calls that belong to Hadoop FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit calls):
{code:java}
grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
...
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:        // in recent hadoop versions, use deleteOnExit to clean tmp files.
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:        autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:        fileSystem.deleteOnExit(dir);
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java:      parentDir.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java:      tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:        parentDir.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:        tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:        tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:        autoDelete = fs.deleteOnExit(outPath);
{code}
I believe deleteOnExit is fine if we don't want to bother with removing temp files; however, these deletions might need to move to a more Hive-specific scope if we really want to reuse cached filesystems in a safe manner.
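
A minimal sketch of what such a more Hive-specific scope could look like (illustrative only; ScopedTmpPathCleaner is a made-up name, not an existing Hive class): instead of registering temp paths with fs.deleteOnExit, which is only honored when the FileSystem is closed, the operator tracks its temp paths and deletes them explicitly when it finishes, so the cached FileSystem never needs to be closed.
{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper, not an existing Hive class: tracks temp paths per
// task/operator and removes them explicitly, instead of relying on
// FileSystem.deleteOnExit + FileSystem.close.
public class ScopedTmpPathCleaner implements AutoCloseable {
  private final FileSystem fs;
  private final List<Path> tmpPaths = new ArrayList<>();

  public ScopedTmpPathCleaner(FileSystem fs) {
    this.fs = fs;
  }

  public Path track(Path tmpPath) {
    tmpPaths.add(tmpPath);
    return tmpPath;
  }

  @Override
  public void close() throws IOException {
    for (Path p : tmpPaths) {
      // best-effort recursive delete; the cached FileSystem stays open
      fs.delete(p, true);
    }
    tmpPaths.clear();
  }
}
{code}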

> LLAP: Reuse FileSystem objects from cache across different tasks in the same 
> LLAP daemon
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-27884
>                 URL: https://issues.apache.org/jira/browse/HIVE-27884
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: Screenshot 2024-07-30 at 10.18.13.png
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
