[jira] [Commented] (SPARK-41599) Memory leak in FileSystem.CACHE when submitting apps to secure cluster using InProcessLauncher

Xieming Li (Jira) Wed, 21 Jun 2023 07:12:05 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-41599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17735743#comment-17735743
 ]


Xieming Li commented on SPARK-41599:
------------------------------------

Hi [~ste...@apache.org], thank you very much for your advice.

After reviewing the code, I think the following PR should fix the issue.

Could you please take a look at it when you have time?

[https://github.com/apache/spark/pull/41692]

 

> Memory leak in FileSystem.CACHE when submitting apps to secure cluster using 
> InProcessLauncher
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-41599
>                 URL: https://issues.apache.org/jira/browse/SPARK-41599
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, YARN
>    Affects Versions: 3.1.2
>            Reporter: Maciej Smolenski
>            Priority: Major
>         Attachments: InProcLaunchFsIssue.scala, 
> SPARK-41599-fixes-to-limit-FileSystem-CACHE-size-when-using-InProcessLauncher.diff
>
>
> When a Spark application is submitted in a Kerberos environment, the 
> credentials of the 'current user' (UserGroupInformation.getCurrentUser()) 
> are modified.
> FileSystem.CACHE entries contain the 'current user' (with user credentials) 
> as part of the key.
> Submitting many Spark applications using InProcessLauncher therefore causes 
> FileSystem.CACHE to keep growing.
> Eventually the process exits with an OutOfMemoryError.
> Code for reproduction is attached.
>  
> Output from running 'jmap -histo' on the reproduction JVM shows that the 
> number of FileSystem$Cache$Key instances increases over time:
> time (epoch seconds): #instances class
> 1671533274: 2 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533335: 11 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533395: 21 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533455: 30 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533515: 39 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533576: 48 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533636: 57 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533696: 66 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533757: 75 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533817: 84 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533877: 93 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533937: 102 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671533998: 111 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534058: 120 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534118: 135 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534178: 140 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534239: 150 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534299: 159 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534359: 168 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534419: 177 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534480: 186 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534540: 195 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534600: 204 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534661: 213 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534721: 222 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534781: 231 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534841: 240 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534902: 249 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671534962: 257 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671535022: 264 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671535083: 273 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671535143: 282 org.apache.hadoop.fs.FileSystem$Cache$Key
> 1671535203: 291 org.apache.hadoop.fs.FileSystem$Cache$Key
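
The growth pattern in the quoted issue can be illustrated with a small, self-contained model. This is only a sketch, not Hadoop's or Spark's code: `User`, `Key`, `CACHE`, `get`, `evictFor`, and `submitMany` are hypothetical stand-ins, and it assumes that each submission ends up with a user object that the cache treats as unequal to all earlier ones (modelled here with identity-based equality). Whether the linked PR uses the eviction approach shown here is not stated in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class FsCacheGrowthSketch {

    /** Hypothetical stand-in for the 'current user' object. */
    static final class User {
        final String name;
        User(String name) { this.name = name; }
        // equals/hashCode are deliberately not overridden: two User objects
        // with the same name are different keys (the assumption being modelled).
    }

    /** Hypothetical stand-in for a cache key of (scheme, authority, user). */
    static final class Key {
        final String scheme;
        final String authority;
        final User user;

        Key(String scheme, String authority, User user) {
            this.scheme = scheme;
            this.authority = authority;
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            // The user is compared by identity, not by name.
            return scheme.equals(k.scheme)
                && authority.equals(k.authority)
                && user == k.user;
        }

        @Override
        public int hashCode() {
            return Objects.hash(scheme, authority, System.identityHashCode(user));
        }
    }

    /** Process-wide cache, analogous in spirit to a static file system cache. */
    static final Map<Key, Object> CACHE = new HashMap<>();

    /** Returns the cached object for the key, creating it on a miss. */
    static Object get(String scheme, String authority, User user) {
        return CACHE.computeIfAbsent(new Key(scheme, authority, user), k -> new Object());
    }

    /** Drops every entry that belongs to the given user object. */
    static void evictFor(User user) {
        CACHE.keySet().removeIf(k -> k.user == user);
    }

    /** Simulates n in-process submissions and returns the final cache size. */
    static int submitMany(int n, boolean evictAfterEach) {
        CACHE.clear();
        for (int i = 0; i < n; i++) {
            User u = new User("spark");        // fresh user object per submission
            get("hdfs", "namenode:8020", u);   // miss: adds one entry
            get("hdfs", "namenode:8020", u);   // hit: same user object
            if (evictAfterEach) {
                evictFor(u);
            }
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("without eviction: " + submitMany(100, false));
        System.out.println("with eviction:    " + submitMany(100, true));
    }
}
```

Under these assumptions, the run without eviction leaves one entry per submission, while the run with eviction leaves none. In real Hadoop code, `FileSystem.closeAllForUGI(ugi)` is the existing call that closes and drops the cached file systems for a given user.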



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

