[ 
https://issues.apache.org/jira/browse/FLINK-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-9626:
----------------------------------
    Description: 
There is a potential resource leak in 
org.apache.flink.core.fs.FileSystem#getUnguardedFileSystem.

It contains the following code:

 
{code:java}
// this "default" initialization makes sure that the FileSystem class works
// even when not configured with an explicit Flink configuration, like on
// JobManager or TaskManager setup
if (FS_FACTORIES.isEmpty()) {
   initialize(new Configuration());
}

{code}
which is executed on each cache miss. However, the initialize method also 
calls

 

 
{code:java}
CACHE.clear();
{code}
without closing the file systems held in CACHE (this could be problematic for 
HadoopFileSystem, which is a wrapper around org.apache.hadoop.fs.FileSystem, 
which is closable).

Now if, for example, we are constantly accessing two different file systems 
(file systems are differentiated by the combination of the [scheme and 
authority|https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax]
 parts of the file system's URI) initialized from FALLBACK_FACTORY, each call 
to getUnguardedFileSystem for one of them will clear the CACHE entry for the 
other one. Thus we will constantly be creating new FileSystems without closing 
them.
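
The scenario above can be reproduced with a small stand-alone model. This is 
only a sketch, not Flink's actual code: FakeFs, simulate and the string keys 
are hypothetical names, and it assumes FS_FACTORIES stays empty so that every 
cache miss triggers initialize.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheLeakSketch {

    /** Hypothetical stand-in for a closable file system such as HadoopFileSystem. */
    static final class FakeFs implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeFs() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Assumption: no factories are ever registered, so this map stays empty.
    static final Map<String, Object> FS_FACTORIES = new HashMap<>();
    static final Map<String, FakeFs> CACHE = new HashMap<>();

    static void initialize() {
        // Drops the cached instances without closing them.
        CACHE.clear();
    }

    static FakeFs getUnguardedFileSystem(String schemeAndAuthority) {
        FakeFs cached = CACHE.get(schemeAndAuthority);
        if (cached != null) {
            return cached;
        }
        // Cache miss: the "default" initialization runs again and wipes CACHE.
        if (FS_FACTORIES.isEmpty()) {
            initialize();
        }
        FakeFs fs = new FakeFs(); // stands in for the FALLBACK_FACTORY
        CACHE.put(schemeAndAuthority, fs);
        return fs;
    }

    /** Alternates between two file systems and returns the number of unclosed instances. */
    static int simulate(int rounds) {
        CACHE.clear();
        FakeFs.OPEN.set(0);
        for (int i = 0; i < rounds; i++) {
            getUnguardedFileSystem("hdfs://cluster-a");
            getUnguardedFileSystem("hdfs://cluster-b");
        }
        return FakeFs.OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println("unclosed instances after 5 rounds: " + simulate(5));
    }
}
```

In this model every call is a cache miss, so 5 rounds leave 10 unclosed 
instances while CACHE only ever holds one entry.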

A solution could be either to not clear the CACHE or to make sure that the 
FileSystems are properly closed.
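
A minimal sketch of the second option (closing before clearing) is given 
here, again using a hypothetical FakeFs instead of the real FileSystem 
classes. The names clearCacheSafely and demo are illustrative assumptions, 
not existing Flink methods.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class CloseBeforeClearSketch {

    /** Hypothetical closable file system that counts its open instances. */
    static final class FakeFs implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        FakeFs() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    static final Map<String, FakeFs> CACHE = new HashMap<>();

    /** Closes every cached instance before dropping it from the cache. */
    static void clearCacheSafely() {
        for (FakeFs fs : CACHE.values()) {
            try {
                fs.close();
            } catch (Exception e) {
                // A real implementation would log this; keep closing the rest.
            }
        }
        CACHE.clear();
    }

    /** Fills the cache with two instances, clears it, and returns the open count. */
    static int demo() {
        CACHE.clear();
        FakeFs.OPEN.set(0);
        CACHE.put("hdfs://cluster-a", new FakeFs());
        CACHE.put("hdfs://cluster-b", new FakeFs());
        clearCacheSafely();
        return FakeFs.OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println("unclosed instances after clearing: " + demo());
    }
}
```

Note that closing cached instances is only safe if no caller still holds and 
uses one of them, which is an argument for the first option (not clearing the 
CACHE on a cache miss at all).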

 

  was:
There is a potential resource leak in 
org.apache.flink.core.fs.FileSystem#getUnguardedFileSystem.

Inside it there is a code:

 
{code:java}
// this "default" initialization makes sure that the FileSystem class works
// even when not configured with an explicit Flink configuration, like on
// JobManager or TaskManager setup
if (FS_FACTORIES.isEmpty()) {
   initialize(new Configuration());
}

{code}
which is executed on each cache miss. However this initialize method is also 
doing

 

 
{code:java}
CACHE.clear();
{code}
without closing file systems in CACHE (this could be problematic for 
HadoopFileSystem which is a wrapper around org.apache.hadoop.fs.FileSystem 
which is closable).

Now if for example we are constantly accessing two file systems initialized 
from FALLBACK_FACTORY, each time we call getUnguardedFileSystem for one of 
them, that call will clear from CACHE entry for the other one. Thus we will 
constantly be creating new FileSystems without closing them.

Solution could be to either not clear the CACHE or make sure that FileSystems 
are properly closed.

 


> Possible resource leak in FileSystem
> ------------------------------------
>
>                 Key: FLINK-9626
>                 URL: https://issues.apache.org/jira/browse/FLINK-9626
>             Project: Flink
>          Issue Type: Bug
>          Components: FileSystem
>    Affects Versions: 1.5.0
>            Reporter: Piotr Nowojski
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
