[jira] [Commented] (FLINK-9626) Possible resource leak in FileSystem

2021-04-22 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17328619#comment-17328619
 ] 

Flink Jira Bot commented on FLINK-9626:
---

This major issue is unassigned and itself and all of its Sub-Tasks have not 
been updated for 30 days. So, it has been labeled "stale-major". If this ticket 
is indeed "major", please either assign yourself or give an update. Afterwards, 
please remove the label. In 7 days the issue will be deprioritized.

> Possible resource leak in FileSystem
> 
>
> Key: FLINK-9626
> URL: https://issues.apache.org/jira/browse/FLINK-9626
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems
>Affects Versions: 1.5.0
>Reporter: Piotr Nowojski
>Priority: Major
>  Labels: stale-major
>
> There is a potential resource leak in 
> org.apache.flink.core.fs.FileSystem#getUnguardedFileSystem.
> Inside it there is a code:
>  
> {code:java}
> // this "default" initialization makes sure that the FileSystem class works
> // even when not configured with an explicit Flink configuration, like on
> // JobManager or TaskManager setup
> if (FS_FACTORIES.isEmpty()) {
>initialize(new Configuration());
> }
> {code}
> which is executed on each cache miss. However this initialize method is also 
> doing
>  
>  
> {code:java}
> CACHE.clear();
> {code}
> without closing file systems in CACHE (this could be problematic for 
> HadoopFileSystem which is a wrapper around org.apache.hadoop.fs.FileSystem 
> which is closable).
> Now if for example we are constantly accessing two different file systems 
> (file systems are differentiated by combination of [schema and 
> authority|https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax]
>  part from the file system's URI) initialized from FALLBACK_FACTORY, each 
> time we call getUnguardedFileSystem for one of them, that call will clear 
> from CACHE entry for the other one. Thus we will constantly be creating new 
> FileSystems without closing them.
> Solution could be to either not clear the CACHE or make sure that FileSystems 
> are properly closed.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-9626) Possible resource leak in FileSystem

2018-08-24 Thread Piotr Nowojski (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591323#comment-16591323
 ] 

Piotr Nowojski commented on FLINK-9626:
---

I have reported this as a candidate to explain resource leak reported by a 
user, however in the meantime we have found much more likely cause of it: 
https://issues.apache.org/jira/browse/HADOOP-15658 so I'm decreasing priority 
of this issue.

> Possible resource leak in FileSystem
> 
>
> Key: FLINK-9626
> URL: https://issues.apache.org/jira/browse/FLINK-9626
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem
>Affects Versions: 1.5.0
>Reporter: Piotr Nowojski
>Priority: Major
>
> There is a potential resource leak in 
> org.apache.flink.core.fs.FileSystem#getUnguardedFileSystem.
> Inside it there is a code:
>  
> {code:java}
> // this "default" initialization makes sure that the FileSystem class works
> // even when not configured with an explicit Flink configuration, like on
> // JobManager or TaskManager setup
> if (FS_FACTORIES.isEmpty()) {
>initialize(new Configuration());
> }
> {code}
> which is executed on each cache miss. However this initialize method is also 
> doing
>  
>  
> {code:java}
> CACHE.clear();
> {code}
> without closing file systems in CACHE (this could be problematic for 
> HadoopFileSystem which is a wrapper around org.apache.hadoop.fs.FileSystem 
> which is closable).
> Now if for example we are constantly accessing two different file systems 
> (file systems are differentiated by combination of [schema and 
> authority|https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax]
>  part from the file system's URI) initialized from FALLBACK_FACTORY, each 
> time we call getUnguardedFileSystem for one of them, that call will clear 
> from CACHE entry for the other one. Thus we will constantly be creating new 
> FileSystems without closing them.
> Solution could be to either not clear the CACHE or make sure that FileSystems 
> are properly closed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9626) Possible resource leak in FileSystem

2018-08-23 Thread Stephan Ewen (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16590612#comment-16590612
 ] 

Stephan Ewen commented on FLINK-9626:
-

Note, that this happens only if there is no other FS factory in the classpath, 
which is not commonly the case. So in practice, this should not really happen 
outside test setups.

> Possible resource leak in FileSystem
> 
>
> Key: FLINK-9626
> URL: https://issues.apache.org/jira/browse/FLINK-9626
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem
>Affects Versions: 1.5.0
>Reporter: Piotr Nowojski
>Priority: Critical
>
> There is a potential resource leak in 
> org.apache.flink.core.fs.FileSystem#getUnguardedFileSystem.
> Inside it there is a code:
>  
> {code:java}
> // this "default" initialization makes sure that the FileSystem class works
> // even when not configured with an explicit Flink configuration, like on
> // JobManager or TaskManager setup
> if (FS_FACTORIES.isEmpty()) {
>initialize(new Configuration());
> }
> {code}
> which is executed on each cache miss. However this initialize method is also 
> doing
>  
>  
> {code:java}
> CACHE.clear();
> {code}
> without closing file systems in CACHE (this could be problematic for 
> HadoopFileSystem which is a wrapper around org.apache.hadoop.fs.FileSystem 
> which is closable).
> Now if for example we are constantly accessing two different file systems 
> (file systems are differentiated by combination of [schema and 
> authority|https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax]
>  part from the file system's URI) initialized from FALLBACK_FACTORY, each 
> time we call getUnguardedFileSystem for one of them, that call will clear 
> from CACHE entry for the other one. Thus we will constantly be creating new 
> FileSystems without closing them.
> Solution could be to either not clear the CACHE or make sure that FileSystems 
> are properly closed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9626) Possible resource leak in FileSystem

2018-07-06 Thread Stephan Ewen (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534559#comment-16534559
 ] 

Stephan Ewen commented on FLINK-9626:
-

Good point. We should have a flag {{initialized}} and not rely on the 
collection being empty.

Re-initialization could close all file systems - it should never happen and is 
only possible for better testability.

> Possible resource leak in FileSystem
> 
>
> Key: FLINK-9626
> URL: https://issues.apache.org/jira/browse/FLINK-9626
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem
>Affects Versions: 1.5.0
>Reporter: Piotr Nowojski
>Priority: Critical
>
> There is a potential resource leak in 
> org.apache.flink.core.fs.FileSystem#getUnguardedFileSystem.
> Inside it there is a code:
>  
> {code:java}
> // this "default" initialization makes sure that the FileSystem class works
> // even when not configured with an explicit Flink configuration, like on
> // JobManager or TaskManager setup
> if (FS_FACTORIES.isEmpty()) {
>initialize(new Configuration());
> }
> {code}
> which is executed on each cache miss. However this initialize method is also 
> doing
>  
>  
> {code:java}
> CACHE.clear();
> {code}
> without closing file systems in CACHE (this could be problematic for 
> HadoopFileSystem which is a wrapper around org.apache.hadoop.fs.FileSystem 
> which is closable).
> Now if for example we are constantly accessing two different file systems 
> (file systems are differentiated by combination of [schema and 
> authority|https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax]
>  part from the file system's URI) initialized from FALLBACK_FACTORY, each 
> time we call getUnguardedFileSystem for one of them, that call will clear 
> from CACHE entry for the other one. Thus we will constantly be creating new 
> FileSystems without closing them.
> Solution could be to either not clear the CACHE or make sure that FileSystems 
> are properly closed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9626) Possible resource leak in FileSystem

2018-06-20 Thread Stefan Richter (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518096#comment-16518096
 ] 

Stefan Richter commented on FLINK-9626:
---

I think in general it can be dangerous to call a {{close}} on filesystems if 
the cache is cleared because some client might still reference and use the 
instance of the filesystem. Options could be that clients also need to 
{{close}} filesystems and we do a reference counting that has to hit zero 
before the instance is actually closed internally. Or we could use a safety net 
based on {{PhantomReference}}, similar to what we do with guarding input/output 
streams. Or a combination of both. Might also be good to hear [~StephanEwen] 
about this.

> Possible resource leak in FileSystem
> 
>
> Key: FLINK-9626
> URL: https://issues.apache.org/jira/browse/FLINK-9626
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem
>Affects Versions: 1.5.0
>Reporter: Piotr Nowojski
>Priority: Critical
>
> There is a potential resource leak in 
> org.apache.flink.core.fs.FileSystem#getUnguardedFileSystem.
> Inside it there is a code:
>  
> {code:java}
> // this "default" initialization makes sure that the FileSystem class works
> // even when not configured with an explicit Flink configuration, like on
> // JobManager or TaskManager setup
> if (FS_FACTORIES.isEmpty()) {
>initialize(new Configuration());
> }
> {code}
> which is executed on each cache miss. However this initialize method is also 
> doing
>  
>  
> {code:java}
> CACHE.clear();
> {code}
> without closing file systems in CACHE (this could be problematic for 
> HadoopFileSystem which is a wrapper around org.apache.hadoop.fs.FileSystem 
> which is closable).
> Now if for example we are constantly accessing two different file systems 
> (file systems are differentiated by combination of [schema and 
> authority|https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax]
>  part from the file system's URI) initialized from FALLBACK_FACTORY, each 
> time we call getUnguardedFileSystem for one of them, that call will clear 
> from CACHE entry for the other one. Thus we will constantly be creating new 
> FileSystems without closing them.
> Solution could be to either not clear the CACHE or make sure that FileSystems 
> are properly closed.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)