[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-11-12 Thread Melloware (JIRA)


[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16684043#comment-16684043
 ] 

Melloware commented on COMPRESS-446:


In case you are interested or want to review it, attached is what I did. I 
start a timer when the backing store is created and cancel it if the delete 
succeeds. If the delete does not succeed, or if the delete was never called, 
the delete is attempted again when the timer expires. 
[^FailSafeScatterGatherBackingStore.java] 
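
The attachment itself is not reproduced in this message. The idea described above can be sketched with JDK classes only; this is a rough illustration, not the attached class. The names {{FailSafeTempFile}}, {{path()}} and the delay parameters are hypothetical, and the sketch makes a single delayed retry rather than repeated ones.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: owns a temp file and arms a delayed fallback delete. */
final class FailSafeTempFile implements AutoCloseable {
    // One daemon thread shared by all instances, so the timer never keeps the JVM alive.
    private static final ScheduledExecutorService TIMER =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "failsafe-delete");
                t.setDaemon(true);
                return t;
            });

    private final Path file;
    private final ScheduledFuture<?> fallback;

    FailSafeTempFile(Path file, long delay, TimeUnit unit) {
        this.file = file;
        // Armed at creation time: it fires even if close() is never called.
        this.fallback = TIMER.schedule(() -> { tryDelete(); }, delay, unit);
    }

    Path path() {
        return file;
    }

    /** Returns true if the file is gone afterwards, false if the delete failed. */
    private boolean tryDelete() {
        try {
            Files.deleteIfExists(file);
            return true;
        } catch (IOException e) {
            return false; // for example the file is still locked; leave it to the timer
        }
    }

    @Override
    public void close() {
        // Cancel the fallback only when the immediate delete worked.
        if (tryDelete()) {
            fallback.cancel(false);
        }
    }
}
```

Used in a try-with-resources block, the normal path deletes the file immediately and cancels the timer; the timer only matters when the delete fails or {{close}} is skipped.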

> Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)
> --------------------------------------------------------------------------
>
> Key: COMPRESS-446
> URL: https://issues.apache.org/jira/browse/COMPRESS-446
> Project: Commons Compress
> Issue Type: Bug
> Components: Archivers
> Affects Versions: 1.16.1
> Environment: The application was running inside a Docker container, 
> the JVM had about 1.7 GByte heap space.
> Reporter: Christoph Ludwig
> Priority: Major
> Labels: zip
> Fix For: 1.17
>
> Attachments: FailSafeScatterGatherBackingStore.java
>
>
> Before it does anything else, 
> {{ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)}} loops over all 
> futures returned by the creator's executor service and calls 
> {{Future#get()}} on each. Each call blocks until the respective future's 
> computation has completed, i.e., until all entries have been written to the 
> thread-local scatter streams.
> However, if the computation of a future fails, then {{Future#get()}} can also 
> throw an exception. This exception escapes 
> {{ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)}} before the 
> executor service is shut down. As a result, the thread-local 
> variables in the executor service's threads and all objects referenced by 
> them continue to exist and cannot be reclaimed by the GC.
> I encountered this situation when the JVM threw an {{OutOfMemoryError}} 
> while processing an archive with 130,000 documents. The application was not 
> able to recover from this OOM error because most of the heap was occupied by 
> objects reachable from the executor service's threads.
> Of course, the OOM is mostly the fault of my own code; I will be able to work 
> around the "leaked" executor service because I supply it in the first place 
> and can therefore shut it down if I detect an error situation.
> The effect would be the same, though, if, say, {{Future#get()}} throws an 
> {{InterruptedException}}. Therefore, 
> {{ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)}} should either 
> shut down and release all resources if it cannot complete its task due to an 
> exception thrown by a future, or offer a reasonable recovery strategy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-11-10 Thread Melloware (JIRA)


[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682374#comment-16682374
 ] 

Melloware commented on COMPRESS-446:


OK, I am using Commons Compress directly, so we should be able to attempt this.



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-11-09 Thread Stefan Bodewig (JIRA)


[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682209#comment-16682209
 ] 

Stefan Bodewig commented on COMPRESS-446:
-

The backing store you are currently using implicitly is 
https://github.com/apache/commons-compress/blob/master/src/main/java/org/apache/commons/compress/parallel/FileBasedScatterGatherBackingStore.java

By using the two-arg constructor of {{ParallelScatterZipCreator}} you can plug 
in your own {{ScatterGatherBackingStoreSupplier}} which has a single method 
that is responsible for creating a new {{ScatterGatherBackingStore}}. This is 
assuming you are creating {{ParallelScatterZipCreator}} yourself and not using 
a third-party library that abstracts that away from you.
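
A rough sketch of what such a supplier could look like follows. To keep it compilable without Commons Compress on the classpath, it declares stand-in interfaces ({{BackingStore}}, {{BackingStoreSupplier}}) that only approximate {{ScatterGatherBackingStore}} and {{ScatterGatherBackingStoreSupplier}}; their method names are assumptions to be checked against the real interfaces. {{TempFileStore}}, {{TrackingSupplier}} and {{sweep()}} are hypothetical names as well. With the real library, an object implementing {{ScatterGatherBackingStoreSupplier}} would be passed as the second constructor argument of {{ParallelScatterZipCreator}}.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Stand-ins so the sketch compiles on its own. They approximate the
// Commons Compress interfaces; verify the real signatures before porting.
interface BackingStore extends Closeable {
    void writeOut(byte[] data, int offset, int length) throws IOException;
    void closeForWriting() throws IOException;
    InputStream getInputStream() throws IOException;
}

interface BackingStoreSupplier {
    BackingStore get() throws IOException; // the single factory method
}

/** File-backed store that deletes its temp file when closed. */
final class TempFileStore implements BackingStore {
    private final Path file;
    private final OutputStream out;
    private boolean writable = true;

    TempFileStore(Path file) throws IOException {
        this.file = file;
        this.out = Files.newOutputStream(file);
    }

    Path file() {
        return file;
    }

    public void writeOut(byte[] data, int offset, int length) throws IOException {
        out.write(data, offset, length);
    }

    public void closeForWriting() throws IOException {
        if (writable) {
            out.close();
            writable = false;
        }
    }

    public InputStream getInputStream() throws IOException {
        return Files.newInputStream(file);
    }

    public void close() throws IOException {
        try {
            closeForWriting();
        } finally {
            Files.deleteIfExists(file); // throws if the delete fails
        }
    }
}

/** Supplier that remembers every store it created so leftovers can be swept. */
final class TrackingSupplier implements BackingStoreSupplier {
    private final List<TempFileStore> created = new ArrayList<>();

    public synchronized BackingStore get() throws IOException {
        TempFileStore store = new TempFileStore(Files.createTempFile("parallelscatter", ".tmp"));
        created.add(store);
        return store;
    }

    /** Closes every tracked store; stores whose delete failed stay tracked for a later sweep. */
    synchronized void sweep() {
        created.removeIf(store -> {
            try {
                store.close();
                return true;
            } catch (IOException e) {
                return false;
            }
        });
    }

    synchronized List<TempFileStore> created() {
        return new ArrayList<>(created);
    }
}
```

The caller would invoke {{sweep()}} only once the ZIP has been written or the attempt has failed, since it closes stores that may otherwise still be in use.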



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-11-09 Thread Melloware (JIRA)


[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681942#comment-16681942
 ] 

Melloware commented on COMPRESS-446:


I might be able to do that. Can you provide an example of 
ScatterGatherBackingStoreSupplier and how I would inject it into the process to 
clean up these temp files?   



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-11-09 Thread Stefan Bodewig (JIRA)


[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681764#comment-16681764
 ] 

Stefan Bodewig commented on COMPRESS-446:
-

You are correct, I've opened COMPRESS-470 to track this other leak.

I am not sure whether this is an option for you, but you could work around 
this by providing a {{ScatterGatherBackingStoreSupplier}} of your own that 
cleans up its resources itself after a certain amount of time. Given that you 
are running a process that never stops, I'm not even sure it would be enough 
to call {{close}} on {{FileBasedScatterGatherBackingStore}}, as the call to 
{{delete}} may fail as well, and using {{deleteOnExit}} as a hypothetical 
fallback won't help you.



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-11-08 Thread Melloware (JIRA)


[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679878#comment-16679878
 ] 

Melloware commented on COMPRESS-446:


We are still seeing this error on Commons Compress 1.18: it intermittently 
leaves ParallelScatter TMP files behind. Our system runs 24/7 and zips files 
hourly, and over a one-month period we have seen it leave roughly 4-6 of these 
files around. The problem is that they never get cleaned up and eventually 
fill our /TMP to 100%.

Any thoughts?



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-03-31 Thread Gary Gregory (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421319#comment-16421319
 ] 

Gary Gregory commented on COMPRESS-446:
---

I suppose you are correct; it is not as bad as I initially thought.



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-03-31 Thread Stefan Bodewig (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421249#comment-16421249
 ] 

Stefan Bodewig commented on COMPRESS-446:
-

Hmm, I don't see this as being as serious as you seem to.

The resource leak happens if one of the parallel threads throws an exception. 
That exception will likely propagate to the caller and, in code that hasn't 
been crafted as carefully as Christoph's, will kill the whole process, 
rendering the resource leak moot. This is an edge case in a class that isn't 
likely to be used by many people at all.

Do you consider this more serious than the "usual" bugs?



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-03-29 Thread Gary Gregory (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419102#comment-16419102
 ] 

Gary Gregory commented on COMPRESS-446:
---

Hi [~bodewig]: This looks important. Are you planning on cutting a release soon?



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-03-20 Thread Christoph Ludwig (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406315#comment-16406315
 ] 

Christoph Ludwig commented on COMPRESS-446:
---

I think shutting down the executor service in a finally block is fine - this 
way all threads are terminated and the objects reachable from them get a 
chance to be reclaimed.

In addition, I think it would be worthwhile to document explicitly that the ZIP 
creation fails irrecoverably if any of the futures throws an exception. It 
does not come as a surprise, of course, but it helps if the documentation is 
clear on this.
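
The finally-block approach can be sketched with JDK classes only. This is an illustration of the pattern, not the Commons Compress code; {{DrainFutures}} and {{awaitAll}} are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class DrainFutures {
    private DrainFutures() {
    }

    /**
     * Waits for every future. The executor is shut down whether or not a
     * task failed, so its worker threads (and whatever their thread-locals
     * reference) can be released.
     */
    static void awaitAll(ExecutorService executor, List<Future<?>> futures)
            throws InterruptedException, ExecutionException {
        try {
            for (Future<?> future : futures) {
                // Rethrows a task failure wrapped in ExecutionException.
                future.get();
            }
        } finally {
            // Runs on both the success path and the exception path.
            executor.shutdown();
        }
    }
}
```

A caller still sees the {{ExecutionException}} (or {{InterruptedException}}) and has to treat the archive as failed; the difference is only that {{ExecutorService#isShutdown()}} returns true afterwards instead of the executor lingering.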
 



[jira] [Commented] (COMPRESS-446) Resource Leak in ParallelScatterZipCreator#writeTo(ZipArchiveOutputStream)

2018-03-20 Thread Stefan Bodewig (JIRA)

[ 
https://issues.apache.org/jira/browse/COMPRESS-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405985#comment-16405985
 ] 

Stefan Bodewig commented on COMPRESS-446:
-

This is ugly. An alternative would be to shut down the executor in a finally 
block. Would that be sufficient?

[~krosenvold] any ideas?
