[ 
https://issues.apache.org/jira/browse/COMPRESS-382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969413#comment-15969413
 ] 

Tim Allison commented on COMPRESS-382:
--------------------------------------

This works, but I don't like breaking the simplicity of 
CompressorStreamFactory.  So, now we're going to have a bunch of different 
settable parameters that require knowledge of the individual child streams?  I 
don't like it.

bq.  The current thinking behind this is that if you know you want certain 
parameters, then you know which format you want and don't need to go through 
the factory anyway.

In Tika land, this isn't quite right, we know we want to set limits, and we're 
willing to figure out that lzma is in kb and Z is in bytes for the data table, 
etc., but we don't (currently) know the stream type.  We could (soon) detect() 
and then call the relevant Compressor, but it would be easier to tell the 
factory what we want for each file type up front.

As for service loading...y, that makes parameterization challenging.  

Recommendations?

> OutOfMemoryError from CompressorStreamFactory
> ---------------------------------------------
>
>                 Key: COMPRESS-382
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-382
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.10, 1.11, 1.12
>         Environment: Windows7, jre1.8.0_101 x64
>            Reporter: Luis Filipe Nassif
>         Attachments: data.mui
>
>
> While using Tika-1.14 to detect file types, the attached 1KB file triggered 
> an OOME with 1GB heap. Tika calls 
> CompressorStreamFactory.createCompressorInputStream(in) to detect if the file 
> is a compressor stream, but CompressorStreamFactory erroneously detects it as 
> a LZMACompressorInputStream and when the LZMACompressorInputStream is 
> instanciated the OOME is thrown. This error does not happen with 
> commons-compress versions prior to 1.10, when auto detecting LZMA streams was 
> added. OOME stacktrace below:
> {code}
> Caused by: java.lang.OutOfMemoryError: Java heap space
>       at org.tukaani.xz.lz.LZDecoder.<init>(Unknown Source) ~[xz-1.5.jar:1.5]
>       at org.tukaani.xz.LZMAInputStream.initialize(Unknown Source) 
> ~[xz-1.5.jar:1.5]
>       at org.tukaani.xz.LZMAInputStream.initialize(Unknown Source) 
> ~[xz-1.5.jar:1.5]
>       at org.tukaani.xz.LZMAInputStream.<init>(Unknown Source) 
> ~[xz-1.5.jar:1.5]
>       at org.tukaani.xz.LZMAInputStream.<init>(Unknown Source) 
> ~[xz-1.5.jar:1.5]
>       at 
> org.apache.commons.compress.compressors.lzma.LZMACompressorInputStream.<init>(LZMACompressorInputStream.java:48)
>  ~[commons-compress-1.10.jar:1.10]
>       at 
> org.apache.commons.compress.compressors.CompressorStreamFactory.createCompressorInputStream(CompressorStreamFactory.java:251)
>  ~[commons-compress-1.10.jar:1.10]
>       at 
> org.apache.tika.parser.pkg.ZipContainerDetector.detectCompressorFormat(ZipContainerDetector.java:109)
>  ~[tika-parsers-1.14.jar:1.14]
>       at 
> org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:95)
>  ~[tika-parsers-1.14.jar:1.14]
>       at 
> org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:77) 
> ~[tika-core-1.14.jar:1.14]
>       at 
> dpf.sp.gpinf.indexer.process.task.SignatureTask.process(SignatureTask.java:50)
>  ~[iped.jar:?]
>       at 
> dpf.sp.gpinf.indexer.process.task.AbstractTask.processMonitorTimeout(AbstractTask.java:203)
>  ~[iped.jar:?]
>       at 
> dpf.sp.gpinf.indexer.process.task.AbstractTask.processAndSendToNextTask(AbstractTask.java:152)
>  ~[iped.jar:?]
>       at 
> dpf.sp.gpinf.indexer.process.task.AbstractTask.sendToNextTask(AbstractTask.java:190)
>  ~[iped.jar:?]
>       at 
> dpf.sp.gpinf.indexer.process.task.AbstractTask.processAndSendToNextTask(AbstractTask.java:160)
>  ~[iped.jar:?]
>       at 
> dpf.sp.gpinf.indexer.process.task.AbstractTask.sendToNextTask(AbstractTask.java:190)
>  ~[iped.jar:?]
>       at 
> dpf.sp.gpinf.indexer.process.task.AbstractTask.processAndSendToNextTask(AbstractTask.java:160)
>  ~[iped.jar:?]
>       at 
> dpf.sp.gpinf.indexer.process.task.AbstractTask.sendToNextTask(AbstractTask.java:190)
>  ~[iped.jar:?]
>       at 
> dpf.sp.gpinf.indexer.process.task.AbstractTask.processAndSendToNextTask(AbstractTask.java:160)
>  ~[iped.jar:?]
>       at dpf.sp.gpinf.indexer.process.Worker.process(Worker.java:174) 
> ~[iped.jar:?]
>       ... 1 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to