[ https://issues.apache.org/jira/browse/TIKA-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916534#action_12916534 ]

Stephen Duncan Jr commented on TIKA-521:
----------------------------------------

I have 7MB files that can't be handled with a 2GB heap; they need 3GB to process.
I'm likely going to need to run on 32-bit Java, so raising the heap that high is
not really an option.  Besides, at the growth rate I'm seeing, a 20MB file might
require 10GB of heap.  That simply doesn't scale for reasonable file sizes.
Meanwhile, the same 7MB file can be parsed via the alternate API with a 128MB
heap, which should allow any reasonable file to be processed within a reasonable
1GB heap.  A sketch of that streaming approach is below.
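For reference, here is a minimal sketch of the kind of streaming parse I assume the "alternate API" refers to: POI's event-based XSSFReader, which feeds each sheet's raw XML through SAX instead of building the whole XSSFWorkbook DOM in memory.  The class name and handler body are illustrative only; a real extractor would also resolve shared-string references, but the point is that nothing is materialised per row.

{code:java}
import java.io.File;
import java.io.InputStream;
import java.util.Iterator;

import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

public class StreamingXlsxText {

    public static void main(String[] args) throws Exception {
        // Open the package read-only; no XSSFWorkbook (DOM) is ever built.
        OPCPackage pkg = OPCPackage.open(new File(args[0]), PackageAccess.READ);
        try {
            XSSFReader reader = new XSSFReader(pkg);

            // Plain SAX handler: text content of the raw sheet XML (cell
            // values or shared-string indexes) arrives via characters().
            XMLReader sax = XMLReaderFactory.createXMLReader();
            sax.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    System.out.print(new String(ch, start, length));
                }
            });

            // Each sheet is exposed as a raw XML stream, so memory use
            // stays roughly flat regardless of sheet size.
            Iterator<InputStream> sheets = reader.getSheetsData();
            while (sheets.hasNext()) {
                InputStream sheet = sheets.next();
                try {
                    sax.parse(new InputSource(sheet));
                } finally {
                    sheet.close();
                }
            }
        } finally {
            pkg.revert();  // close without saving
        }
    }
}
{code}

Running this against the attached file with something like "java -Xmx128m StreamingXlsxText memory-test.xlsx" is how I'd expect the low-memory behaviour to be reproduced.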

> OutOfMemoryError Parsing XLSX File
> ----------------------------------
>
>                 Key: TIKA-521
>                 URL: https://issues.apache.org/jira/browse/TIKA-521
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 0.7, 0.8
>            Reporter: Stephen Duncan Jr
>         Attachments: memory-test.xlsx
>
>
> I have several XLSX files I'm trying to parse with Tika that are failing with 
> an OutOfMemoryError even when using a large heap size.  For instance, the 
> attached 1.26MB Excel file fails using a 512MB heap.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
