[ https://issues.apache.org/jira/browse/TIKA-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337885#comment-14337885 ]
Nick Burch commented on TIKA-1560:
----------------------------------

Looks like we might need to move the "is this a valid type" check to before the length fetch / data fetch, rather than after.

Best bet would be to, as Tim suggests, raise this in the POI bug tracker, attach the sample file there, and ideally work up a patch to shuffle that valid check earlier!

> OutOfMemoryError analyzinig specific file
> -----------------------------------------
>
>                 Key: TIKA-1560
>                 URL: https://issues.apache.org/jira/browse/TIKA-1560
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.7
>         Environment: OS: Ubuntu Linux 12.04 and 14.04
>                      JVM: OpenJDK 1.7, Oracle JDK 1.7, Oracle JDK 1.8
>            Reporter: Rubén Bressler
>         Attachments: e3284d17-c814-46c1-b33e-ee774305d987.dat
>
>
> I have a specific file when applying tika-app.jar it tries to process ends
> with an OOM error. The output is the same no matter what virtual machine or
> Tika version is used.
> {code}
> $ java -jar tika-app-1.7.jar e3284d17-c814-46c1-b33e-ee774305d987.dat
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>       at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:149)
>       at java.lang.StringCoding.decode(StringCoding.java:193)
>       at java.lang.String.<init>(String.java:414)
>       at org.apache.poi.util.StringUtil.getFromUnicodeLE(StringUtil.java:77)
>       at org.apache.poi.hmef.attribute.MAPIAttribute.create(MAPIAttribute.java:149)
>       at org.apache.poi.hmef.attribute.TNEFMAPIAttribute.<init>(TNEFMAPIAttribute.java:41)
>       at org.apache.poi.hmef.attribute.TNEFAttribute.create(TNEFAttribute.java:67)
>       at org.apache.poi.hmef.HMEFMessage.processMessage(HMEFMessage.java:97)
>       at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:79)
>       at org.apache.poi.hmef.HMEFMessage.<init>(HMEFMessage.java:64)
>       at org.apache.tika.parser.microsoft.TNEFParser.parse(TNEFParser.java:80)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>       at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:146)
>       at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:440)
>       at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:116)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
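
For anyone picking up the patch, here is a rough sketch of the reordering suggested in the comment: check the type before trusting the length field and allocating. This is not the POI code. The class name `CheckedAttributeReader`, the simplified record layout (2-byte type, 4-byte length, payload, all little-endian), the set of accepted type codes and the `MAX_LENGTH` cap are all assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Set;

// Hypothetical reader, not part of POI: shows "validate type first, then read length/data".
class CheckedAttributeReader {
    // Illustrative subset of type codes this sketch accepts (assumption).
    private static final Set<Integer> KNOWN_TYPES = Set.of(0x001E, 0x001F, 0x0102);
    // Arbitrary sanity cap for the sketch, not a value taken from POI.
    private static final int MAX_LENGTH = 16 * 1024 * 1024;

    static byte[] readAttribute(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);

        // 1. Read the little-endian type and reject it before touching the length.
        int type = Short.reverseBytes(data.readShort()) & 0xFFFF;
        if (!KNOWN_TYPES.contains(type)) {
            throw new IOException("Unknown attribute type 0x" + Integer.toHexString(type));
        }

        // 2. Only now read the length, and refuse implausible values before allocating.
        int length = Integer.reverseBytes(data.readInt());
        if (length < 0 || length > MAX_LENGTH) {
            throw new IOException("Implausible attribute length " + length);
        }

        // 3. Allocate and read the payload; readFully throws EOFException on short input.
        byte[] payload = new byte[length];
        data.readFully(payload);
        return payload;
    }
}
```

With this ordering, a record that carries a garbage type and a huge length field fails with an IOException at step 1 instead of attempting a multi-gigabyte allocation. The length cap in step 2 is an extra guard that the comment does not ask for.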