[ https://issues.apache.org/jira/browse/TIKA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839757#comment-15839757 ]
Tim Allison commented on TIKA-2252:
-----------------------------------

Thank you. Do you have any idea what type of files those are? Tika detects 49012.dat as an mpeg and correctly parses it. STG32169.dat is identified as some form of a POIFS system, and I get the same exception you do...but I can't open it in any application.

> could not parse document
> ------------------------
>
>                 Key: TIKA-2252
>                 URL: https://issues.apache.org/jira/browse/TIKA-2252
>             Project: Tika
>          Issue Type: Bug
>            Reporter: yousef abu elbeh
>         Attachments: STG32169.dat, STG49012.dat
>
>
> Hi
> i am using Tika to parse a document but each time i saw this error:
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@124d02b2
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>     at org.apache.tika.Tika.parseToString(Tika.java:527)
>     at org.apache.tika.Tika.parseToString(Tika.java:602)
>     at com.ligadata.datapreprocessing.fileutility.PSTReader.writePSTFile(PSTReader.java:79)
>     at com.ligadata.datapreprocessing.fileutility.PSTReader.processFolder(PSTReader.java:55)
>     at com.ligadata.datapreprocessing.fileutility.PSTReader.processFolder(PSTReader.java:45)
>     at com.ligadata.datapreprocessing.fileutility.PSTReader.processFolder(PSTReader.java:45)
>     at com.ligadata.datapreprocessing.fileutility.PSTReader.processFolder(PSTReader.java:45)
>     at com.ligadata.datapreprocessing.fileutility.PSTReader.readPSTFile(PSTReader.java:27)
>     at com.ligadata.datapreprocessing.emailextracter.MainClass.main(MainClass.java:61)
> Caused by: java.lang.IllegalArgumentException: Position 313856 past the end of the file
>     at org.apache.poi.poifs.nio.FileBackedDataSource.read(FileBackedDataSource.java:88)
>     at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getBlockAt(NPOIFSFileSystem.java:484)
>     at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:169)
>     at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:142)
>     at org.apache.poi.poifs.property.NPropertyTable.buildProperties(NPropertyTable.java:87)
>     at org.apache.poi.poifs.property.NPropertyTable.<init>(NPropertyTable.java:66)
>     at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readCoreContents(NPOIFSFileSystem.java:440)
>     at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:235)
>     at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:168)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:109)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     ... 11 more

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)