[ https://issues.apache.org/jira/browse/TIKA-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tyler Palsulich closed TIKA-1460.
---------------------------------
    Resolution: Cannot Reproduce

Closing as Cannot Reproduce, since it's been a month since my last comment and we don't have the file which reproduces the issue. Please reopen if you're still running into this!

> Could not parse predefined CMAP file for 'Adobe-GBK1-UCS2'
> ----------------------------------------------------------
>
>                 Key: TIKA-1460
>                 URL: https://issues.apache.org/jira/browse/TIKA-1460
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.3
>         Environment: win7,myeclipse8.5
>            Reporter: onyas
>            Priority: Critical
>
> For some reason, I could not upload the file. Here is the info.
> I checked all the versions in the directory
> \org\apache\pdfbox\resources\cmap, and I have not found the 'Adobe-GBK1-UCS2' file.
>
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@d640af
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> Caused by: java.lang.IllegalArgumentException: Position 66048 past the end of the file
>       at org.apache.poi.poifs.nio.FileBackedDataSource.read(FileBackedDataSource.java:50)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getBlockAt(NPOIFSFileSystem.java:420)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readBAT(NPOIFSFileSystem.java:397)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readCoreContents(NPOIFSFileSystem.java:356)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:202)
>       at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:184)
>       at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:156)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>       ... 21 more
>
> The main code is:
>
>     Parser parser = new AutoDetectParser();
>     ContentHandler handler = new BodyContentHandler(getNum());
>     Metadata metadata = new Metadata();
>     ParseContext context = new ParseContext();
>     InputStream stream = null;
>     StringBuffer content = new StringBuffer();
>     try {
>         stream = new FileInputStream(file);
>         if (stream != null) {
>             parser.parse(stream, handler, metadata, context);
>             content = content.append(handler);
>
>             if (StringUtils.isNotBlank(content.toString())) {
>                 hasContent = true;
>                 handler = null;
>                 metadata = null;
>                 context = null;
>             }
>         }
>
> The exception is thrown at this line:
>     parser.parse(stream, handler, metadata, context);

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
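
For anyone who hits the same "Position 66048 past the end of the file" error: the reproducing file was never attached, so the cause is unconfirmed, but one plausible reading (an assumption, not a diagnosis) is a truncated OLE2 file. In the OLE2 container format, sector n starts at byte (n + 1) * sectorSize, and 66048 = 129 * 512, which is where sector 128 would start in a file with 512-byte sectors. The sketch below uses only the JDK and checks whether any FAT sector listed in the 512-byte header lies past the end of the file. The class and method names are hypothetical and are not part of Tika or POI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Ole2TruncationCheck {

    // First 8 bytes of an OLE2 file (D0 CF 11 E0 A1 B1 1A E1), read as a little-endian long.
    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;
    static final int HEADER_SIZE = 512;
    static final int SECTOR_SHIFT_OFFSET = 30; // uint16: log2 of the sector size
    static final int DIFAT_OFFSET = 76;        // start of the FAT sector numbers kept in the header
    static final int DIFAT_ENTRIES = 109;      // number of int32 entries in that table

    /** Byte offset at which a sector starts; sector 0 begins right after the header sector. */
    static long sectorOffset(int sector, int sectorSize) {
        return (sector + 1L) * sectorSize;
    }

    /**
     * Returns true if any FAT sector listed in the header would start or end
     * beyond fileLength. This is slightly stricter than a start-only check:
     * a partially present sector is also reported.
     */
    static boolean referencesSectorPastEnd(byte[] header, long fileLength) {
        if (header.length < HEADER_SIZE) {
            throw new IllegalArgumentException("need the first 512 bytes of the file");
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getLong(0) != OLE2_SIGNATURE) {
            throw new IllegalArgumentException("not an OLE2 file");
        }
        int sectorSize = 1 << buf.getShort(SECTOR_SHIFT_OFFSET);
        for (int i = 0; i < DIFAT_ENTRIES; i++) {
            int sector = buf.getInt(DIFAT_OFFSET + 4 * i);
            if (sector < 0) {
                continue; // negative values are markers such as "free" (0xFFFFFFFF)
            }
            if (sectorOffset(sector, sectorSize) + sectorSize > fileLength) {
                return true;
            }
        }
        return false;
    }
}
```

To use it, read the first 512 bytes of the document, pass them together with the file's length, and skip or flag the file before handing it to parser.parse(...) if the method returns true. It only inspects the FAT sectors referenced directly from the header, so a false result does not guarantee the file is intact.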