[ https://issues.apache.org/jira/browse/TIKA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17574730#comment-17574730 ]
Tim Allison commented on TIKA-3829:
-----------------------------------

It is difficult to tell without a triggering file. At least in the development version of 1.x, you get to this warning if the DirectoryNode has a key for summary information, but the value in the map for that key is null. In that case hasEntry() returns true, getEntry() then finds a null value, and, because the file also contains a "Workbook" entry, it throws this IllegalArgumentException.

Tika's code:
{noformat}
if (! root.hasEntry(entryName)) {
    return;
}
DocumentEntry entry = (DocumentEntry) root.getEntry(entryName);
{noformat}

POI's code:
{noformat}
public boolean hasEntry(String name) {
    return name != null && this._byname.containsKey(name);
}

public Entry getEntry(String name) throws FileNotFoundException {
    Entry rval = null;
    if (name != null) {
        rval = (Entry)this._byname.get(name);
    }
    if (rval == null) {
        if (this._byname.containsKey("Workbook")) {
            throw new IllegalArgumentException("The document is really a XLS file");
        ...
{noformat}

> java.lang.IllegalArgumentException: The document is really a XLS file
> exception while parsing doc file
> ------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-3829
>                 URL: https://issues.apache.org/jira/browse/TIKA-3829
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>   Affects Versions: 1.23
>            Reporter: John
>            Priority: Major
>
> Getting the following exception while parsing a doc file:
> WARN Ignoring unexpected exception while parsing summary entry DocumentSummaryInformation
> java.lang.IllegalArgumentException: The document is really a XLS file
>     at org.apache.poi.poifs.filesystem.DirectoryNode.getEntry(DirectoryNode.java:322)
>     at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:82)
>     at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:74)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:155)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>
> What is the meaning of this exception? When will it be thrown?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
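
For illustration only, the mechanism described in the comment can be reproduced without POI. The sketch below is not POI or Tika code: FakeDirectoryNode, NullEntryDemo and probe are hypothetical names, and the final FileNotFoundException fallback is an assumption, since the quoted getEntry is elided after the "Workbook" check. It shows that a map key with a null value passes a containsKey-based hasEntry check and then falls into the null branch of getEntry.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for POI's DirectoryNode; models only the two quoted methods.
class FakeDirectoryNode {
    private final Map<String, Object> byName = new HashMap<>();

    void put(String name, Object entry) {
        byName.put(name, entry); // HashMap permits null values
    }

    // Mirrors the quoted hasEntry: only checks that the key is present.
    boolean hasEntry(String name) {
        return name != null && byName.containsKey(name);
    }

    // Mirrors the quoted getEntry up to the "Workbook" check.
    Object getEntry(String name) throws FileNotFoundException {
        Object rval = null;
        if (name != null) {
            rval = byName.get(name);
        }
        if (rval == null) {
            if (byName.containsKey("Workbook")) {
                throw new IllegalArgumentException("The document is really a XLS file");
            }
            // Assumption: the quoted source is elided here; this fallback is illustrative.
            throw new FileNotFoundException("no such entry: \"" + name + "\"");
        }
        return rval;
    }
}

class NullEntryDemo {
    // Follows the shape of the quoted Tika snippet: check hasEntry, then call getEntry.
    static String probe(FakeDirectoryNode root, String entryName) {
        if (!root.hasEntry(entryName)) {
            return "skipped";
        }
        try {
            root.getEntry(entryName);
            return "found";
        } catch (IllegalArgumentException | FileNotFoundException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        FakeDirectoryNode root = new FakeDirectoryNode();
        root.put("Workbook", new Object());
        root.put("DocumentSummaryInformation", null); // key present, value null
        System.out.println(probe(root, "DocumentSummaryInformation"));
        System.out.println(probe(root, "NotThere"));
    }
}
```

Under these assumptions, the first probe reaches the IllegalArgumentException even though hasEntry returned true, while the second returns early because the key is absent.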