[ https://issues.apache.org/jira/browse/TIKA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17574659#comment-17574659 ]

Dhanabal commented on TIKA-3829:
--------------------------------

Currently, I don't have the file.

I am also getting the exception for DocumentSummaryInformation.

----------------------------------------------------------------

WARN Ignoring unexpected exception while parsing summary entry DocumentSummaryInformation
java.lang.IllegalArgumentException: The document is really a XLS file
    at org.apache.poi.poifs.filesystem.DirectoryNode.getEntry(DirectoryNode.java:322)
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:82)
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:74)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:155)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:104)
    at org.apache.tika.extractor.EmbeddedDocumentUtil.parseEmbedded(EmbeddedDocumentUtil.java:220)
    at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedOfficeDoc(AbstractPOIFSExtractor.java:261)
    at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedOfficeDoc(AbstractPOIFSExtractor.java:146)
    at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:229)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:175)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)

------------------------------------------------------------------------------------

WARN Ignoring unexpected exception while parsing summary entry SummaryInformation
java.lang.IllegalArgumentException: The document is really a XLS file
    at org.apache.poi.poifs.filesystem.DirectoryNode.getEntry(DirectoryNode.java:322)
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:82)
    at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:73)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:155)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:104)
    at org.apache.tika.extractor.EmbeddedDocumentUtil.parseEmbedded(EmbeddedDocumentUtil.java:220)
    at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedOfficeDoc(AbstractPOIFSExtractor.java:261)
    at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedOfficeDoc(AbstractPOIFSExtractor.java:146)
    at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:229)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:175)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)

> java.lang.IllegalArgumentException: The document is really a XLS file exception while parsing doc file
> ------------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-3829
>                 URL: https://issues.apache.org/jira/browse/TIKA-3829
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>   Affects Versions: 1.23
>            Reporter: Dhanabal
>            Priority: Major
>
> Getting the following exception while parsing a doc file:
> WARN Ignoring unexpected exception while parsing summary entry DocumentSummaryInformation
> java.lang.IllegalArgumentException: The document is really a XLS file
>     at org.apache.poi.poifs.filesystem.DirectoryNode.getEntry(DirectoryNode.java:322)
>     at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:82)
>     at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:74)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:155)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>
> What is the meaning of this exception? When will it be thrown?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)