[ https://issues.apache.org/jira/browse/TIKA-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362582#comment-14362582 ]

Tyler Palsulich commented on TIKA-1176:
---------------------------------------

That section of the parser seems to be completely commented out. The attached 
file now causes an index-out-of-bounds exception (wrapped in a TikaException):
{code}
Exception in thread "main" org.apache.tika.exception.TikaException
        at org.apache.tika.parser.chm.core.ChmExtractor.extractChmEntry(ChmExtractor.java:360)
        at org.apache.tika.parser.chm.ChmParser.parse(ChmParser.java:79)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:270)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:270)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:153)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:450)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:123)
{code}

> ChmDirectoryListingSet does not correctly enumerate directory entries
> ---------------------------------------------------------------------
>
>                 Key: TIKA-1176
>                 URL: https://issues.apache.org/jira/browse/TIKA-1176
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.4
>            Reporter: Doug Martin
>         Attachments: HelpStudioSample.chm
>
>
> The ChmDirectoryListingSet.enumerateOneSegment method does not correctly 
> enumerate directory entries when ChmCommons.indexOf returns -1 for work data 
> or user data.  Here is the offending code:
> {code}
>                 int indexWorkData = ChmCommons.indexOf(dir_chunk,
>                         "::".getBytes());
>                 int indexUserData = ChmCommons.indexOf(dir_chunk,
>                         "/".getBytes());
>                 if (indexUserData < indexWorkData)
>                     setPlaceHolder(indexUserData);
>                 else
>                     setPlaceHolder(indexWorkData);
>                 if (getPlaceHolder() > 0 ...
> {code}
> If either indexUserData or indexWorkData is -1, that value will be set as the 
> placeholder index, resulting in the method returning without processing any 
> entries.
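
A minimal sketch of one way the selection described above could be guarded,
assuming (as the report states) that ChmCommons.indexOf returns -1 when the
pattern is absent. The class PlaceholderSketch and its helpers indexOf and
choosePlaceholder are hypothetical stand-ins for illustration; they are not
Tika code and not necessarily the fix that was applied:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Tika.
final class PlaceholderSketch {

    // Stand-in for ChmCommons.indexOf: first index of needle in haystack,
    // or -1 if it does not occur (assumed behavior, per the report).
    static int indexOf(byte[] haystack, byte[] needle) {
        if (needle.length == 0 || haystack.length < needle.length) {
            return -1;
        }
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    // Picks the smaller index, but ignores an index that is -1.
    // Returns -1 only when neither pattern was found.
    static int choosePlaceholder(int indexUserData, int indexWorkData) {
        if (indexUserData < 0) {
            return indexWorkData;
        }
        if (indexWorkData < 0) {
            return indexUserData;
        }
        return Math.min(indexUserData, indexWorkData);
    }

    public static void main(String[] args) {
        // Made-up chunk that contains "/" but no "::".
        byte[] chunk = "xx/index.htm".getBytes(StandardCharsets.US_ASCII);
        int indexWorkData = indexOf(chunk, "::".getBytes(StandardCharsets.US_ASCII));
        int indexUserData = indexOf(chunk, "/".getBytes(StandardCharsets.US_ASCII));

        // The quoted logic would select -1 here, because 2 < -1 is false.
        System.out.println(choosePlaceholder(indexUserData, indexWorkData)); // prints 2
    }
}
```

With the quoted comparison, a chunk that has "/" but no "::" ends up with a
placeholder of -1; the guarded version falls back to whichever index was
actually found.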



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
