[jira] [Commented] (TIKA-1152) Process loops infinitely on parsing of a CHM file
[ https://issues.apache.org/jira/browse/TIKA-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857418#comment-13857418 ]

Hong-Thai Nguyen commented on TIKA-1152:

Thanks [~jukkaz], I've checked on trunk. It seems OK now.

Process loops infinitely on parsing of a CHM file

Key: TIKA-1152
URL: https://issues.apache.org/jira/browse/TIKA-1152
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.4
Environment: Windows/Linux
Reporter: Hong-Thai Nguyen
Assignee: Jukka Zitting
Priority: Critical
Fix For: 1.5
Attachments: ChmLzxBlock.java.patch, eventcombmt.chm

When parsing [the attached CHM file|^eventcombmt.chm] (a Microsoft Help file), the Java process gets stuck.

{code}
Thread[main,5,main]
org.apache.tika.parser.chm.lzx.ChmLzxBlock.extractContent(ChmLzxBlock.java:203)
org.apache.tika.parser.chm.lzx.ChmLzxBlock.init(ChmLzxBlock.java:77)
org.apache.tika.parser.chm.core.ChmExtractor.extractChmEntry(ChmExtractor.java:338)
org.apache.tika.parser.chm.CHMDocumentInformation.getContent(CHMDocumentInformation.java:72)
org.apache.tika.parser.chm.CHMDocumentInformation.getText(CHMDocumentInformation.java:141)
org.apache.tika.parser.chm.CHM2XHTML.process(CHM2XHTML.java:34)
org.apache.tika.parser.chm.ChmParser.parse(ChmParser.java:51)
org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
org.apache.tika.parser.AbstractParser.parse(AbstractParser.java:53)
com.polyspot.document.converter.DocumentConverter.realizeConversion(DocumentConverter.java:192)
...
{code}

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
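For callers still on an affected version, one possible defensive measure is to run the parse call under a timeout so that a hang like the one in the stack trace does not block the calling thread forever. The sketch below is not the attached patch and not part of the Tika API; `ParseWithTimeout`, `runWithTimeout`, and the `stuckParse` task are hypothetical names, and the stuck task is only a stand-in for a parse that never returns. It uses the standard `java.util.concurrent` classes only.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ParseWithTimeout {

    /**
     * Runs the task on a daemon thread and waits at most the given time for its result.
     * Throws TimeoutException if the task has not finished in time.
     */
    static <T> T runWithTimeout(Callable<T> task, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        // Daemon thread: a worker that never terminates will not keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-worker");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the worker; code that never checks
            // the interrupt flag may keep running in the background.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a parse call that never returns.
        Callable<String> stuckParse = () -> {
            while (true) {
                Thread.sleep(10);
            }
        };
        try {
            runWithTimeout(stuckParse, 200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("parse timed out");
        }
    }
}
```

In real use the `Callable` would wrap the actual parse call. The timeout only unblocks the caller: a worker spinning in a tight loop that ignores interruption keeps consuming CPU, so isolating the parse in a separate process may be the safer option for untrusted files.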
[jira] [Commented] (TIKA-1152) Process loops infinitely on parsing of a CHM file
[ https://issues.apache.org/jira/browse/TIKA-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855528#comment-13855528 ]

Hong-Thai Nguyen commented on TIKA-1152:

[~gagravarr], or anyone else: could you have a look at the patch and integrate it into trunk before the 1.5 release, please? Thanks.
[jira] [Commented] (TIKA-1152) Process loops infinitely on parsing of a CHM file
[ https://issues.apache.org/jira/browse/TIKA-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716534#comment-13716534 ]

Nick Burch commented on TIKA-1152:

Can other tools parse the file without error? (I'm wondering whether it's a bug in our CHM processing or a faulty file.)
[jira] [Commented] (TIKA-1152) Process loops infinitely on parsing of a CHM file
[ https://issues.apache.org/jira/browse/TIKA-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716538#comment-13716538 ]

Hong-Thai Nguyen commented on TIKA-1152:

It's a bug in ChmLzxBlock.java, triggered by this faulty file. I'm ready to push a fix, but I don't have an ASF account on the Tika project.