[jira] [Commented] (TIKA-1447) CHM parser: wrong directory list
[ https://issues.apache.org/jira/browse/TIKA-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221949#comment-14221949 ]

Bin Hawking commented on TIKA-1447:
-----------------------------------

Yes, it was meant to fix TIKA-1430, TIKA-1446, TIKA-1447, and TIKA-1448. I have added a new test case to validate the extracted directory list. Please see the messages in TIKA-1446 and the code.

CHM parser: wrong directory list
--------------------------------

Key: TIKA-1447
URL: https://issues.apache.org/jira/browse/TIKA-1447
Project: Tika
Issue Type: Bug
Affects Versions: 1.7
Reporter: Bin Hawking
Priority: Critical

The CHM parser gets the wrong directory list for a CHM file (e.g. testChm2.chm in tika-parser's test-resources):
1. Duplicate entries (mostly from PMGI chunks, which should have been ignored).
2. Invalid entries (usually with an unreadable entry name).
3. Missed entries (sometimes as in TIKA-1176).

I have fixed it (to some degree) by using the PMGL header to find the directory chunks and the meaningful part of each chunk.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TIKA-1447) CHM parser: wrong directory list
[ https://issues.apache.org/jira/browse/TIKA-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214535#comment-14214535 ]

Hong-Thai Nguyen commented on TIKA-1447:
----------------------------------------

[~binhawking],
Did the work on TIKA-1446 fix this issue? Any chance to double-check again?
Thanks,
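
For readers following the thread, the approach named in the issue description (use the PMGL header to pick out listing chunks and the meaningful part of each) might be sketched roughly as below. This is not the Tika patch. It assumes the chunk layout given in the unofficial CHM format notes: a 20-byte PMGL header with a little-endian free-space length at offset 4, and entries made of a name length, a UTF-8 name, and three ENCINT values. The class and method names (`ChmDirSketch`, `listEntryNames`, `readEncInt`) are hypothetical.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; layout constants are assumptions, not taken from Tika.
final class ChmDirSketch {
    // Assumed PMGL header: signature, free-space length, unused, prev, next (4 bytes each).
    static final int PMGL_HEADER_LEN = 20;

    // Reads an ENCINT: 7 bits per byte, most significant group first,
    // high bit set on every byte except the last.
    static long readEncInt(ByteBuffer buf) {
        long value = 0;
        byte b;
        do {
            b = buf.get();
            value = (value << 7) | (b & 0x7F);
        } while ((b & 0x80) != 0);
        return value;
    }

    // Collects entry names from PMGL (listing) chunks only. PMGI (index)
    // chunks are skipped, and the free space at the end of each listing
    // chunk is never parsed as entries.
    static List<String> listEntryNames(byte[] dirData, int chunkSize) {
        if (chunkSize < PMGL_HEADER_LEN) {
            throw new IllegalArgumentException("chunk size smaller than PMGL header");
        }
        List<String> names = new ArrayList<>();
        for (int start = 0; start + chunkSize <= dirData.length; start += chunkSize) {
            ByteBuffer chunk = ByteBuffer.wrap(dirData, start, chunkSize)
                    .slice().order(ByteOrder.LITTLE_ENDIAN);
            byte[] sig = new byte[4];
            chunk.get(sig);
            if (!"PMGL".equals(new String(sig, StandardCharsets.US_ASCII))) {
                continue; // PMGI or unknown chunk: ignore
            }
            int freeSpace = chunk.getInt(); // field at offset 4
            int end = chunkSize - freeSpace;
            if (freeSpace < 0 || end < PMGL_HEADER_LEN) {
                continue; // implausible header: skip the chunk
            }
            chunk.position(PMGL_HEADER_LEN);
            chunk.limit(end); // parse only the meaningful part
            try {
                while (chunk.hasRemaining()) {
                    int nameLen = (int) readEncInt(chunk);
                    if (nameLen <= 0 || nameLen > chunk.remaining()) {
                        break; // invalid entry: stop parsing this chunk
                    }
                    byte[] name = new byte[nameLen];
                    chunk.get(name);
                    readEncInt(chunk); // content section
                    readEncInt(chunk); // offset
                    readEncInt(chunk); // length
                    names.add(new String(name, StandardCharsets.UTF_8));
                }
            } catch (BufferUnderflowException e) {
                // Truncated entry: keep the names read so far from this chunk.
            }
        }
        return names;
    }
}
```

Under these assumptions, skipping PMGI chunks addresses the duplicate entries, and limiting the parse to the region before the free space addresses the unreadable names. A fuller implementation would also follow the previous/next chunk links in the header rather than scanning chunks linearly.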