[jira] [Commented] (TIKA-1176) ChmDirectoryListingSet does not correctly enumerate directory entries

2014-10-13 Thread Hong-Thai Nguyen (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169146#comment-14169146
 ] 

Hong-Thai Nguyen commented on TIKA-1176:


Hi [~mdgeek], thanks for offering the code and the test file. Unfortunately, 
this check raised another exception on this file:
{code}
The full exception stack trace is included below:

org.apache.tika.exception.TikaException
at org.apache.tika.parser.chm.core.ChmExtractor.extractChmEntry(ChmExtractor.java:355)
at org.apache.tika.parser.chm.ChmParser.parse(ChmParser.java:70)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:247)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:247)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.gui.TikaGUI.handleStream(TikaGUI.java:326)
at org.apache.tika.gui.TikaGUI.openFile(TikaGUI.java:285)
at org.apache.tika.gui.ParsingTransferHandler.importFiles(ParsingTransferHandler.java:94)
at org.apache.tika.gui.ParsingTransferHandler.importData(ParsingTransferHandler.java:77)
at javax.swing.TransferHandler.importData(TransferHandler.java:755)
at javax.swing.TransferHandler$DropHandler.drop(TransferHandler.java:1478)
at java.awt.dnd.DropTarget.drop(DropTarget.java:434)
at javax.swing.TransferHandler$SwingDropTarget.drop(TransferHandler.java:1203)
at sun.awt.dnd.SunDropTargetContextPeer.processDropMessage(SunDropTargetContextPeer.java:519)
at sun.awt.dnd.SunDropTargetContextPeer$EventDispatcher.dispatchDropEvent(SunDropTargetContextPeer.java:832)
at sun.awt.dnd.SunDropTargetContextPeer$EventDispatcher.dispatchEvent(SunDropTargetContextPeer.java:756)
at sun.awt.dnd.SunDropTargetEvent.dispatch(SunDropTargetEvent.java:30)
at java.awt.Component.dispatchEventImpl(Component.java:4517)
at java.awt.Container.dispatchEventImpl(Container.java:2097)
at java.awt.Component.dispatchEvent(Component.java:4488)
at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4575)
at java.awt.LightweightDispatcher.processDropTargetEvent(Container.java:4310)
at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4161)
at java.awt.Container.dispatchEventImpl(Container.java:2083)
at java.awt.Window.dispatchEventImpl(Window.java:2489)
at java.awt.Component.dispatchEvent(Component.java:4488)
at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:674)
at java.awt.EventQueue.access$400(EventQueue.java:81)
at java.awt.EventQueue$2.run(EventQueue.java:633)
at java.awt.EventQueue$2.run(EventQueue.java:631)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:87)
at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:98)
at java.awt.EventQueue$3.run(EventQueue.java:647)
at java.awt.EventQueue$3.run(EventQueue.java:645)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.AccessControlContext$1.doIntersectionPrivilege(AccessControlContext.java:87)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:644)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:269)
at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:184)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:174)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:169)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:161)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:122)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.tika.parser.chm.core.ChmCommons.copyOfRange(ChmCommons.java:342)
at org.apache.tika.parser.chm.core.ChmCommons.getChmBlockSegment(ChmCommons.java:108)
at org.apache.tika.parser.chm.core.ChmExtractor.extractChmEntry(ChmExtractor.java:337)
... 43 more
{code} 

Our CHM parser is quite complex. Could you provide a complete fix, along with 
a test that checks the expected output content for your file?

Thanks,

 ChmDirectoryListingSet does not correctly enumerate directory entries
 -

 Key: TIKA-1176
 URL: https://issues.apache.org/jira/browse/TIKA-1176
 Project: Tika
  Issue Type: Bug
  Components: parser
Affects Versions: 1.4
Reporter: Doug Martin
 Attachments: HelpStudioSample.chm


 

[jira] [Commented] (TIKA-1176) ChmDirectoryListingSet does not correctly enumerate directory entries

2013-10-01 Thread Doug Martin (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783218#comment-13783218
 ] 

Doug Martin commented on TIKA-1176:
---

The following change fixes the problem:
{code}
if (indexUserData < indexWorkData || indexWorkData == -1) {
    setPlaceHolder(indexUserData);
} else {
    setPlaceHolder(indexWorkData);
}
{code}
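
If the comparison in the condition above is read as less-than, it covers a missing work-data marker (indexWorkData == -1) but would still select -1 when only indexUserData is -1. A minimal sketch of a selection that handles both cases follows, assuming -1 means "not found". The class and method names (PlaceHolderIndex, firstFound) are hypothetical and not part of Tika; this is an illustration, not the committed fix.

```java
public class PlaceHolderIndex {

    /**
     * Returns the smaller of the two indexes that is not negative,
     * or -1 when neither marker was found.
     * Hypothetical helper, not an existing Tika method.
     */
    static int firstFound(int indexUserData, int indexWorkData) {
        if (indexUserData < 0) {
            // User-data marker missing: fall back to the work-data index
            // (which may itself be -1).
            return indexWorkData;
        }
        if (indexWorkData < 0) {
            // Work-data marker missing: use the user-data index.
            return indexUserData;
        }
        // Both markers present: the earlier one wins.
        return Math.min(indexUserData, indexWorkData);
    }

    public static void main(String[] args) {
        System.out.println(firstFound(7, -1));
        System.out.println(firstFound(-1, 4));
        System.out.println(firstFound(7, 4));
        System.out.println(firstFound(-1, -1));
    }
}
```

With a helper like this, the caller would pass the result to setPlaceHolder and treat -1 as "no entries in this chunk".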

 ChmDirectoryListingSet does not correctly enumerate directory entries
 -

 Key: TIKA-1176
 URL: https://issues.apache.org/jira/browse/TIKA-1176
 Project: Tika
  Issue Type: Bug
  Components: parser
Affects Versions: 1.4
Reporter: Doug Martin

 ChmDirectoryListingSet.enumerateOneSegment method does not correctly 
 enumerate directory entries when ChmCommons.indexOf returns -1 for work data 
 or user data.  Here is the offending code:
 {code}
 int indexWorkData = ChmCommons.indexOf(dir_chunk,
         "::".getBytes());
 int indexUserData = ChmCommons.indexOf(dir_chunk,
         "/".getBytes());
 if (indexUserData < indexWorkData)
     setPlaceHolder(indexUserData);
 else
     setPlaceHolder(indexWorkData);
 if (getPlaceHolder() > 0 ...
 {code}
 If either indexUserData or indexWorkData is -1, that value will be set as the 
 placeholder index, resulting in the method returning without processing any 
 entries.
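
The failure mode described above can be reproduced outside Tika with a small sketch. It assumes the comparison is a less-than test, so the smaller index is chosen. The indexOf method here is a naive stand-in for ChmCommons.indexOf, the class name PlaceHolderRepro is hypothetical, and the sample bytes are made up for illustration rather than taken from a real CHM directory chunk.

```java
import java.nio.charset.StandardCharsets;

public class PlaceHolderRepro {

    /** Naive byte-pattern search (stand-in for ChmCommons.indexOf); returns -1 when absent. */
    static int indexOf(byte[] data, byte[] pattern) {
        for (int i = 0; i + pattern.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    /** Mirrors the selection as described in the report: the smaller index wins. */
    static int selectPlaceHolder(byte[] chunk) {
        int indexWorkData = indexOf(chunk, "::".getBytes(StandardCharsets.US_ASCII));
        int indexUserData = indexOf(chunk, "/".getBytes(StandardCharsets.US_ASCII));
        return (indexUserData < indexWorkData) ? indexUserData : indexWorkData;
    }

    public static void main(String[] args) {
        // Made-up sample: contains "/" but no "::".
        byte[] onlyUser = "PMGL/index.html".getBytes(StandardCharsets.US_ASCII);
        // Made-up sample: contains both markers.
        byte[] both = "ab::DataSpace/x".getBytes(StandardCharsets.US_ASCII);

        System.out.println(selectPlaceHolder(onlyUser));
        System.out.println(selectPlaceHolder(both));
    }
}
```

For the first sample the missing "::" marker yields -1, and because -1 is the smaller value it becomes the placeholder even though "/" was found.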



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (TIKA-1176) ChmDirectoryListingSet does not correctly enumerate directory entries

2013-10-01 Thread Nick Burch (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783286#comment-13783286
 ] 

Nick Burch commented on TIKA-1176:
--

Any chance you could upload a small sample file that shows the problem? We 
could then use that in a unit test to verify the fix, and also to ensure it 
stays fixed!
