[ https://issues.apache.org/jira/browse/TIKA-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273681#comment-14273681 ]
Nick Burch commented on TIKA-1512:
----------------------------------

What about subsequent runs - I'm wondering where the closing quote is for the hyperlink? Also, does the text contain the whole of the URL, or is it truncated at all?

If you open the file in Word and do a save-as, does the file then parse properly, or does the problem remain? If you open the file in Word, does the hyperlink work properly in Word?

> WordParser fails on many Word files
> -----------------------------------
>
>                 Key: TIKA-1512
>                 URL: https://issues.apache.org/jira/browse/TIKA-1512
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.5, 1.6, 1.7, 1.8
>         Environment: Linux 64bit
> OpenJDK Runtime Environment (IcedTea 2.4.4) (suse-24.13.5-x86_64)
> OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
> and
> java version "1.6.0"
> Java(TM) SE Runtime Environment
> IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 (JIT enabled, AOT enabled)
>            Reporter: F Seid
>            Assignee: Jukka Zitting
>
> WordParser fails on some Word files. A negative value is sent to substring.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)