[jira] [Created] (TIKA-698) "Invalid UTF-16 surrogate detected:" parsing PowerPoint 97-2003

2011-08-26 Thread Pablo Queixalos (JIRA)
"Invalid UTF-16 surrogate detected:" parsing PowerPoint 97-2003 --- Key: TIKA-698 URL: https://issues.apache.org/jira/browse/TIKA-698 Project: Tika Issue Type: Bug Compone

[jira] [Updated] (TIKA-698) "Invalid UTF-16 surrogate detected:" parsing PowerPoint 97-2003

2011-08-26 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos updated TIKA-698: - Attachment: MS8.ppt > "Invalid UTF-16 surrogate detected:" parsing PowerPoint 97-2003 > ---

[jira] [Created] (TIKA-706) NPE Parsing MS PowerPoint 97-2003

2011-09-06 Thread Pablo Queixalos (JIRA)
NPE Parsing MS PowerPoint 97-2003 - Key: TIKA-706 URL: https://issues.apache.org/jira/browse/TIKA-706 Project: Tika Issue Type: Bug Components: parser Affects Versions: 0.1-incubating

[jira] [Created] (TIKA-707) IllegalArgumentException Parsing MS Word 97 - 2003

2011-09-06 Thread Pablo Queixalos (JIRA)
IllegalArgumentException Parsing MS Word 97 - 2003 -- Key: TIKA-707 URL: https://issues.apache.org/jira/browse/TIKA-707 Project: Tika Issue Type: Bug Components: parser Affects Ve

[jira] [Updated] (TIKA-706) NPE Parsing MS PowerPoint 97-2003

2011-09-06 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos updated TIKA-706: - Affects Version/s: (was: 0.1-incubating) 1.0 > NPE Parsing MS PowerPoint

[jira] [Updated] (TIKA-707) IllegalArgumentException Parsing MS Word 97 - 2003

2011-09-06 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos updated TIKA-707: - Affects Version/s: (was: 0.1-incubating) 1.0 > IllegalArgumentException

[jira] [Created] (TIKA-708) NPE Parsing MS Word 12.0.0

2011-09-06 Thread Pablo Queixalos (JIRA)
NPE Parsing MS Word 12.0.0 -- Key: TIKA-708 URL: https://issues.apache.org/jira/browse/TIKA-708 Project: Tika Issue Type: Bug Components: parser Affects Versions: 1.0 Reporter: Pablo Queixalo

[jira] [Updated] (TIKA-708) NPE Parsing MS Word 12.0.0

2011-09-06 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos updated TIKA-708: - Attachment: novos_estatutos.docx > NPE Parsing MS Word 12.0.0 > -- > >

[jira] [Commented] (TIKA-708) NPE Parsing MS Word 12.0.0

2011-09-06 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097972#comment-13097972 ] Pablo Queixalos commented on TIKA-708: -- Bug reported to the POI issue tracker: https://is

[jira] [Commented] (TIKA-706) NPE Parsing MS PowerPoint 97-2003

2011-09-06 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097975#comment-13097975 ] Pablo Queixalos commented on TIKA-706: -- Bug reported in the POI issue tracker: https://is

[jira] [Commented] (TIKA-707) IllegalArgumentException Parsing MS Word 97 - 2003

2011-09-06 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097979#comment-13097979 ] Pablo Queixalos commented on TIKA-707: -- Bug reported in the POI issue tracker: https://is

[jira] [Commented] (TIKA-707) IllegalArgumentException Parsing MS Word 97 - 2003

2011-09-07 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098947#comment-13098947 ] Pablo Queixalos commented on TIKA-707: -- Seems to be fixed in POI (r1166144). > Illegal

[jira] [Commented] (TIKA-706) NPE Parsing MS PowerPoint 97-2003

2011-09-12 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102532#comment-13102532 ] Pablo Queixalos commented on TIKA-706: -- Seems to be fixed in POI (r1169679). > NPE Pa

[jira] [Commented] (TIKA-708) NPE Parsing MS Word 12.0.0

2011-09-12 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102539#comment-13102539 ] Pablo Queixalos commented on TIKA-708: -- Seems to be fixed in POI (r1169679). > NPE Pars

[jira] [Resolved] (TIKA-708) NPE Parsing MS Word 12.0.0

2011-09-20 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos resolved TIKA-708. -- Resolution: Fixed Fix Version/s: 0.10 Tested with POI pre-beta5 (trunk) > NPE Parsing MS

[jira] [Resolved] (TIKA-707) IllegalArgumentException Parsing MS Word 97 - 2003

2011-09-20 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos resolved TIKA-707. -- Resolution: Fixed Fix Version/s: 0.10 Tested with POI pre-beta5 (trunk) > IllegalArgument

[jira] [Resolved] (TIKA-706) NPE Parsing MS PowerPoint 97-2003

2011-09-20 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos resolved TIKA-706. -- Resolution: Fixed Fix Version/s: 0.10 Tested with POI pre-beta5 (trunk) > NPE Parsing MS

[jira] [Created] (TIKA-727) Improve the outputed XHTML by HSLFExtractor

2011-09-22 Thread Pablo Queixalos (JIRA)
Improve the outputed XHTML by HSLFExtractor --- Key: TIKA-727 URL: https://issues.apache.org/jira/browse/TIKA-727 Project: Tika Issue Type: Improvement Components: parser Affects Versions

[jira] [Updated] (TIKA-727) Improve the outputed XHTML by HSLFExtractor

2011-09-22 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos updated TIKA-727: - Attachment: HSLFExtractor.java Parser implementation based on what the POI PowerPointExtractor does

[jira] [Issue Comment Edited] (TIKA-727) Improve the outputed XHTML by HSLFExtractor

2011-09-22 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112496#comment-13112496 ] Pablo Queixalos edited comment on TIKA-727 at 9/22/11 12:02 PM: --

[jira] [Commented] (TIKA-727) Improve the outputed XHTML by HSLFExtractor

2011-09-22 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112496#comment-13112496 ] Pablo Queixalos commented on TIKA-727: -- Great! (i) The non-breaking-space entities in

[jira] [Issue Comment Edited] (TIKA-727) Improve the outputed XHTML by HSLFExtractor

2011-09-22 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112496#comment-13112496 ] Pablo Queixalos edited comment on TIKA-727 at 9/22/11 12:12 PM: --

[jira] [Issue Comment Edited] (TIKA-727) Improve the outputed XHTML by HSLFExtractor

2011-09-22 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112496#comment-13112496 ] Pablo Queixalos edited comment on TIKA-727 at 9/22/11 12:27 PM: --

[jira] [Created] (TIKA-731) NPE in WordExtractor.handleParagraph()

2011-09-26 Thread Pablo Queixalos (JIRA)
NPE in WordExtractor.handleParagraph() -- Key: TIKA-731 URL: https://issues.apache.org/jira/browse/TIKA-731 Project: Tika Issue Type: Bug Components: parser Affects Versions: 0.10

[jira] [Updated] (TIKA-731) NPE in WordExtractor.handleParagraph()

2011-09-26 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos updated TIKA-731: - Attachment: document_proposition_referencement.doc Attachment #2 > NPE in WordExtractor.handlePara

[jira] [Updated] (TIKA-731) NPE in WordExtractor.handleParagraph()

2011-09-26 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo Queixalos updated TIKA-731: - Attachment: energie_nucleaire_france_fiche1.doc Throws NPE > NPE in WordExtractor.handleParagraph(

[jira] [Issue Comment Edited] (TIKA-731) NPE in WordExtractor.handleParagraph()

2011-09-26 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114567#comment-13114567 ] Pablo Queixalos edited comment on TIKA-731 at 9/26/11 9:57 AM: ---

[jira] [Commented] (TIKA-727) Improve the outputed XHTML by HSLFExtractor

2011-09-26 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114621#comment-13114621 ] Pablo Queixalos commented on TIKA-727: -- +1 on Jukka's comment. > Improve the outputed

[jira] [Commented] (TIKA-727) Improve the outputed XHTML by HSLFExtractor

2011-09-26 Thread Pablo Queixalos (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114663#comment-13114663 ] Pablo Queixalos commented on TIKA-727: -- bq. Looking at the html, there are still some n