[jira] [Resolved] (TIKA-434) Bug in TagSoup causes IOException

2011-07-04 Thread Maxim Valyanskiy (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Valyanskiy resolved TIKA-434.
---
Resolution: Fixed
Fix Version/s: 1.0
Assignee: Maxim Valyanskiy
> Bug in TagSou

Re: svn commit: r1142632 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/html/HtmlParser.java test/java/org/apache/tika/parser/html/HtmlParserTest.java test/resources/test-document

2011-07-04 Thread Jukka Zitting
Hi,
On Mon, Jul 4, 2011 at 1:43 PM, wrote:
> +                    @Override
> +                    public void scan(Reader r0, ScanHandler h) throws IOException, SAXException {
> +                        super.scan(new PushbackReader(new BufferedReader(r0), 2), h);
> +                    }
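The quoted commit wraps the incoming Reader in a PushbackReader with a pushback buffer of 2 characters. A minimal sketch of why that size can matter, using only java.io and not Tika or TagSoup: the class and method names below (PushbackDemo, peekTwo, canUnreadTwo) are hypothetical, and the idea that the IOException in TIKA-434 was a pushback overflow is an assumption that this listing does not confirm.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Hypothetical demo class; not part of Tika or TagSoup.
final class PushbackDemo {

    // Reads up to two characters, then pushes them back so the
    // next read sees the stream unchanged.
    static String peekTwo(PushbackReader reader) throws IOException {
        char[] buf = new char[2];
        int n = reader.read(buf, 0, 2);
        if (n <= 0) {
            return "";
        }
        reader.unread(buf, 0, n);
        return new String(buf, 0, n);
    }

    // Returns true if a reader with the given pushback buffer size
    // accepts two characters being pushed back at once.
    static boolean canUnreadTwo(int pushbackSize) throws IOException {
        PushbackReader reader = new PushbackReader(
                new BufferedReader(new StringReader("<!")), pushbackSize);
        char[] buf = new char[2];
        int n = reader.read(buf, 0, 2);
        try {
            reader.unread(buf, 0, n);
            return true;
        } catch (IOException overflow) {
            // PushbackReader signals a full pushback buffer with an IOException.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        PushbackReader reader = new PushbackReader(
                new BufferedReader(new StringReader("<p>hello</p>")), 2);
        System.out.println("peeked: " + peekTwo(reader));
        System.out.println("size 1 accepts two chars: " + canUnreadTwo(1));
        System.out.println("size 2 accepts two chars: " + canUnreadTwo(2));
    }
}
```

A PushbackReader built with the single-argument constructor has a one-character buffer, so pushing back two characters fails. The explicit size of 2 in the commit avoids that case.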

[jira] [Commented] (TIKA-679) Proposal for PRT Parser

2011-07-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059601#comment-13059601 ] Nick Burch commented on TIKA-679: - I've added the detector part in r1142795, thanks for the

[jira] [Commented] (TIKA-679) Proposal for PRT Parser

2011-07-04 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059623#comment-13059623 ] Nick Burch commented on TIKA-679: - I've committed a first stab at a PRT parser in r1142817,

Build failed in Jenkins: Tika-trunk » Apache Tika parsers #573

2011-07-04 Thread Apache Jenkins Server
See Changes: [nick] TIKA-679 CADKey PRT parser -- [JENKINS] Archiving /home/hudson/hudson-slave/workspace/Tika-trunk/trunk/tika-core/pom.xml to /home/hudson/hudson/jobs/

Build failed in Jenkins: Tika-trunk #573

2011-07-04 Thread Apache Jenkins Server
See Changes: [nick] TIKA-679 CADKey PRT parser -- [...truncated 98 lines...] [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 21 resources [INFO] Copying 3 resources [TASKS] Ski

Build failed in Jenkins: Tika-trunk » Apache Tika parsers #574

2011-07-04 Thread Apache Jenkins Server
See Changes: [nick] TIKA-679 Remove un-used import, and fix warning -- [JENKINS] Archiving /home/hudson/hudson-slave/workspace/Tika-trunk/trunk/tika-core/pom.xml to /hom

Build failed in Jenkins: Tika-trunk #574

2011-07-04 Thread Apache Jenkins Server
See Changes: [nick] TIKA-679 Remove un-used import, and fix warning -- [...truncated 1066 lines...] [TASKS] Found 0. [TASKS] Scanning folder '

[jira] [Issue Comment Edited] (TIKA-679) Proposal for PRT Parser

2011-07-04 Thread Troy Witthoeft (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059639#comment-13059639 ] Troy Witthoeft edited comment on TIKA-679 at 7/4/11 11:49 PM: --

[jira] [Commented] (TIKA-679) Proposal for PRT Parser

2011-07-04 Thread Troy Witthoeft (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059639#comment-13059639 ] Troy Witthoeft commented on TIKA-679: - I've narrowed the encoding down to CP437. CP437 c
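As a rough illustration of what decoding as CP437 means in Java, the sketch below decodes the same two bytes with two charsets. It is not the PRT parser's code: the class name Cp437Demo and the sample bytes are made up for the example, and "IBM437" is the JDK's name for CP437, which may be missing on a reduced runtime (hence the isSupported check).

```java
import java.nio.charset.Charset;

// Hypothetical demo class; not taken from the Tika PRT parser.
final class Cp437Demo {

    // Decodes raw bytes with the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Formats each character as a Unicode code point for unambiguous output.
    static String codePoints(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            out.append(String.format("U+%04X ", (int) c));
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        // In CP437, 0x81 is u-umlaut and 0x82 is e-acute.
        byte[] raw = {(byte) 0x81, (byte) 0x82};

        if (Charset.isSupported("IBM437")) {
            System.out.println("IBM437:     " + codePoints(decode(raw, "IBM437")));
        } else {
            System.out.println("IBM437 is not available on this runtime");
        }
        // The same bytes are C1 control characters in ISO-8859-1.
        System.out.println("ISO-8859-1: " + codePoints(decode(raw, "ISO-8859-1")));
    }
}
```

Bytes above 0x7F are where the two charsets diverge, so text decoded with the wrong one is garbled only for accented and box-drawing characters, while plain ASCII still reads correctly.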

[jira] [Commented] (TIKA-679) Proposal for PRT Parser

2011-07-04 Thread Troy Witthoeft (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059640#comment-13059640 ] Troy Witthoeft commented on TIKA-679: - Nick, Tomorrow, I will test your code with a lar