[jira] [Commented] (TIKA-711) Word parser doesn't extract optional hyphen correctly

2011-10-01 Thread Michael McCandless (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118905#comment-13118905 ] Michael McCandless commented on TIKA-711: - Curiously, if I use POI's WordToTextConve

[HEADS UP] Added Tika ApacheCon NA 2011 news item

2011-10-01 Thread Mattmann, Chris A (388J)
Hey Guys, I updated the Tika website to mention the ApacheCon NA 2011 talk I'm giving. Just a heads up, thanks! Cheers, Chris ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA

[jira] [Commented] (TIKA-735) OpenOffice parser: embedded OLE docs are extracted at the end, as extra ...

2011-10-01 Thread Jukka Zitting (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118861#comment-13118861 ] Jukka Zitting commented on TIKA-735: A parser should always produce valid XHTML output.

[RESULT] [VOTE] Add Any23 to the Apache Incubator

2011-10-01 Thread Mattmann, Chris A (388J)
Hi Everyone, This VOTE has passed with the following tallies: +1 IPMC (binding) Chris Mattmann Christian Grobmeier Tommaso Teofili Julien Nioche Jukka Zitting Bertrand Delacretaz Nick Kew Olivier Lamy Paul Ramirez +1 Community Lewis John McGibbney Raffaele P. Guidi Andy Seaborne Michele Mostar

[jira] [Resolved] (TIKA-720) EBCDIC encoding not detected

2011-10-01 Thread Michael McCandless (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved TIKA-720. - Resolution: Fixed Fix Version/s: 0.10 I think this was fixed in 0.10?

[jira] [Commented] (TIKA-737) Use (Incubating) ODFToolkit to improve ODF file format processing

2011-10-01 Thread Michael McCandless (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118834#comment-13118834 ] Michael McCandless commented on TIKA-737: - +1, sounds great! > Use

[jira] [Commented] (TIKA-735) OpenOffice parser: embedded OLE docs are extracted at the end, as extra ...

2011-10-01 Thread Michael McCandless (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118833#comment-13118833 ] Michael McCandless commented on TIKA-735: - Ahhh, I see. So it looks like our defaul

[jira] [Commented] (TIKA-736) OpenOffice parser: master footer text isn't extracted

2011-10-01 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118828#comment-13118828 ] Nick Burch commented on TIKA-736: - Looking at our current parser, we don't touch the styles

[jira] [Created] (TIKA-737) Use (Incubating) ODFToolkit to improve ODF file format processing

2011-10-01 Thread Nick Burch (Created) (JIRA)
Use (Incubating) ODFToolkit to improve ODF file format processing - Key: TIKA-737 URL: https://issues.apache.org/jira/browse/TIKA-737 Project: Tika Issue Type: Improvement

[jira] [Commented] (TIKA-736) OpenOffice parser: master footer text isn't extracted

2011-10-01 Thread Michael McCandless (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118825#comment-13118825 ] Michael McCandless commented on TIKA-736: - OK that makes sense; hopefully it's not t

Re: Jenkins build became unstable: Tika-trunk » Apache Tika parsers #657

2011-10-01 Thread Michael McCandless
Sorry, this was my bad -- failed to add the test file for the new test case. Should be fixed now... Mike McCandless http://blog.mikemccandless.com On Sat, Oct 1, 2011 at 7:07 AM, Apache Jenkins Server wrote: > See >

[jira] [Commented] (TIKA-736) OpenOffice parser: master footer text isn't extracted

2011-10-01 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118783#comment-13118783 ] Nick Burch commented on TIKA-736: - It's probably not worth putting too much work into our Op

[jira] [Commented] (TIKA-735) OpenOffice parser: embedded OLE docs are extracted at the end, as extra ...

2011-10-01 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118782#comment-13118782 ] Nick Burch commented on TIKA-735: - I think this is a Tika CLI issue, rather than a Parser on

Jenkins build is back to stable : Tika-trunk » Apache Tika parsers #658

2011-10-01 Thread Apache Jenkins Server
See

Jenkins build is back to stable : Tika-trunk #658

2011-10-01 Thread Apache Jenkins Server
See

Jenkins build became unstable: Tika-trunk #657

2011-10-01 Thread Apache Jenkins Server
See

buildbot success in ASF Buildbot on tika-trunk

2011-10-01 Thread buildbot
The Buildbot has detected a restored build on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/531 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp

[jira] [Resolved] (TIKA-632) Rtf parsing ignores links

2011-10-01 Thread Michael McCandless (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved TIKA-632. - Resolution: Fixed Fix Version/s: 1.0 > Rtf parsing ignores links > -

buildbot failure in ASF Buildbot on tika-trunk

2011-10-01 Thread buildbot
The Buildbot has detected a new failure on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/530 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [

[jira] [Updated] (TIKA-736) OpenOffice parser: master footer text isn't extracted

2011-10-01 Thread Michael McCandless (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated TIKA-736: Attachment: TIKA-736.patch testMasterFooter.odp Patch with failing test case.

[jira] [Created] (TIKA-736) OpenOffice parser: master footer text isn't extracted

2011-10-01 Thread Michael McCandless (Created) (JIRA)
OpenOffice parser: master footer text isn't extracted - Key: TIKA-736 URL: https://issues.apache.org/jira/browse/TIKA-736 Project: Tika Issue Type: Bug Components: parser

[jira] [Updated] (TIKA-735) OpenOffice parser: embedded OLE docs are extracted at the end, as extra ...

2011-10-01 Thread Michael McCandless (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated TIKA-735: Attachment: embeddedText.odp ODP document that leads to above text output from TikaCLI -x.

[jira] [Created] (TIKA-735) OpenOffice parser: embedded OLE docs are extracted at the end, as extra ...

2011-10-01 Thread Michael McCandless (Created) (JIRA)
OpenOffice parser: embedded OLE docs are extracted at the end, as extra ... Key: TIKA-735 URL: https://issues.apache.org/jira/browse/TIKA-735 Project: Tika