[jira] [Commented] (TIKA-818) Allow PDFBox to be used with RandomAccessFile vs RandomAccessBuffer to allow for a memory vs performance tradeoff

2012-01-24 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192072#comment-13192072 ] Nick Burch commented on TIKA-818: - Are you sure the scratchFile should be the real file

[jira] [Commented] (TIKA-849) Identify and parse the Apple iBooks format

2012-01-24 Thread Andrew Jackson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192074#comment-13192074 ] Andrew Jackson commented on TIKA-849: - I'm not that familiar with the content handling

[jira] [Issue Comment Edited] (TIKA-849) Identify and parse the Apple iBooks format

2012-01-24 Thread Andrew Jackson (Issue Comment Edited) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192078#comment-13192078 ] Andrew Jackson edited comment on TIKA-849 at 1/24/12 11:54 AM: ---

[jira] [Commented] (TIKA-849) Identify and parse the Apple iBooks format

2012-01-24 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192079#comment-13192079 ] Nick Burch commented on TIKA-849: - We might be able to use the same handler, but it'd need

[jira] [Resolved] (TIKA-839) TikaException with testPPT.potm in Tika GUI / CLI

2012-01-24 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-839. - Resolution: Fixed Fix Version/s: 1.1 TikaException with testPPT.potm in Tika GUI / CLI

[jira] [Commented] (TIKA-839) TikaException with testPPT.potm in Tika GUI / CLI

2012-01-24 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192107#comment-13192107 ] Nick Burch commented on TIKA-839: - Thanks for this, applied r1235233.

[jira] [Created] (TIKA-850) Consistent way to supply document passwords to parsers

2012-01-24 Thread Nick Burch (Created) (JIRA)
Consistent way to supply document passwords to parsers -- Key: TIKA-850 URL: https://issues.apache.org/jira/browse/TIKA-850 Project: Tika Issue Type: Improvement Components:

[jira] [Commented] (TIKA-850) Consistent way to supply document passwords to parsers

2012-01-24 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192184#comment-13192184 ] Nick Burch commented on TIKA-850: - Does anyone have a feeling for if the password should be

[jira] [Resolved] (TIKA-802) NullPointerException when parsing iWork files

2012-01-24 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-802. - Resolution: Cannot Reproduce NullPointerException when parsing iWork files

[jira] [Commented] (TIKA-760) NPE XHTMLContentHandler in characters Method

2012-01-24 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192189#comment-13192189 ] Nick Burch commented on TIKA-760: - NPE check added in r1235284. NPE

[jira] [Resolved] (TIKA-760) NPE XHTMLContentHandler in characters Method

2012-01-24 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-760. - Resolution: Fixed Fix Version/s: 1.1 NPE XHTMLContentHandler in characters Method

[jira] [Resolved] (TIKA-643) tika hangs parsing doc file (attached)

2012-01-24 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-643. - Resolution: Fixed Fix Version/s: 1.0 I believe this was fixed in Tika 1.0, by a POI upgrade

[jira] [Resolved] (TIKA-616) ArrayIndexOutOfBoundsException from POI

2012-01-24 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-616. - Resolution: Fixed Fix Version/s: 1.0 I believe this was fixed in Tika 1.0 by a POI upgrade (it's

[jira] [Resolved] (TIKA-637) Need API to get list of embedded documents

2012-01-24 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-637. - Resolution: Not A Problem Closing as Not A Problem, as this is handled by supplying a recursing parser on

[jira] [Commented] (TIKA-675) PackageExtractor should track names of recursively nested resources

2012-01-24 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192200#comment-13192200 ] Nick Burch commented on TIKA-675: - We could probably do this with a wrapper parser, which

[jira] [Commented] (TIKA-241) Rar archive support

2012-01-24 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192203#comment-13192203 ] Nick Burch commented on TIKA-241: - Has there been any luck getting junrar into Maven Central

[jira] [Resolved] (TIKA-195) MSWORD: Tika ignores text from Pieces

2012-01-24 Thread Nick Burch (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-195. - Resolution: Later I believe that all text from Word files is now extracted, and has been for at least a

buildbot failure in ASF Buildbot on tika-trunk

2012-01-24 Thread buildbot
The Buildbot has detected a new failure on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/720 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp:

[jira] [Commented] (TIKA-770) New ODF metadata keys

2012-01-24 Thread Nick Burch (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192242#comment-13192242 ] Nick Burch commented on TIKA-770: - I've updated the three remaining ones in r1235321, along

buildbot success in ASF Buildbot on tika-trunk

2012-01-24 Thread buildbot
The Buildbot has detected a restored build on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/721 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source