[jira] [Commented] (TIKA-901) Provide version number in tika-server

2012-04-27 Thread Ingo Renner (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263415#comment-13263415 ] Ingo Renner commented on TIKA-901: Thanks Chris for rocking even more. I'm just taking sma

[jira] [Commented] (TIKA-902) Problem with TimeZone when update poi to 3.8 from 3.6

2012-04-27 Thread Andrey Plotnikov (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263552#comment-13263552 ] Andrey Plotnikov commented on TIKA-902: Thanks > Problem with TimeZo

RE: [metadata] roadmap proposal available on the wiki

2012-04-27 Thread Joerg Ehrlich
Hi Antoni, > The roadmap doesn't give much detail about the intended vocabularies. > Dublin core is great, but what else? Joerg? What other kinds of metadata > information would you like to extract with Tika, and what vocabularies would > you like to use to express them? > > At Adobe, you'll li

RE: [metadata] roadmap proposal available on the wiki

2012-04-27 Thread Joerg Ehrlich
+1 This does indeed look like a good combination. Jörg -----Original Message----- From: Mattmann, Chris A (388J) [mailto:chris.a.mattm...@jpl.nasa.gov] Sent: Friday, 27 April 2012 01:33 To: Subject: Re: [metadata] roadmap proposal available on the wiki Hi Antoni, Precisely! :) That would be

[jira] [Resolved] (TIKA-861) Parse links in PDF

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-861. Resolution: Fixed. Thanks, patches committed in r1331434. One thing to note is that links are extracted for

Build failed in Jenkins: Tika-trunk #838

2012-04-27 Thread Apache Jenkins Server
Changes: [nick] TIKA-861 Patch from Ryan Quam to enable extracting PDF Links. (Links are extracted for now at the end of the page, further work will be needed to match them to the text they apply to)

Re: Build failed in Jenkins: Tika-trunk #838

2012-04-27 Thread Jukka Zitting
Hi, On Fri, Apr 27, 2012 at 4:48 PM, Apache Jenkins Server wrote: > message : Failed to execute goal org.apache.rat:apache-rat-plugin:0.7:check > (default) on project tika-server: Too many unapproved licenses: 1 This is still caused by the extra pom file written by the shade plugin in tika-serve

Re: Build failed in Jenkins: Tika-trunk #838

2012-04-27 Thread Mattmann, Chris A (388J)
Hey Jukka, In r1331457, I disabled tika-server build from the pom which should make Jenkins happy for now. I have no clue how to fix the shade plugin since I didn't do it before :), but if no one fixes it by next week (early), I'll research, investigate and address the problem. Take care dood.

[jira] [Created] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Gabriel Valencia (JIRA)
Gabriel Valencia created TIKA-903: Summary: NPE thrown with password protected Pages file Key: TIKA-903 URL: https://issues.apache.org/jira/browse/TIKA-903 Project: Tika Issue Type: Bug

[jira] [Updated] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-903: Attachment: testPagesVariousPwdProtected.pages. Password for this file is: tika

[jira] [Updated] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-903: Attachment: (was: testPagesVariousPwdProtected.pages) > NPE thrown with password protecte

[jira] [Updated] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-903: Attachment: testPagesVariousPwdProtected.pages. Password for this file is: tika

[jira] [Commented] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263784#comment-13263784 ] Nick Burch commented on TIKA-903: This file certainly isn't protected using regular zip pas

[jira] [Created] (TIKA-904) Pages documents created in Layout mode not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
Gabriel Valencia created TIKA-904: Summary: Pages documents created in Layout mode not supported Key: TIKA-904 URL: https://issues.apache.org/jira/browse/TIKA-904 Project: Tika Issue Type: B

[jira] [Updated] (TIKA-904) Pages documents created in Layout mode not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-904: Attachment: testPagesCanvasJIRA.pages. Sample Layout editing mode document > Page

Jenkins build is back to normal : Tika-trunk #839

2012-04-27 Thread Apache Jenkins Server

[jira] [Created] (TIKA-905) Embedded text boxes and shapes with text not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
Gabriel Valencia created TIKA-905: Summary: Embedded text boxes and shapes with text not supported Key: TIKA-905 URL: https://issues.apache.org/jira/browse/TIKA-905 Project: Tika Issue Type:

[jira] [Updated] (TIKA-905) Embedded text boxes and shapes with text not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-905: Attachment: testPagesEmbeddedJIRA.pages. Contains various embedded objects including text boxes an

[jira] [Updated] (TIKA-904) Pages documents created in Layout mode not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-904: Labels: iwork (was: ) > Pages documents created in Layout mode not supported

[jira] [Updated] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-903: Labels: iwork (was: ) > NPE thrown with password protected Pages file

[jira] [Commented] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263798#comment-13263798 ] Nick Burch commented on TIKA-903: As of r1331503 these should no longer break. The iWorks p

[jira] [Updated] (TIKA-905) Embedded text boxes and shapes with text not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-905: Labels: iwork (was: ) > Embedded text boxes and shapes with text not supported

[jira] [Updated] (TIKA-906) Headers, footers, and footnotes not extracted from Pages documents

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-906: Attachment: testPagesHeadersFootersFootnotesJIRA.pages. Contains header text, footer text (includi

[jira] [Created] (TIKA-906) Headers, footers, and footnotes not extracted from Pages documents

2012-04-27 Thread Gabriel Valencia (JIRA)
Gabriel Valencia created TIKA-906: Summary: Headers, footers, and footnotes not extracted from Pages documents Key: TIKA-906 URL: https://issues.apache.org/jira/browse/TIKA-906 Project: Tika

[jira] [Updated] (TIKA-906) Headers, footers, and footnotes not extracted from Pages documents

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-906: Issue Type: Improvement (was: Bug) > Headers, footers, and footnotes not extracted from Page

[jira] [Updated] (TIKA-905) Embedded text boxes and shapes with text not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-905: Issue Type: Improvement (was: Bug). I'm new to JIRA, so please change if I'm wrong. I figure this

[jira] [Updated] (TIKA-904) Pages documents created in Layout mode not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-904: Issue Type: Improvement (was: Bug) > Pages documents created in Layout mode not supported

[jira] [Created] (TIKA-907) Comments embedded in Pages documents not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
Gabriel Valencia created TIKA-907: Summary: Comments embedded in Pages documents not supported Key: TIKA-907 URL: https://issues.apache.org/jira/browse/TIKA-907 Project: Tika Issue Type: Imp

[jira] [Updated] (TIKA-907) Comments embedded in Pages documents not supported

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriel Valencia updated TIKA-907: Attachment: testPagesShareiWorkJIRA.pages, testPagesCommentsJIRA.pages. Pages docum

Build failed in Jenkins: Tika-trunk #840

2012-04-27 Thread Apache Jenkins Server
Changes: [nick] TIKA-903 Avoid breaking on Password Protected iWorks files. We can't parse them yet though, as we don't know how the encryption works. [...truncated 360 lines...] Tests run: 8, F

[jira] [Commented] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264008#comment-13264008 ] Nick Burch commented on TIKA-903: On a related note, it's not looking hopeful for getting a

[jira] [Commented] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Gabriel Valencia (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264018#comment-13264018 ] Gabriel Valencia commented on TIKA-903: Yes, TIKA-402 has a URL to a page that *used*

[jira] [Commented] (TIKA-906) Headers, footers, and footnotes not extracted from Pages documents

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264059#comment-13264059 ] Nick Burch commented on TIKA-906: Support added in r1331618. We can now get headers, footer

[jira] [Resolved] (TIKA-906) Headers, footers, and footnotes not extracted from Pages documents

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-906. Resolution: Fixed; Fix Version/s: 1.2 > Headers, footers, and footnotes not extracted from Pages

[jira] [Commented] (TIKA-903) NPE thrown with password protected Pages file

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264067#comment-13264067 ] Nick Burch commented on TIKA-903: Ah, good spot. That page has gone, but if you view it in

[jira] [Commented] (TIKA-876) Signed pdf parsing

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264112#comment-13264112 ] Nick Burch commented on TIKA-876: We still can't help you very much without a (small) sampl

[jira] [Commented] (TIKA-907) Comments embedded in Pages documents not supported

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264126#comment-13264126 ] Nick Burch commented on TIKA-907: Support added in r1331640. We now collect the annotations

[jira] [Resolved] (TIKA-907) Comments embedded in Pages documents not supported

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-907. Resolution: Fixed; Fix Version/s: 1.2 > Comments embedded in Pages documents not supported

[jira] [Commented] (TIKA-905) Embedded text boxes and shapes with text not supported

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264129#comment-13264129 ] Nick Burch commented on TIKA-905: Are you able to identify where in the file these text box

[jira] [Commented] (TIKA-904) Pages documents created in Layout mode not supported

2012-04-27 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264133#comment-13264133 ] Nick Burch commented on TIKA-904: Any chance you could compare two simple documents, one wi