[jira] [Created] (TIKA-1314) An inappropriate comment of CharsetDetector.detect()

2014-05-29 Thread Yi EungJun (JIRA)
Yi EungJun created TIKA-1314: Summary: An inappropriate comment of CharsetDetector.detect() Key: TIKA-1314 URL: https://issues.apache.org/jira/browse/TIKA-1314 Project: Tika Issue Type: Bug

Re: [jira] [Commented] (TIKA-93) OCR support

2014-05-29 Thread Oleg Tikhonov
Guys, Tesseract is itself a project written in C/C++ and must be compiled separately for each platform. Personally, I would make it a requirement for those who want to work with Tesseract. I am not sure that putting Tesseract in the sources is the right way to go. >>How good tesseract is - depend

[jira] [Commented] (TIKA-93) OCR support

2014-05-29 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012810#comment-14012810 ] Luis Filipe Nassif commented on TIKA-93: Thank you very much [~tpalsulich] for includ

[jira] [Updated] (TIKA-93) OCR support

2014-05-29 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Palsulich updated TIKA-93: Attachment: TesseractOCR_Tyler.patch Awesome! I attached another patch which includes TesseractOCRPars

[jira] [Commented] (TIKA-1313) XSL-FO detection

2014-05-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012626#comment-14012626 ] Hudson commented on TIKA-1313: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #10 (See [https://bu

[jira] [Commented] (TIKA-1312) FDF files detection

2014-05-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012625#comment-14012625 ] Hudson commented on TIKA-1312: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #10 (See [https://bu

[jira] [Commented] (TIKA-1313) XSL-FO detection

2014-05-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012584#comment-14012584 ] Hudson commented on TIKA-1313: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #10 (See [https://bu

[jira] [Commented] (TIKA-1312) FDF files detection

2014-05-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012583#comment-14012583 ] Hudson commented on TIKA-1312: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #10 (See [https://bu

[jira] [Commented] (TIKA-93) OCR support

2014-05-29 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012525#comment-14012525 ] Luis Filipe Nassif commented on TIKA-93: It was not intentional, the patch should hav

[jira] [Resolved] (TIKA-1312) FDF files detection

2014-05-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1312. -- Resolution: Fixed Fix Version/s: 1.6 Thanks for the patch, applied in r1598329. > FDF files dete

[jira] [Commented] (TIKA-1313) XSL-FO detection

2014-05-29 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012511#comment-14012511 ] Ken Krugler commented on TIKA-1313: --- Is there a test file you could provide that's valid

[jira] [Resolved] (TIKA-1313) XSL-FO detection

2014-05-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1313. -- Resolution: Fixed Fix Version/s: 1.6 Thanks for the patch, applied in r1598329. > XSL-FO detecti

[jira] [Commented] (TIKA-1294) Add ability to turn off extraction of PDXObjectImages (TIKA-1268) from PDFs

2014-05-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012493#comment-14012493 ] Hudson commented on TIKA-1294: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #9 (See [https://bui

[jira] [Created] (TIKA-1313) XSL-FO detection

2014-05-29 Thread Marco Quaranta (JIRA)
Marco Quaranta created TIKA-1313: Summary: XSL-FO detection Key: TIKA-1313 URL: https://issues.apache.org/jira/browse/TIKA-1313 Project: Tika Issue Type: Improvement Components: det

[jira] [Updated] (TIKA-1312) FDF files detection

2014-05-29 Thread Marco Quaranta (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Quaranta updated TIKA-1312: Issue Type: Improvement (was: Bug) > FDF files detection

[jira] [Created] (TIKA-1312) FDF files detection

2014-05-29 Thread Marco Quaranta (JIRA)
Marco Quaranta created TIKA-1312: Summary: FDF files detection Key: TIKA-1312 URL: https://issues.apache.org/jira/browse/TIKA-1312 Project: Tika Issue Type: Bug Components: detector

[jira] [Commented] (TIKA-1294) Add ability to turn off extraction of PDXObjectImages (TIKA-1268) from PDFs

2014-05-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012467#comment-14012467 ] Hudson commented on TIKA-1294: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #9 (See [https://bui

[jira] [Commented] (TIKA-93) OCR support

2014-05-29 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012452#comment-14012452 ] Tyler Palsulich commented on TIKA-93: - Thanks for the help! I applied the patch. But, the

Re: Hello

2014-05-29 Thread Tyler Palsulich
Thanks, Tim! I'm more of an IntelliJ guy myself. IDEA has a feature where you can check out a project directly from Subversion, which works pretty well. The `mvn test -DfailIfNoTests=false -Dtest=org.apache.tika.{...}` command is very helpful with testing. :) Is there a good way to run the current

[jira] [Commented] (TIKA-1294) Add ability to turn off extraction of PDXObjectImages (TIKA-1268) from PDFs

2014-05-29 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012403#comment-14012403 ] Tim Allison commented on TIKA-1294: --- Doh! Thank you. Mods in r1598305. > Add ability to

[jira] [Commented] (TIKA-1294) Add ability to turn off extraction of PDXObjectImages (TIKA-1268) from PDFs

2014-05-29 Thread Ray Gauss II (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012393#comment-14012393 ] Ray Gauss II commented on TIKA-1294: Hi [~talli...@apache.org], The changes look good,

[jira] [Commented] (TIKA-1204) DWFX files detection

2014-05-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012389#comment-14012389 ] Nick Burch commented on TIKA-1204: -- Really we need a file that either you yourself produce

[jira] [Commented] (TIKA-1204) DWFX files detection

2014-05-29 Thread Marco Quaranta (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012373#comment-14012373 ] Marco Quaranta commented on TIKA-1204: -- I am sorry for replying late, I found a sample

[jira] [Commented] (TIKA-241) Rar archive support

2014-05-29 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012297#comment-14012297 ] Nick Burch commented on TIKA-241: - I'd been hoping Jukka would. Sorry, I have just done so now

RE: [DISCUSS] Centralizing JSON handling of Metadata

2014-05-29 Thread Nick Burch
On Wed, 28 May 2014, Ray Gauss II wrote: However, that sort of modularization is probably a broader discussion than what we need for this particular issue, so between those two I’d vote for tika-serialization. Tika-CLI and Tika-Server will likely want to depend on all of the serialisation met