[jira] [Resolved] (TIKA-1427) PDF Images don't appear in structured view

2014-10-01 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1427. Resolution: Fixed r1628707. Made inline image tags equivalent to those created by Word parser. Let

[jira] [Commented] (TIKA-1427) PDF Images don't appear in structured view

2014-10-01 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154876#comment-14154876 ] Hudson commented on TIKA-1427: SUCCESS: Integrated in tika-trunk-jdk1.7 #241 (See

[jira] [Commented] (TIKA-1427) PDF Images don't appear in structured view

2014-10-01 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154898#comment-14154898 ] Hudson commented on TIKA-1427: SUCCESS: Integrated in tika-trunk-jdk1.6 #219 (See

Re: OCR with tika-server

2014-10-01 Thread kevin slote
What I wrote there did have a typo in it. (It's not every day you get to embarrass yourself in front of a bunch of guys from NASA.) But that was not what I had in my terminal when I tested it. The actual PATH was:

Re: OCR with tika-server

2014-10-01 Thread Tyler Palsulich
Hmm. Can you run Tesseract on a simple PNG file? This example doesn't work very well for OCR... But, for the sake of example:
$ sudo apt-get install tesseract-ocr
$ java -jar tika-server/target/tika-server-1.7-SNAPSHOT.jar
// New terminal.
// Grab the Google logo.
$ curl -O

Re: OCR with tika-server

2014-10-01 Thread Ramirez, Paul M (398J)
Nothing to be embarrassed about at all, Kevin. I actually thought maybe it was just a typo issue and I randomly happened to catch it. I've definitely done that one before myself. Bummed that was not the problem.
--Paul
On Oct 1, 2014, at 1:00 PM, kevin slote kslo...@gmail.com wrote: What

Re: OCR with tika-server

2014-10-01 Thread Mattmann, Chris A (3980)
What type of image is it, Kevin? If it’s a TIFF, you need to install Tesseract with special libtiff parameters. See: https://gist.github.com/henrik/1967035 Can you parse the image file with Tesseract by itself, without Tika’s tmp image?
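The check suggested above (running Tesseract directly, outside Tika) might look something like the following sketch. The helper names `on_path` and `ocr_direct` are hypothetical and not part of Tika or Tesseract; the only Tesseract usage assumed is the classic `tesseract <image> <outputbase>` form, which writes the recognized text to `<outputbase>.txt`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command resolves on the current PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: run Tesseract directly on an image, bypassing Tika.
# Usage: ocr_direct <image> <outputbase>   (text ends up in <outputbase>.txt)
ocr_direct() {
    if ! on_path tesseract; then
        # Print the PATH in effect so a typo or missing directory is easy to spot.
        echo "tesseract not found on PATH: $PATH" >&2
        return 127
    fi
    tesseract "$1" "$2"
}
```

If `ocr_direct logo.png out` succeeds in the same shell that launches tika-server, the Tesseract install and PATH are probably fine and the problem is more likely on the Tika side; if it reports that `tesseract` is missing, the printed PATH is the first thing to inspect.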

[jira] [Updated] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-01 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann updated TIKA-1422: Attachment: TIKA-1422.Mattmann.100114.patch.txt - patch gets tests passing and tries to add

[jira] [Updated] (TIKA-1423) Build a parser to extract data from GRIB formats

2014-10-01 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated TIKA-1423: Attachment: NLDAS_FORA0125_H.A20130112.1200.002.grb Here you go [~vinegh] Build a