Unfortunately, that did not do it either. I did:
$ export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/urs/bin/tesseract

Here is the output from printenv:

kslote@ubuntu:~/tika/tika$ printenv
SHELL=/bin/bash
USERNAME=kslote
XDG_CONFIG_DIRS=/etc/xdg/xdg-gnome:/etc/xdg
DESKTOP_SESSION=gnome
PATH=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/urs/bin/tesseract
PWD=/home/kslote/tika/tika
HOME=/home/kslote
LOGNAME=kslote
_=/usr/bin/printenv

On Tue, Sep 30, 2014 at 4:13 PM, Tyler Palsulich <tpalsul...@gmail.com> wrote:

> Hi,
>
> Hmm. Could you try adding tesseract to your PATH? How did you install
> Tesseract? You should be able to do a straightforward
> `sudo apt-get install tesseract-ocr`. After that, the OCR tests should pass.
> We're still running into TIKA-1422, where a mail test fails. But, you can
> run just the OCR tests with
> `mvn test -Dtest=org.apache.tika.parser.ocr.TesseractOCRTest -DfailIfNoTests=false`.
>
> Let me know if that works for you!
> Tyler
>
> On Tue, Sep 30, 2014 at 4:00 PM, kevin slote <kslo...@gmail.com> wrote:
>
> > I am working on ubuntu 10.4. and I am having some trouble.
> > Tesseract is installed correctly, but just doing a clone from the repo and
> > installing with maven, I am getting some errors.
> >
> > This is before I did anything with tesseract installed.
> >
> > Failed tests: testPPTXOCR(org.apache.tika.parser.ocr.TesseractOCRTest):
> > Check for the image's text.
> > testDOCXOCR(org.apache.tika.parser.ocr.TesseractOCRTest)
> > testPDFOCR(org.apache.tika.parser.ocr.TesseractOCRTest)
> >
> > Next I hard coded the tesseractPath:
> >
> > I went into the TesseractOCRConfig.java and hard coded 'tesseractPath.'
> > Then all tests passed and it built successfully, but then I went to post
> > some TIFFs to the server.
> > That didn't work. So I tried adding some System.out.println("hello world")
> > (a little crude I know) inside the unit tests to confirm that tesseract
> > was working correctly.
> > It looks like something happens in the unit test in
> > TesseractOCRTest.java
> > on the line that says TesseractOCRConfig config = new
> > TesseractOCRConfig();. Printing to stdout before works, but I get nothing
> > after. That happens before the assumeTrue(canRun(config));. So an exception
> > is not getting raised.
> >
> > Then once everything is built, OCR does not work. That was why I figured I
> > would ask to see if I missed some sort of configuration step in building
> > it.
> >
> > Thanks a ton.
> >
> > On Tue, Sep 30, 2014 at 2:57 PM, Mattmann, Chris A (3980) <
> > chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> > > Dear Kevin,
> > >
> > > Sure, it already works :) 1.7-SNAPSHOT.
> > >
> > > See this wiki page:
> > >
> > > https://wiki.apache.org/tika/TikaOCR
> > >
> > > I'd be happy to discuss more.
> > >
> > > Thanks!
> > >
> > > Cheers,
> > > Chris
> > >
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > Chris Mattmann, Ph.D.
> > > Chief Architect
> > > Instrument Software and Science Data Systems Section (398)
> > > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > > Office: 168-519, Mailstop: 168-527
> > > Email: chris.a.mattm...@nasa.gov
> > > WWW: http://sunset.usc.edu/~mattmann/
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > Adjunct Associate Professor, Computer Science Department
> > > University of Southern California, Los Angeles, CA 90089 USA
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >
> > > -----Original Message-----
> > > From: kevin slote <kslo...@gmail.com>
> > > Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
> > > Date: Tuesday, September 30, 2014 at 8:52 AM
> > > To: "dev@tika.apache.org" <dev@tika.apache.org>
> > > Subject: OCR with tika-server
> > >
> > > >Hello all,
> > > >
> > > >I have been testing out the integration of tika with tesseract.
> > > >I was wondering if there is a way to get tika-server to run with
> > > >tesseract's OCR capabilities?
> > > >
> > > >Best
> > > >
> > > >Kevin Slote
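
For reference, each PATH entry is normally a directory that contains executables, not the path of an executable itself. One way to sanity-check an export like the one at the top of this message is to list the entries that are not existing directories and then ask the shell whether `tesseract` resolves. The following is only a minimal bash sketch under that assumption; the function name `check_path_entries` is made up for illustration and is not part of Tika or Tesseract.

```shell
#!/usr/bin/env bash
# Sketch: flag PATH entries that are not existing directories, then
# check whether the shell can resolve the tesseract command.
# check_path_entries is a hypothetical helper name, not a Tika tool.

check_path_entries() {
  local path_string="$1"
  local -a entries
  local entry
  # Split the colon-separated string into an array without globbing.
  IFS=':' read -r -a entries <<< "$path_string"
  for entry in "${entries[@]}"; do
    # Skip empty entries (an empty entry means the current directory).
    [ -n "$entry" ] || continue
    if [ ! -d "$entry" ]; then
      echo "not a directory: $entry"
    fi
  done
}

# Report suspicious entries in the current PATH.
check_path_entries "$PATH"

# Ask the shell whether tesseract can be found at all.
if command -v tesseract >/dev/null 2>&1; then
  echo "tesseract found at: $(command -v tesseract)"
else
  echo "tesseract not found on PATH"
fi
```

With the PATH shown in this message, the last entry would be reported unless `/urs/bin/tesseract` happens to exist as a directory on that machine.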