Use the ImageMagick library.

--Sven

On Fri, Sep 7, 2012 at 2:23 AM, newtotesseract <[email protected]> wrote:
> Hi Rob,
>
> Yes, fax2tiff could be one way.
> But actually, I'm extracting the CCITTFaxDecode stream data from the PDFs
> and trying to extract OCR text from them.
> So, I am trying to do this conversion all in memory instead of writing to
> files.
>
> thanks
>
> On Friday, September 7, 2012 12:52:12 PM UTC+8, rkomar wrote:
>>
>> On Thu, 6 Sep 2012, newtotesseract wrote:
>>
>> > Hi Nick,
>> > I tried passing in the CCITTFaxDecode data to tesseract,
>> > but it was not detected as TIFF.
>> >
>> > It seems like CCITT fax is not same as TIFF.
>> >
>> > Google search showed me that few other people also faced
>> > same issue
>> > (e.g. "http://stackoverflow.com/questions/2641770/extracting-image-from-pdf-with-ccittfaxdecode-filter").
>> >
>> > If you know, how we can convert the CCITT-Fax to tiff or
>> > jpeg, it would be really helpful.
>> >
>> > Many thanks for your help and time.
>> >
>> > Thanks,
>> > - ganesh
>>
>> TIFF files can contain many kinds of image data compressed
>> with all sorts of types of compression. CCITT _is_ one
>> of the supported compression types. If you can install
>> ImageMagick, then you can use the 'convert' program in that
>> package to create your TIFF file. For example:
>>
>> > convert in.fax -compress Group4 out.tif
>>
>> converts the file to TIFF using CCITT Group4 compression
>> in the output.
>>
>> Or, if you have libtiff installed, then you can use
>> the fax2tiff program to do the conversion.
>>
>> Don't convert to jpeg; it isn't meant for bi-level images.
>>
>> Cheers,
>> Rob Komar
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
--
“All that is gold does not glitter,
not all those who wander are lost;
the old that is strong does not wither,
deep roots are not reached by the frost.
From the ashes a fire shall be woken,
a light from the shadows shall spring;
renewed shall be blade that was broken,
the crownless again shall be king.”
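
For the in-memory case raised in the thread, one approach that may work without writing any files is to prepend a minimal TIFF header to the raw CCITTFaxDecode stream, since TIFF supports CCITT compression natively. The following is only a sketch under several assumptions: the function name wrap_ccitt_as_tiff is made up for illustration, the stream is assumed to be Group 4 (K = -1 in the PDF's /DecodeParms), the data is treated as a single strip, and width and height are assumed to come from the /Width and /Height entries of the PDF image dictionary. The photometric value assumes the PDF default (BlackIs1 false); if the result looks inverted, that tag is the first thing to check.

```python
import struct


def wrap_ccitt_as_tiff(data: bytes, width: int, height: int, group: int = 4) -> bytes:
    """Prepend a minimal little-endian TIFF header to raw CCITT fax data.

    Hypothetical helper, not part of tesseract, libtiff, or ImageMagick.
    Assumes a single strip; group=4 means CCITT Group 4 (TIFF Compression 4).
    """
    num_entries = 8
    # 8-byte file header + 2-byte entry count + 12 bytes per entry
    # + 4-byte "next IFD" offset.
    header_size = 8 + 2 + 12 * num_entries + 4

    # (tag, field type, count, value); type 3 = SHORT, type 4 = LONG.
    # TIFF requires entries sorted by ascending tag number.
    entries = [
        (256, 4, 1, width),        # ImageWidth
        (257, 4, 1, height),       # ImageLength
        (258, 3, 1, 1),            # BitsPerSample: bi-level
        (259, 3, 1, group),        # Compression: 4 = CCITT Group 4
        (262, 3, 1, 0),            # PhotometricInterpretation: WhiteIsZero (assumed)
        (273, 4, 1, header_size),  # StripOffsets: data starts right after the header
        (278, 4, 1, height),       # RowsPerStrip: whole image in one strip
        (279, 4, 1, len(data)),    # StripByteCounts
    ]

    out = struct.pack("<2sHL", b"II", 42, 8)   # byte order, magic 42, offset of first IFD
    out += struct.pack("<H", len(entries))
    for tag, field_type, count, value in entries:
        out += struct.pack("<HHLL", tag, field_type, count, value)
    out += struct.pack("<L", 0)                # no further IFDs
    return out + data
```

The returned bytes form a TIFF image held entirely in memory, so they can be passed to whatever TIFF-capable reader is in use rather than being saved to disk first. Streams with K >= 0 are Group 3 and would need Compression 3 (plus T4Options for the 2-D variant), which this sketch does not handle.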

