Hi Rob,

Yes, fax2tiff could be one way.
But I'm actually extracting the CCITTFaxDecode stream data directly from the 
PDFs and running OCR on it, so I'd like to do the whole conversion in memory 
instead of writing intermediate files.
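
For reference, here is a rough sketch of the in-memory idea: since TIFF can carry CCITT-compressed data, the raw stream can be given a minimal TIFF header without decoding it. The function name and parameters are made up for illustration. It assumes the stream is Group 4 (T.6) encoded, fits in a single strip, and that the width and height are read from the PDF image dictionary; the polarity may need adjusting depending on the PDF's decode parameters.

```python
import struct

# TIFF field type codes
SHORT, LONG = 3, 4


def ccitt_g4_to_tiff(data: bytes, width: int, height: int, photometric: int = 0) -> bytes:
    """Return raw CCITT Group 4 bytes prefixed with a minimal TIFF header.

    Hypothetical helper, not part of any library. Assumes a single strip
    and Group 4 encoding. photometric: 0 = WhiteIsZero, 1 = BlackIsZero.
    """
    # Little-endian layout: 8-byte file header, 2-byte entry count,
    # eight 12-byte IFD entries, 4-byte offset of the next IFD (0 = none).
    fmt = "<2sHIH" + "HHII" * 8 + "I"
    header_size = struct.calcsize(fmt)  # image data starts right after the header
    header = struct.pack(
        fmt,
        b"II", 42, 8,                   # byte order, magic number, offset of first IFD
        8,                              # number of IFD entries (tags in ascending order)
        256, LONG, 1, width,            # ImageWidth
        257, LONG, 1, height,           # ImageLength
        258, SHORT, 1, 1,               # BitsPerSample: 1 (bi-level)
        259, SHORT, 1, 4,               # Compression: 4 = CCITT Group 4
        262, SHORT, 1, photometric,     # PhotometricInterpretation
        273, LONG, 1, header_size,      # StripOffsets: where the data begins
        278, LONG, 1, height,           # RowsPerStrip: whole image in one strip
        279, LONG, 1, len(data),        # StripByteCounts
        0,                              # no further IFDs
    )
    return header + data
```

The returned bytes can be wrapped in io.BytesIO and handed to whatever TIFF reader feeds the OCR step, so nothing has to touch the disk. If the image comes out inverted, passing photometric=1 is worth a try.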

Thanks

On Friday, September 7, 2012 12:52:12 PM UTC+8, rkomar wrote:
>
> On Thu, 6 Sep 2012, newtotesseract wrote: 
>
> > Hi Nick, 
> > I tried passing in the CCITTFaxDecode data to tesseract, 
> > but it was not detected as TIFF. 
> > 
> > It seems like CCITT fax is not the same as TIFF. 
> > 
> > A Google search showed me that a few other people have faced 
> > the same issue (e.g. 
> > http://stackoverflow.com/questions/2641770/extracting-image-from-pdf-with-ccittfaxdecode-filter). 
> > 
> > If you know how we can convert the CCITT fax data to TIFF or 
> > JPEG, it would be really helpful. 
> > 
> > Many thanks for your help and time. 
> > 
> > Thanks, 
> > - ganesh 
>
> TIFF files can contain many kinds of image data, compressed 
> with any of several compression schemes.  CCITT _is_ one 
> of the supported compression types.  If you can install 
> ImageMagick, then you can use the 'convert' program in that 
> package to create your TIFF file.  For example: 
>
> > convert in.fax -compress Group4 out.tif 
>
> converts the file to TIFF using CCITT Group4 compression 
> in the output. 
>
> Or, if you have libtiff installed, then you can use 
> the fax2tiff program to do the conversion. 
>
> Don't convert to jpeg; it isn't meant for bi-level images. 
>
> Cheers, 
> Rob Komar 
>

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
