I was wondering this myself, as we need to be able to process at least PDFs. I think the answer is yes, you do have to render to pixels first. At least that's the conclusion I've come to. I checked the Leptonica protos for a convert *from* PDF, but there doesn't seem to be one. They do have various methods to convert *to* PDF, but not the other way around. I guess that means we just use another open source library to do the rendering first.
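For what it's worth, here is a rough sketch of that render-then-OCR route in Python. It assumes Poppler's `pdftoppm` and the `tesseract` command-line tool are both installed and on the PATH. Poppler is just my pick for the "other open source library", and the function names and the 300 DPI default are made up for illustration, so treat it as a starting point rather than a solution.

```python
import subprocess
from pathlib import Path


def rasterize_cmd(pdf_path, out_prefix, dpi=300):
    """Build a pdftoppm command that renders each PDF page to a PNG.

    Assumption: Poppler's pdftoppm is installed. 300 DPI is only a guess
    at a reasonable resolution for OCR.
    """
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(out_prefix)]


def ocr_cmd(image_path, out_base):
    """Build a tesseract command; tesseract writes its text to <out_base>.txt."""
    return ["tesseract", str(image_path), str(out_base)]


def ocr_pdf(pdf_path, work_dir):
    """Render the PDF to page images, OCR each page, return one string per page."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    prefix = work_dir / "page"

    # Step 1: vector PDF -> pixel images (page-1.png, page-2.png, ...).
    subprocess.run(rasterize_cmd(pdf_path, prefix), check=True)

    texts = []
    # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
    for image in sorted(work_dir.glob("page-*.png")):
        out_base = image.with_suffix("")  # e.g. work_dir/page-1
        # Step 2: pixel image -> text file next to the image.
        subprocess.run(ocr_cmd(image, out_base), check=True)
        texts.append(Path(str(out_base) + ".txt").read_text(encoding="utf-8"))
    return texts
```

Calling `ocr_pdf("scan.pdf", "out")` would leave the page images and `.txt` files in `out/` and return the recognized text page by page.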
If you find a good solution, let me know. That's one of my upcoming tasks :)

On Tuesday, December 11, 2012 9:36:05 AM UTC-5, thanatos thanatica wrote:
> Unfortunately, I could not find a list of supported image input types anywhere, so I just started to play with what I can produce. I tried SVG, EPS, PDF, PS, and ODG, but all of them report as unsupported.
> So the question remains: which vector type can I use as input? Or do I have to convert to a pixel image first?
> I would think that supplying vector images would greatly increase accuracy...

