Hi everybody,

It seems poppler is unable to extract text from some PDF files:

http://iteisa.com/tmp/poppler-sample.pdf (11 MB)

pdftohtml from poppler 0.12.4 and 0.12.2 cannot extract the text, and evince displays the document correctly but cannot select its text. However, acroread displays and selects the text correctly, so it is normal, editable text and not an image.

Is this expected behaviour? Is there any workaround?

Everything seems OK with the file:

$ pdfinfo poppler-sample.pdf
Title:          untitled
Creator:        Adobe InDesign CS4 (6.0.4)
Producer:       Acrobat Distiller 9.0.0 (Windows)
CreationDate:   Wed May  5 09:35:12 2010
ModDate:        Wed May  5 09:35:12 2010
Tagged:         no
Pages:          208
Encrypted:      no
Page size:      595.276 x 841.89 pts (A4)
File size:      10536602 bytes
Optimized:      no
PDF version:    1.4
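
The symptom can also be checked from a script. What follows is only a rough sketch, not something taken from the tests above: it assumes pdftotext from poppler's utilities is on the PATH, and the function names and the default 5-page range are made up for illustration.

```python
import subprocess


def has_extractable_text(text: str) -> bool:
    """Return True if the extracted text has any non-whitespace character."""
    # pdftotext separates pages with form feeds (\f); str.strip() treats
    # them as whitespace, so a run of empty pages counts as "no text".
    return bool(text.strip())


def extract_text(pdf_path: str, first_page: int = 1, last_page: int = 5) -> str:
    """Run pdftotext on a page range and return what it writes to stdout.

    Assumption: the pdftotext binary is installed and on the PATH.
    The trailing "-" asks pdftotext to write to stdout instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", "-f", str(first_page), "-l", str(last_page), pdf_path, "-"],
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext exits non-zero
    )
    return result.stdout.decode("utf-8", errors="replace")


def report(pdf_path: str) -> str:
    """Summarise whether any text came out of the first few pages."""
    text = extract_text(pdf_path)
    return "text found" if has_extractable_text(text) else "no text extracted"
```

Calling report("poppler-sample.pdf") on the file above would be expected to return "no text extracted" if pdftotext fails in the same way as pdftohtml, though that has not been verified here.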

Best regards,

--
Jaime GÓMEZ OBREGÓN (ja...@iteisa.com)
http://www.iteisa.com
Teléfono: +34 902055277
ITEISA DESARROLLO Y SISTEMAS, S.L
Benidorm, 8 bajo. 39005 Santander.
España
_______________________________________________
poppler mailing list
poppler@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/poppler