[Scribus] Kerning problems with pdftotext

[email protected] Wed, 23 May 2007 20:45:07 +0100

We are using pdftotext to strip the text out of PDFs in preparation for
search indexing and more. This works well except with our own PDFs
(produced in Scribus), which are getting badly broken up; we suspect
kerning is the cause. The generated text is simply fragmented into
meaningless chunks. It remains in sequential order and some words are
fine, but in general the output is unusable.
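
To make the problem easier to reproduce, here is a rough sketch of how the
fragmentation could be measured. It assumes a pdftotext binary on the PATH
that accepts the -layout, -raw and -enc options and "-" for standard output.
The helper names (pdftotext_cmd, fragmentation_score, extract) and the
"two characters or fewer" heuristic are made up for illustration and are
not part of pdftotext or Scribus.

```python
import shutil
import subprocess


def pdftotext_cmd(pdf_path, mode=None):
    """Build a pdftotext command line that writes UTF-8 text to stdout.

    mode may be None (default ordering), "layout" or "raw".
    """
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if mode in ("layout", "raw"):
        cmd.append("-" + mode)
    cmd += [pdf_path, "-"]  # "-" asks pdftotext to write to stdout
    return cmd


def fragmentation_score(text):
    """Return the fraction of alphabetic tokens that are 1 or 2 characters long.

    This is a crude heuristic (an assumption, not a standard metric):
    text split apart between kerned glyphs tends to contain many very
    short tokens.
    """
    tokens = [t for t in text.split() if t.isalpha()]
    if not tokens:
        return 0.0
    short = sum(1 for t in tokens if len(t) <= 2)
    return short / len(tokens)


def extract(pdf_path, mode=None):
    """Run pdftotext on pdf_path and return the extracted text."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        pdftotext_cmd(pdf_path, mode),
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout
```

Comparing fragmentation_score(extract(path, mode)) for the default, "layout"
and "raw" modes on a Scribus PDF and on a PDF from another source would show
whether any mode reduces the breakage.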


We are using (the great) Bitstream Vera, which looks so good both on
screen and in print. However, we get the same effect when
we convert our text to Arial.

1. Has anybody experienced this?  Is this a pdftotext thing?
2. Are there alternative pdf-to-text parsers that anyone would recommend?

Lucien
Oxford Information Labs

