I am converting PDF files to fixed-layout ePub files, which mainly means
converting PDF to HTML.

I noticed something strange.


When converting PDF files produced by Scribus, the HTML displays very well,
but when I copy/paste selected text, there are no spaces in the text!

E.g.:


- On the screen you see: "The red car was behind the house."

- The copy/paste gives: "Theredcarwasbehindthehouse."

I found a PDF file produced by InDesign, and with that file the problem is
not there.

After using Acrobat to inspect the fonts embedded in the PDF files, I noticed:


- InDesign: the fonts contain <SPACE> (code 20 in hex, 32 in decimal).

- Scribus: the fonts do not contain <SPACE> (code 20 in hex, 32 in
decimal).

I am using PDFTron to convert PDF files to ePub files. When I use the
pdf2htmlEX tool (available for Linux and Windows), the problem is not there.

It seems that PDFTron only inserts a space in the extracted text when
character code 32 is actually present in the PDF text.
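
To illustrate how an extractor could recover spaces even when no code 32
is present, here is a minimal sketch that infers word boundaries from the
horizontal gaps between glyphs. This is only an assumption about the general
technique; it is not the actual algorithm of PDFTron or pdf2htmlEX. The
names Glyph, rebuild_text, place_words and the gap_ratio threshold are all
hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str     # Unicode character the glyph maps to
    x: float      # left edge of the glyph on the line
    width: float  # advance width of the glyph, same units as x


def rebuild_text(glyphs, gap_ratio=0.25):
    """Join glyphs of one line, adding a space wherever the gap is large."""
    out = []
    prev = None
    for g in glyphs:
        if prev is not None:
            gap = g.x - (prev.x + prev.width)
            # Hypothetical rule: a gap wider than gap_ratio times the
            # previous glyph's width counts as a word boundary.
            if gap > gap_ratio * prev.width and " " not in (prev.char, g.char):
                out.append(" ")
        out.append(g.char)
        prev = g
    return "".join(out)


def place_words(words, glyph_width=5.0, word_gap=3.0):
    """Hypothetical helper: position words with no space glyph at all."""
    glyphs, x = [], 0.0
    for word in words:
        for ch in word:
            glyphs.append(Glyph(ch, x, glyph_width))
            x += glyph_width
        x += word_gap  # the pen simply moves; no code 32 is emitted
    return glyphs


line = place_words(["The", "red", "car"])
raw_text = "".join(g.char for g in line)  # what a naive extractor returns
fixed_text = rebuild_text(line)           # spaces inferred from positions
```

With this made-up layout, the naive join yields the run-together text
described above, while the gap-based pass restores the word boundaries.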

Why are there no spaces with code 32 in the PDF produced by Scribus?

I know there are many different space characters: U+0020 SPACE, U+00A0
NO-BREAK SPACE, U+2000 EN QUAD (1 en = 1/2 em), U+2001 EM QUAD (1 em), etc.
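
For reference, a short sketch that lists the characters Unicode classifies
as space separators (general category "Zs"), using Python's standard
unicodedata module. The function name space_separators and the upper limit
of U+3000 are arbitrary choices made for this illustration.

```python
import unicodedata


def space_separators(limit=0x3000):
    """Return (code point, name) pairs for every 'Zs' character up to limit."""
    found = []
    for cp in range(limit + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Zs":
            found.append((f"U+{cp:04X}", unicodedata.name(ch)))
    return found


spaces = space_separators()
for code, name in spaces:
    print(code, name)
```

A converter that only tests for code 32 would miss every other entry in
this list, as well as spacing produced purely by glyph positioning.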

Thanks,

Eric