I am converting PDF files to fixed-layout ePub files, which mainly means converting PDF to HTML.
I noticed something strange. When converting PDF files produced by Scribus, the HTML displays very well, but when I copy and paste selected text, there are no spaces in the text! For example:

- On the screen you see: "The red car was behind the house."
- The copy/paste gives: "Theredcarwasbehindthehouse."

I found a PDF file produced by InDesign, and with that file the problem does not occur. After inspecting the fonts embedded in the PDF files with Acrobat, I noticed:

- InDesign: the fonts contain the <SPACE> character (code 0x20 hex, 32 decimal).
- Scribus: the fonts do not contain the <SPACE> character (code 0x20 hex, 32 decimal).

I am using PDFTron to convert PDF files to ePub files. When I use the pdf2htmlEX tool (available for Linux and Windows) instead, the problem does not occur. It seems that PDFTron only inserts a space in the text when code 32 is present in the text.

How come there are no spaces with code 32 in the PDF produced by Scribus? I know there are many different space characters: U+0020 SPACE, U+00A0 NO-BREAK SPACE, U+2000 EN QUAD (1 en = 1/2 em), U+2001 EM QUAD (1 em), etc.

Thanks,
Eric
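
For anyone who wants to reproduce the check on the copied text, here is a minimal Python sketch. It only counts the space-like code points listed above in a string pasted from the HTML output; it does not inspect the PDF or its fonts. The helper name `count_space_like` and the `SPACE_LIKE` list are made up for this illustration and are not part of PDFTron, pdf2htmlEX, or Scribus.

```python
import unicodedata

# Space-like code points named in the question.
# Illustrative only: Unicode defines more space characters than these four.
SPACE_LIKE = ["\u0020", "\u00A0", "\u2000", "\u2001"]


def count_space_like(text):
    """Return {"U+XXXX NAME": count} for each listed space character found in text."""
    counts = {}
    for ch in SPACE_LIKE:
        n = text.count(ch)
        if n:
            label = f"U+{ord(ch):04X} {unicodedata.name(ch)}"
            counts[label] = n
    return counts


if __name__ == "__main__":
    # What is seen on the screen versus what the copy/paste gives.
    print(count_space_like("The red car was behind the house."))
    print(count_space_like("Theredcarwasbehindthehouse."))
```

An empty result for the pasted string would suggest that the converter emitted no space character of any of these kinds, rather than substituting a different one such as U+00A0.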
