On 06/01/2010 10:25 AM, Jonathan Kew wrote:
> On 31 May 2010, at 22:13, Pablo Rodríguez wrote:
>> [...]
>> If you copy the resulting text (from
>> http://www.ousia.tk/wrong-letterspace.pdf), you will see that only the second
>> line is properly typeset, or at least, there are no blank spaces between
>> letters.
>> I guess this might be a cause of the wrong hyphenation when using
>> LetterSpace. (BTW, loading polyglossia makes no difference.)
>> Have I hit a bug in LetterSpace? Do you know any way to avoid this?
>
> The PDF looks correct to me; where LetterSpace=12 is in effect, the letters are
> more widely spaced, and where LetterSpace=0, they're not. I don't see a bug
> here. Or am I missing something?

Thanks for your reply, Jonathan.
I'm not especially interested in LetterSpace itself, but in hyphenation with
LetterSpace (as you can see at http://www.ousia.tk/grammatike.pdf).
I thought that the issue described above might be causing the wrong
hyphenation (but I was mistaken).

> If you're specifically concerned about what happens when you use a viewer to select and
> copy the text from this PDF into an editor... well... that's a chancy operation. It
> worked fine for me with Acrobat (no extra spaces), but other viewers may give different
> results. Basically, this is a poorly-defined operation. As TeX does not use "space
> characters" between words, there is no clear indication in the PDF data of where the
> word boundaries should be, and so the viewer has to guess based on the glyph positions.
> That works most of the time for simple running text, but modifying the letter spacing
> carries a pretty high risk of confusing it.
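
To make the guessing concrete, here is a minimal Python sketch of a gap-based heuristic of the kind a viewer might use. It is not the algorithm of Acrobat, evince, or any other real viewer; the names `Glyph`, `layout`, and `extract`, the glyph widths, the word gap, and the 1.0 pt threshold are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the glyph, in points (hypothetical units)
    width: float  # advance width of the glyph, in points


def layout(text, letter_space=0.0, glyph_width=5.0, word_gap=3.0):
    """Place glyphs left to right.

    A space in the input only moves the pen; no space glyph is emitted,
    mimicking a PDF that carries no space characters between words.
    """
    glyphs = []
    x = 0.0
    for ch in text:
        if ch == " ":
            x += word_gap
            continue
        glyphs.append(Glyph(ch, x, glyph_width))
        x += glyph_width + letter_space
    return glyphs


def extract(glyphs, threshold=1.0):
    """Rebuild text, guessing a word boundary wherever the horizontal gap
    between two consecutive glyphs exceeds `threshold` (an assumed value)."""
    out = []
    prev_end = None
    for g in glyphs:
        if prev_end is not None and g.x - prev_end > threshold:
            out.append(" ")
        out.append(g.char)
        prev_end = g.x + g.width
    return "".join(out)


normal = extract(layout("is difficult"))
spaced = extract(layout("is difficult", letter_space=1.2))
print(normal)  # is difficult
print(spaced)  # i s d i f f i c u l t
```

With no extra letter spacing, only the inter-word gap exceeds the threshold, so the words come back intact. Once every inter-letter gap is wider than the threshold, the same heuristic reports a boundary after each letter, which resembles the copied text below.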
BTW, acroread-9.3 in Ubuntu-10.04 copies the following text (the same
text that evince 2.30 copies):
χαλεπὰ τ ὰ κ α λ ά
χ α λ ε π ὰ τ ὰ κ α λ ά
χ α λ ε π ὰ τὰ καλά
Beauty i s d i ffi c u l t
B e a u t y i s d i ffi c u l t
B e a u t y is difficult
The general issue with LetterSpace is not text extraction as such, but
that the spurious spaces get in the way of searching for a given text.
Thanks for your help,
Pablo