https://bugs.documentfoundation.org/show_bug.cgi?id=161514

--- Comment #7 from David Huggins-Daines <d...@ecolingui.ca> ---
Ah, okay.  The "coalescing" should simply not be done, because the font
embedded in the PDF still contains the three separate characters
<01>(=U+0078) <02>(=U+030C) and <03>(=U+0075) for display.  The <02> character
is not there by mistake; it is the actual character in the font.
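
For illustration only, here is a minimal Python sketch of what "one ToUnicode
entry per character code" looks like for the three codes above.  The names
TO_UNICODE, bfchar_section and extract_text are hypothetical and are not taken
from the LibreOffice source; this only shows the shape of the mapping.

```python
# Hypothetical per-code table for the three codes named in this comment.
TO_UNICODE = {0x01: "\u0078", 0x02: "\u030C", 0x03: "\u0075"}


def bfchar_section(mapping):
    """Render the mapping as a ToUnicode bfchar section, one entry per code."""
    lines = [f"{len(mapping)} beginbfchar"]
    for code, text in sorted(mapping.items()):
        # ToUnicode destination strings are UTF-16BE, written as hex.
        hex_text = text.encode("utf-16-be").hex().upper()
        lines.append(f"<{code:02X}> <{hex_text}>")
    lines.append("endbfchar")
    return "\n".join(lines)


def extract_text(codes, mapping):
    """Map each character code to its own Unicode string, with no clustering."""
    return "".join(mapping[code] for code in codes)


print(bfchar_section(TO_UNICODE))
```

With this table every code in the content stream resolves on its own, so
extract_text([0x01, 0x02, 0x03], TO_UNICODE) yields the three code points
U+0078, U+030C and U+0075 in that order.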

I suggest finding whatever change caused entries in the ToUnicode CMap to be
clustered in this way and just reverting it, because there is no way the
extracted text can ever be valid aside from using the /ActualText tag, which
none of the PDF viewers I've tried this on actually looks at.
