There's really no inherent need for
many spacing combining marks to have a base character, at least
not for the ones that do not reorder and that don't overhang the base
character's glyph.
As far as I can tell, it's largely a
convention that originally helped identify clusters and other places
that lack break opportunities. But now that we have separate properties
for segmentation, it's not strictly necessary to overload the
combining property for that purpose.
In your example, why do you need the ZWJ
and dotted circle?
Originally, just applying a combining
mark to a NBSP should normally show the mark by itself. If a font
insists on inserting a dotted circle glyph, that's not required
from a conformance perspective - just something that's seen as
helpful (to most users).
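
To make the NBSP suggestion concrete, here is a minimal Python sketch of building such a sequence. The helper name `isolated_mark` is hypothetical, and U+1A63 TAI THAM VOWEL SIGN AA is only an example choice of spacing mark; the sketch says nothing about how a given font or renderer will actually display the result.

```python
import unicodedata

NBSP = "\u00A0"  # NO-BREAK SPACE, used here as the carrier for an isolated mark


def isolated_mark(mark: str) -> str:
    """Return NBSP + mark for one spacing combining mark (hypothetical helper)."""
    # General_Category Mc is "Mark, spacing combining".
    if len(mark) != 1 or unicodedata.category(mark) != "Mc":
        raise ValueError("expected a single spacing combining mark (gc=Mc)")
    return NBSP + mark


# Example choice for illustration: U+1A63 TAI THAM VOWEL SIGN AA
seq = isolated_mark("\u1A63")
print(" ".join(f"U+{ord(c):04X}" for c in seq))
```

Whether a dotted circle appears for this sequence is then up to the font and rendering system, as noted above.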
A./
On 7/21/2019 4:03 PM, Richard
Wordingham via Unicode wrote:
I've been transcribing some Pali text written on palm leaf in the Tai Tham script. I'm looking for a way of reflecting the line boundaries in a manuscript in a transcription. The problem is that lines sometimes start or end with an isolated spacing mark. I want my text to be searchable and therefore encoded in Unicode. (I appreciate that there is a trade-off between searchability and showing line boundaries. The unorthodox spelling is also a problem.)

How unreasonable is it for a font to render <NBSP, ZWJ, U+25CC DOTTED CIRCLE, spacing_mark> as just the spacing mark? Some rendering systems give the font no way of distinguishing dotted circles in the backing store from dotted circles added by the renderer, so this technique is not Unicode compliant.

An alternative solution is to have a parallel font (or, more neatly, a feature) that renders some base character (or sequence) as a zero-width non-inking character. This, however, would violate that character's identity.

I suspect there is no Unicode-compliant solution.

Richard.