At 03:41 PM 12/29/01 -0500, David J. Perry wrote:
>The ancient Roman monetary unit sestertius is not yet in Unicode.  It might
>well be accepted if proposed, but would be given one codepoint.  However,
>this unit appears in a variety of ways in inscriptions: IIS, HS, II with a
>horizontal line through, S or SS with horizontal line, etc.  Epigraphers
>frequently like to preserve information such as the exact glyph used in an
>inscription.  One could create an OpenType font with one sestertius
>character and alternative glyphs that could be used for printing or web
>pages.  But would there be any way to preserve such information in, let's
>say, a database of inscriptions if only one codepoint was available?  The
>Runic block that was added to Unicode 3.0 also comes to mind here.  TUS 3
>states that the glyphs used in a given context may vary from those presented
>in the charts; so what were the intentions of those who proposed this block?
>This seems to be the same issue as the one I raised regarding the
>sestertius.

This is something that you cannot do in plain text. It's a fundamental 
limitation. In the same way, you cannot maintain a database of instances 
where the dollar sign or Yen/Yuan sign appears with single or double 
strokes by using just U+0024 or U+00A5.
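
To make the limitation concrete, here is a minimal sketch of how a 
database might carry the glyph information outside the plain text. The 
names Inscription and GlyphNote, and the variant labels, are my own 
assumptions for illustration and not part of any standard.

```python
from dataclasses import dataclass, field

@dataclass
class GlyphNote:
    offset: int    # index of the character within the plain-text string
    variant: str   # free-form label (hypothetical), e.g. "double-stroke"

@dataclass
class Inscription:
    text: str                                   # plain text, one code point per sign
    notes: list = field(default_factory=list)   # out-of-band glyph annotations

# Both records use U+0024 for the dollar sign, so the text is identical.
a = Inscription("Price: \u00245", [GlyphNote(7, "single-stroke")])
b = Inscription("Price: \u00245", [GlyphNote(7, "double-stroke")])

same_text = a.text == b.text      # the plain text cannot tell them apart
same_notes = a.notes == b.notes   # only the annotation layer differs
```

The stroke count lives in the annotation, not in the character code.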

To a limited extent it makes sense to encode more than one historic form of 
a character. Usually this is considered only where the historic forms 
amount to a historic script (or historic alphabet) in their own right, and 
when separating the historic periods solves more problems than it creates. 
(So it's a judgement call above all.)

Sometimes, if the forms are quite unrelated in appearance, and especially 
if there is a possibility that at least one of them might be used with an 
unrelated meaning, it might make sense to encode both.

Finally, where currencies are written as simple sequences of ordinary 
letters, there is no need to encode a dedicated character. If your examples 
IIS and HS don't have lines through them, there wouldn't be a need to 
encode them. The strings IIS or HS should serve.

So, given my understanding of your example, I could see at most two 
possible forms, II with a line through and SS with a line through. If both 
of these are substitutable (except for capturing the 'exact' appearance of 
an inscription) then they should get a single shared character code.

In a few cases, where clear glyph alternatives exist and where there is a 
strong requirement to preserve them in plain text, the use of a Variation 
Selector character can be defined, allowing one to express the distinction. 
This is a very useful facility to make unnecessary the encoding of many 
borderline characters, but should not be abused as a general glyph 
description mechanism. One would need to show in each case that the 
distinction must be preserved (at least sometimes) in plain text, even 
though there is no ordinary distinction in meaning.
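
As a rough sketch of what that looks like in a data store: a variation 
selector is an ordinary code point following the base character, so the 
marked and unmarked strings differ in plain text while still sharing the 
same base. The base character below is a private-use stand-in (U+E000) and 
its pairing with U+FE00 is purely hypothetical; no such variation sequence 
is defined for a sestertius sign.

```python
import unicodedata

BASE = "\uE000"   # private-use stand-in for a sestertius sign (assumption)
VS1 = "\uFE00"    # VARIATION SELECTOR-1

def strip_variation_selectors(s):
    """Drop U+FE00..U+FE0F so that glyph variants compare equal."""
    return "".join(ch for ch in s if not ("\uFE00" <= ch <= "\uFE0F"))

plain = "cost " + BASE + " 200"
marked = "cost " + BASE + VS1 + " 200"   # hypothetical variant glyph requested

vs_name = unicodedata.name(VS1)
differs = plain != marked                              # distinction survives in plain text
folds = strip_variation_selectors(marked) == plain     # same meaning once folded
```

A search that should ignore the glyph distinction would fold the selectors 
away first, as strip_variation_selectors does here.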

A./
