On 6/14/2010 1:18 PM, Mark E. Shoulson wrote:
On 06/14/2010 02:15 PM, Asmus Freytag wrote:
On 6/14/2010 9:21 AM, Stephen Slevinski wrote:

Plain text SignWriting should be able to write actual sign language,
such as "hello world."
You could equally well insist that it should be possible to express the
opening bar of "twinkle, twinkle little star" in plain text, or write
the "square root of the inverse of a plus b" in plain text.

In both cases, you would be disappointed and find that a markup language
is required, such as MathML, although specifically for math, it is
possible to devise an extremely lightweight markup language that comes
close to plain text.

It is all too tempting and too easy for discussions of "Why X Should be Encoded in Unicode" to devolve into "Why X is So Incredibly Useful." In this case, I don't think that's the point.
Correct, we were not discussing that question.
Unlike some other proposals, I think it is clear (to me, anyway) that SignWriting has a fairly solid user-base and also an important use (transcribing signed languages, which don't really have too many other ways of being transcribed. Things like HamNoSys are also not encoded yet).
Mark (Davis) raised the good point that this needs to be substantiated - for now, for the purposes of this discussion, I have taken the above as a given.
Here, the question is more a matter of "given that SignWriting is nifty, does it qualify as plain text?"

That is the central question.

Or even "Does the way SignWriting does its thing map well to the way Unicode does things?"

I tried to explain that these are nearly equivalent. A practical definition of plain text could be: text encoded as a stream of Unicode characters, with no other information. However, there are other definitions of plain text based on the ideal concept of the thing, and the two don't overlap 100%. Both are useful.

If it does not (and cannot be made to do so), then no matter how useful SignWriting is, it may simply not be encodable. It's not because it doesn't deserve to be, and yes, that would really be a bummer because it would relegate signed languages to second-class, but Unicode has its limitations, and SignWriting may well be beyond its capabilities.

That's where my insistent questions about a layered system come in. One where the elements (symbols) are encoded in Unicode, but where some or all of the details of their relations are encoded in a higher level protocol.

I suspect that the XML attempts that exist do not implement a correct layering, that is, they probably encode the identity of the symbols not as character codes but as named entities. That would explain why Steve said "same data, only more complex".
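To make the layering concrete, here is a minimal sketch, not a description of any existing SignWriting XML format. The element names `sign` and `sym`, the `x`/`y` attributes, and the two private-use code points are all assumptions made up for illustration. The point it tries to show is that the symbol identity sits in the character data, while placement sits in the markup, so stripping the markup still leaves a stream of characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder code points (private use); this thread does not
# give any real assignments for SignWriting symbols.
HAND = "\uE000"
MOVEMENT = "\uE001"


def build_sign(placed):
    """Wrap (character, x, y) triples in a hypothetical <sign>/<sym> markup."""
    sign = ET.Element("sign")
    for ch, x, y in placed:
        # Placement is carried by the higher-level protocol (attributes)...
        sym = ET.SubElement(sign, "sym", x=str(x), y=str(y))
        # ...while the symbol's identity is carried by the character code.
        sym.text = ch
    return sign


def plain_text(sign):
    """Drop the markup layer, leaving only the character stream."""
    return "".join(sign.itertext())


sign = build_sign([(HAND, 10, 20), (MOVEMENT, 30, 5)])
markup = ET.tostring(sign, encoding="unicode")
```

With a named-entity design the last step would have nothing left to return, which is one way to tell the two layerings apart.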

(That said, I find myself thinking that it *should* be possible to align Unicode and SignWriting. But I recognize that it might not be.)
As long as the position of the proponents is that all fine details of formatting and layout must be carried in the character encoding level, I'm not hopeful.


Not all streams of concrete small integers are ipso facto plain text,
even though you can map these integers to the private use space.

I guess you would need to establish a distinct and independent meaning for each code-point, which would have to be something more specific than "...and then you give the x-coordinate."
Generic placement operators I could possibly fathom, since they serve to linearize the text - an analogy would be the Ideographic Description Characters that allow description of a two-dimensional layout. But Ideographic Description Sequences (IDS) stop short of trying to express the subtle modifications that arise out of the context and placement of the elements in the final ideograph. For that you have to turn to another source, in this case a font.
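As a small illustration of the analogy, the sketch below takes a three-character description of 江 (a left-to-right operator followed by 氵 and 工) and separates the operator from its components. The helper name `is_idc` is made up here, and the range check only covers the original twelve description characters at U+2FF0..U+2FFB. Note that the sequence says nothing about how the components are squeezed or reshaped in the final glyph.

```python
import unicodedata

# The original twelve Ideographic Description Characters: U+2FF0..U+2FFB.
IDC_FIRST, IDC_LAST = 0x2FF0, 0x2FFB


def is_idc(ch: str) -> bool:
    """True if ch is one of the original Ideographic Description Characters."""
    return IDC_FIRST <= ord(ch) <= IDC_LAST


# U+2FF0 (left-to-right operator), then U+6C35 and U+5DE5 as the two components.
ids = "\u2FF0\u6C35\u5DE5"

# The operator linearizes the layout; the components are ordinary characters.
operators = [unicodedata.name(ch) for ch in ids if is_idc(ch)]
components = [ch for ch in ids if not is_idc(ch)]
```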

For the future, I am considering a browser plugin that will detect and
render SignWriting character data. A regular expression could scrape
the appropriate PUA characters. Another regular expression could
validate that the characters represent valid structures. Then the
SignWriting display could be built using individual symbols, completed
signs, or entire columns.

In other words, a layout engine.
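The scraping and validation steps Steve describes might look roughly like the sketch below. The private-use ranges are the standard ones, but the validity rule (a start marker at U+E000 followed by one or more symbols in U+E100..U+E1FF) is purely an invented stand-in, since the actual structure rules are not given in this thread. Everything after these two steps, that is, placing the symbols, is the part that amounts to a layout engine.

```python
import re

# Runs of private-use characters: the BMP private use area plus planes 15 and 16.
PUA_RUN = re.compile(
    "[\uE000-\uF8FF\U000F0000-\U000FFFFD\U00100000-\U0010FFFD]+"
)

# Hypothetical structure rule, invented for illustration only:
# a start marker U+E000 followed by one or more symbols in U+E100..U+E1FF.
VALID_SIGN = re.compile("\uE000[\uE100-\uE1FF]+")


def scrape(text):
    """Return every maximal run of private-use characters in text."""
    return PUA_RUN.findall(text)


def is_valid(run):
    """Check a scraped run against the (hypothetical) structure rule."""
    return VALID_SIGN.fullmatch(run) is not None


page = "hello \uE000\uE101\uE102 world \uE101"
runs = scrape(page)
checked = [(run, is_valid(run)) for run in runs]
```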

Is there such a thing as SignWriting without a layout engine? I guess the same question could be asked about musical notation (though I think it probably could have been coded as plain text. See also http://abcnotation.com/ for a very powerful musical notation using only ASCII, but decidedly *not* plain-text in nature).
The point is, because one already requires a layout engine (or browser plug-in) one might as well use something like MathML in conjunction with standard character codes for the basic symbols.

If SignWriting cannot be successfully used except with 2 fonts, then I
see little need for standardizing the code. What you describe is a
private use scheme, even though the private group may have many members.

I'm not sure I agree with this. Just because only two fonts are out there so far, and the character-shapes perhaps allow a little less flexibility than some, doesn't mean that other fonts aren't possible. Nor is the multiplicity of fonts a requirement for encoding.
It's a red flag. If the design is truly this limited, it lacks the generalization necessary to evolve gracefully in use. For Unicode to enshrine such a design using character codes that can never be reassigned (or re-interpreted) would be foolhardy.

SignWriting has the unusual requirement of a 2 color font. One font
color for the line of the symbols and another for the fill. The fill
is needed when symbols overlap.
Hmm.

AFAIK, Unicode can't do color. I remember someone mentioning that once. But someone who knows the exact rules can explain better.
This is a red herring. What he means is that when symbols overlap, their insides block out the symbol underneath. That is a departure from how the usual layout engines work, but perhaps not an insurmountable obstacle from the point of view of encoding.

I think it will help when your proposal is ready for review so people will understand just what it is you are suggesting and can judge how much (if at all) it conflicts with Unicode's capabilities.

This discussion and the feedback contained in it are designed to help Steve address these issues up-front.

Because sign writing has not had as extensive a history of print and then digital publication as mathematics or music, a lot of issues probably need to be settled, and that will take time. To use mathematics as an example: what got encoded in Unicode is not a 1:1 equivalent of what was contained in the most successful mathematical layout system ever (TeX), but something that corresponded (more or less) to Unicode's concept of encoded characters.

That would be useful to remember.

A./



