Paul James Cowie wrote at 6:44 AM on Thursday, May 6, 2004:

> Somewhat echoing Deborah Anderson's contribution from a few days ago, I am categorically against any script unification in this matter and I believe that Phoenician script should be encoded separately from square Hebrew script - when I have the need to encode both scripts within one XML / XHTML document, I want to be sure that both scripts are rendered accurately without confusion, and without having to step through a font minefield.
Take this sentence - Phoenician BT 'LM, "house of eternity, grave", occurs (with matres lectionis) in Biblical Hebrew as BYT 'WLM. Here are the polar choices for XML:

TAGGED (but not encoded):

Phoenician <Phn>BT 'LM</Phn>, "house of eternity, grave", occurs (with matres lectionis) in Biblical Hebrew as <Heb>BYT 'WLM</Heb>.

ENCODED (but not tagged):

Phoenician BT 'LM, "house of eternity, grave", occurs (with matres lectionis) in Biblical Hebrew as byt 'wlm. [using case to simulate the different encodings]

The tagged version is not a "font minefield". On the contrary, it explicitly provides an international standard mechanism for a level of specification and refinement not possible via encoding. You can, for example, do things like:

<Phn subscript="Punic" locus="Malta" font="Maltese Falcon">BT 'LM</Phn>

In fact, this is precisely the sort of thing for which XML was designed. The untagged, but differently encoded, version, on the other hand, IS a search and text processing quagmire, especially when confronted with the possibility of having to deal with multiplied West Semitic encodings, e.g., for the various Aramaic "scripts" and Samaritan.

Obviously there is a need, in many cases, to maintain the distinction between the various diascripts; the question is where that distinction should be introduced - at the encoding level or higher. The issue is not whether this particular proposal represents Phoenician "script" adequately - it does; the real issue is whether Phoenician should be separately encoded at all.

If Hebrew were not already encoded in Unicode, I could foresee two, possibly tenable, courses of action:

* Unilaterally encode the 22 Old Canaanite letter characters as such, and additionally encode the various supra-consonantal systems erected on this script. This would cover practically everything we've been talking about - Phoenician, Punic, Neo-Punic, Old Hebrew, Moabite, Ammonite, Edomite, Samaritan, Old Aramaic, Official Aramaic, Square Hebrew, etc.
(Essentially, with some name changes, the situation we currently enjoy.)

or

* Separately encode only Old Canaanite and Hebrew/Aramaic, along with its adjunct systems - deferring judgment on the politically loaded Samaritan issue. (Essentially what we would have if only the current proposal were adopted.)

But what I'm afraid of with this proposal, as I've stated before, is that its adoption will set a precedent that will result in a snowballing of West Semitic encodings, leading to the third scenario, which I find unacceptable:

* Separately encode Phoenician, Old Hebrew, Samaritan, Archaic Greek, Old Aramaic, Official Aramaic, Hatran, Nisan, Armazic, Elymaic, Palmyrene, Mandaic, Jewish Aramaic, Nabataean ...

I actually have not yet made up my mind about the advisability of encoding Phoenician/Old Canaanite; I continue to weigh the input we've been getting here. But I am tending to think that the tradeoffs are in favor of not separately encoding multiple West Semitic diascripts. The only benefit to encoding I see is the enabling of rendering changes (aka font changes) in plain text. But weighed against the complexity introduced for searching and other text processing, that benefit seems small indeed, especially when we realize that the discipline has, in large part, worked with unified texts for centuries.

Which segues nicely into your next remarks:

> A few contributors to this list have argued that separate encoding is unnecessary and shouldn't happen on the grounds that the user community doesn't / wouldn't make use of it.... Well, I can certainly tell you that my user / research community (i.e. Near Eastern history, archaeology and Egyptology) remains incredibly conservative in nearly all their practices - their current practice overall is certainly no guide to what *should* be happening....
> Some of us *are* trying to pioneer and teach different practices - the use of XML / XHTML, the application of Unicode instead of different fonts, for example - but it is a slow, slow process.

I am sympathetic to this assessment of the conservative nature of many practices in Ancient Near Eastern studies. (After all, I have personally witnessed resistance to my, and others', efforts to encode Sumero/Akkadian cuneiform.) But to say "their current practice overall is certainly no guide to what *should* be happening" is too strong for me. I tend to look for the best in the past in order to combine that with the best in the present. In this particular case, that MAY mean encoding Old Canaanite, or it may not. But I have yet to see a compelling reason to introduce the added complexity.

Respectfully,

Dean A. Snyder
Assistant Research Scholar
Manager, Digital Hammurabi Project
Computer Science Department
Whiting School of Engineering
218C New Engineering Building
3400 North Charles Street
Johns Hopkins University
Baltimore, Maryland, USA 21218
office: 410 516-6850
cell: 717 817-4897
www.jhu.edu/digitalhammurabi
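P.S. - For the curious, the tagged-but-uniformly-encoded approach I sketched above can be made concrete in a few lines. This is a hypothetical illustration only: the <Phn>/<Heb> element names and the subscript/locus attributes are the ad hoc markup from my example, not any standard vocabulary, and Python's stdlib XML parser simply stands in for whatever processing pipeline one actually uses.

```python
# Hypothetical sketch: one encoding, markup carries the diascript distinction.
# Element names (Phn, Heb) and attributes (subscript, locus) are invented for
# this example; the transliterations stand in for the actual Semitic letters.
import xml.etree.ElementTree as ET

sentence = (
    "<p>Phoenician <Phn subscript='Punic' locus='Malta'>BT 'LM</Phn>, "
    "'house of eternity, grave', occurs (with matres lectionis) in "
    "Biblical Hebrew as <Heb>BYT 'WLM</Heb>.</p>"
)
root = ET.fromstring(sentence)

# Searching: stripping the tags yields one uniformly encoded text, so a
# single query serves both diascripts - no multiplied encodings to juggle.
plain = "".join(root.itertext())
assert "BT 'LM" in plain and "BYT 'WLM" in plain

# Refinement: the markup still carries sub-script and locus information
# that a bare encoding distinction could never express.
for phn in root.iter("Phn"):
    print(phn.text, "|", phn.get("subscript"), "|", phn.get("locus"))
# prints: BT 'LM | Punic | Malta
```

Swap real Hebrew code points in for the transliterations and the search logic is unchanged - which is exactly the point about keeping the distinction above the encoding level.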