RE: Designing a format for research use of the PUA in a RTL mode (from Re: RTL PUA?)
Thank you to Doug and to Asmus for replying.

Originally I was thinking of the format simply being so as to help to level the infrastructural ground as between a PUA (Private Use Area) application using left-to-right characters and a PUA application using right-to-left characters. However, the research needs to proceed in the best direction so as to get the best possible result, so I am happy for my original idea to be augmented and changed if that is what is needed.

Do any people who would like to use PUA applications that use right-to-left characters have any views on a format please? Is such a format regarded as useful? What does it need to do?

What would be the features of a very minimal RTL constructed script that would exhibit all of the features for which a researcher might want to use the Private Use Area for research with a real-world RTL script please?

I am thinking of making a small font with some characters that consist of a leftward pointing arrow with a broad tail with the tail having markings to give a clue to the sound. These markings would be based on the hatching system used for representing colours in monochrome. For example, vertical lines for r because that is red or rouge, horizontal lines for b because that is blue or bleu. I thought of having an o as an o drawn with a left arrow attached to it. I could then produce a glyph for a br ligature and maybe a rb ligature. I am thinking that the ligature glyphs could be wider, have only one leftward pointing arrow yet have two types of markings on the tail of the arrow, side by side.

Would that and a space be enough for a constructed script that would exhibit the needed properties for a demonstration or would some more glyphs be needed?

My thinking is that the font, complete with its PUA.RTL assignment statement, could be a benchmark test font for testing a special researcher's edition of a wordprocessing application or a desktop publishing application. By using a font for a minimal constructed script, the task of producing and testing the special researcher's edition of a software application could be separated from the complexities of a full real script, perhaps therefore increasing the chances of the special researcher's edition of a software package being produced.

I feel that I could make the font as a TrueType font. In order to produce an OpenType font I would need to consolidate what I have started to learn about OpenType fonts, though I would be happy for the TrueType font to be adapted by other people if they so wish.

William Overington
24 August 2011
RE: Designing a format for research use of the PUA in a RTL mode (from Re: RTL PUA?)
William_J_G Overington wjgo underscore 10009 at btinternet dot com wrote:

Suppose that a special researcher's edition of a wordprocessing application or a desktop publishing application at start up looks in a specified directory for a file with the following file name.

pua_major.txt

If pua_major.txt exists, then it is opened and it is searched for a PUA.RTL assignment statement. If a PUA.RTL assignment statement is not found in the file, it is taken as if the following had been included in the file.

PUA.RTL=;

...

Of all applications, a word processor or DTP application would want to know more about the properties of characters than just whether they are RTL. Line breaking, word breaking, and case mapping come to mind.

I would think the format used by standard UCD files, or the XML equivalent, would be preferable to making one up:

E100;ENGSVANYALI LETTER P;Lo;0;R;N;
E101;ENGSVANYALI LETTER B;Lo;0;R;N;
E102;ENGSVANYALI LETTER M;Lo;0;R;N;
...

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell
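To make the UCD-style suggestion concrete, here is a minimal Python sketch of how an application might read such semicolon-separated lines and pick out the Private Use Area code points that are marked right-to-left. The names `CharRecord`, `parse_ucd_line` and `rtl_pua_code_points` are hypothetical. The sketch assumes the first five fields follow the UnicodeData.txt order (code point, name, general category, combining class, bidi class) and deliberately ignores whatever follows them in the sample lines.

```python
from typing import Iterable, List, NamedTuple, Optional

# Range of the Private Use Area in the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF


class CharRecord(NamedTuple):
    code_point: int
    name: str
    general_category: str
    combining_class: int
    bidi_class: str


def parse_ucd_line(line: str) -> Optional[CharRecord]:
    """Parse one UCD-style line; return None for blank or comment-only lines."""
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = [field.strip() for field in line.split(";")]
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}: {line!r}")
    # Only the first five fields are interpreted (assumption: UnicodeData.txt order).
    return CharRecord(int(fields[0], 16), fields[1], fields[2],
                      int(fields[3]), fields[4])


def rtl_pua_code_points(lines: Iterable[str]) -> List[int]:
    """Return PUA code points whose bidi class is R or AL."""
    result = []
    for line in lines:
        record = parse_ucd_line(line)
        if record is None:
            continue
        in_pua = PUA_START <= record.code_point <= PUA_END
        if in_pua and record.bidi_class in ("R", "AL"):
            result.append(record.code_point)
    return result


SAMPLE = """\
E100;ENGSVANYALI LETTER P;Lo;0;R;N;
E101;ENGSVANYALI LETTER B;Lo;0;R;N;
E102;ENGSVANYALI LETTER M;Lo;0;R;N;
"""

if __name__ == "__main__":
    print([f"U+{cp:04X}" for cp in rtl_pua_code_points(SAMPLE.splitlines())])
```

Because each record carries the general category and combining class as well as the bidi class, the same file could later feed line breaking or other property lookups, which is the point being made about needing more than an RTL flag.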
Re: Designing a format for research use of the PUA in a RTL mode (from Re: RTL PUA?)
On 8/23/2011 7:22 AM, Doug Ewell wrote:

Of all applications, a word processor or DTP application would want to know more about the properties of characters than just whether they are RTL. Line breaking, word breaking, and case mapping come to mind.

I would think the format used by standard UCD files, or the XML equivalent, would be preferable to making one up:

The right answer would follow the XML format of the UCD. That's the only format that allows all necessary information to be contained in one file, and it would leverage any effort that users of the main UCD have made in parsing the XML format.

An XML format should also be flexible in that you can add/remove not just characters, but properties as needed.

The worst thing to do, other than designing something from scratch, would be to replicate the UnicodeData.txt layout with its random, but fixed collection of properties and insanely many semi-colons. None of the existing UCD txt files carries all the needed data in a single file.

A./
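As an illustration of the flexibility argument, the following Python sketch reads a small UCD-XML-style fragment and extracts only the properties the caller asks for. The function name `read_properties` and the sample data are hypothetical. The namespace URI and the attribute names (`cp`, `na`, `gc`, `ccc`, `bc`, `lb`) are given as I understand them from UAX #42 and should be checked against that document before being relied on.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from UAX #42 (Unicode Character Database in XML).
UCD_NS = "http://www.unicode.org/ns/2003/ucd/1.0"

# Hypothetical PUA data; the second character carries an extra property (lb).
SAMPLE_XML = f"""<ucd xmlns="{UCD_NS}">
  <repertoire>
    <char cp="E100" na="ENGSVANYALI LETTER P" gc="Lo" ccc="0" bc="R"/>
    <char cp="E101" na="ENGSVANYALI LETTER B" gc="Lo" ccc="0" bc="R" lb="AL"/>
  </repertoire>
</ucd>"""


def read_properties(xml_text, wanted=("na", "gc", "bc")):
    """Map each code point to the subset of `wanted` properties it defines."""
    root = ET.fromstring(xml_text)
    table = {}
    for char in root.iter(f"{{{UCD_NS}}}char"):
        cp = char.get("cp")
        if cp is None:
            # Entries describing ranges have no cp attribute; this sketch skips them.
            continue
        table[int(cp, 16)] = {
            name: char.get(name) for name in wanted if char.get(name) is not None
        }
    return table


if __name__ == "__main__":
    for cp, props in read_properties(SAMPLE_XML, wanted=("bc", "lb")).items():
        print(f"U+{cp:04X}", props)
```

Since properties are named attributes rather than fixed positions, a private data file can omit properties it does not need or add new ones without breaking a reader written this way.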
RE: Designing a format for research use of the PUA in a RTL mode (from Re: RTL PUA?)
Asmus Freytag asmusf at ix dot netcom dot com wrote:

The right answer would follow the XML format of the UCD.

Question: Since the ucdxml formats became available, has any consensus emerged as to whether the flat or grouped formats are preferred? Obviously they both contain the same data, but one is much smaller and the other might be more convenient in some ways.

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell
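For readers weighing the two layouts, here is a hedged Python sketch of why they can carry the same data: in the grouped form, attributes on a `group` element act as defaults for the `char` elements inside it, so a reader can expand a grouped file into the flat equivalent. The function name `flatten_groups` and the sample data are hypothetical, and the element names and namespace are assumed from UAX #42. It is not a statement about which format is preferred.

```python
import xml.etree.ElementTree as ET

# Namespace assumed from UAX #42 (Unicode Character Database in XML).
UCD_NS = "http://www.unicode.org/ns/2003/ucd/1.0"
GROUP_TAG = f"{{{UCD_NS}}}group"
CHAR_TAG = f"{{{UCD_NS}}}char"

# Hypothetical grouped data: shared properties are stated once on the group.
GROUPED_XML = f"""<ucd xmlns="{UCD_NS}">
  <repertoire>
    <group gc="Lo" ccc="0" bc="R">
      <char cp="E100" na="ENGSVANYALI LETTER P"/>
      <char cp="E101" na="ENGSVANYALI LETTER B"/>
      <char cp="E102" na="ENGSVANYALI LETTER M" bc="AL"/>
    </group>
  </repertoire>
</ucd>"""


def flatten_groups(xml_text):
    """Expand group defaults so every code point has its full attribute set."""
    root = ET.fromstring(xml_text)
    repertoire = root.find(f"{{{UCD_NS}}}repertoire")
    flat = {}
    if repertoire is None:
        return flat
    for child in repertoire:
        if child.tag == GROUP_TAG:
            defaults, members = dict(child.attrib), list(child)
        else:
            defaults, members = {}, [child]
        for char in members:
            if char.tag != CHAR_TAG:
                continue
            # Attributes on the char itself override the group defaults.
            attrs = {**defaults, **char.attrib}
            cp = attrs.pop("cp", None)
            if cp is None:
                continue  # range entries are skipped in this sketch
            flat[int(cp, 16)] = attrs
    return flat


if __name__ == "__main__":
    for cp, attrs in flatten_groups(GROUPED_XML).items():
        print(f"U+{cp:04X}", attrs)
```

The grouped file is smaller because `gc`, `ccc` and `bc` appear once, while the flat result is simpler to query because each character is self-describing.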