Hi,

Would you be able to deal with the Xmodmap project?
http://www.unicode.org/mail-arch/unicode-ml/y2010-m09/0042.html
Best, Meeकu

On Sat, Sep 18, 2010 at 10:42 AM, Richard Wordingham <richard.wording...@ntlworld.com> wrote:

> On Sat, 18 Sep 2010 00:06:07 +0100
> Krishna Birth <krishnabi...@gmail.com> wrote:
>
> > Could someone please correctly tell the codes to use on Unix operating
> > systems to produce the below diacritics:
> >
> > A
> > Ā = http://www.fileformat.info/info/unicode/char/0100/index.htm
> ...
>
> > I need to find this for a project/coder's question?
>
> If you are asking how to type these precomposed letters at a keyboard,
> we need to know which Unix operating system you have in mind, and the
> X-terminal model may be relevant. For example, if the X-terminal is
> a Windows PC running Exceed, this may reduce to a Windows question.
>
> My answer is directed to what one would write in a program. It is
> possible that more detail is required as to the coder's problem.
>
> The codepoint (i.e. number encoding the character) for these letters
> is part of the name of the links you gave, e.g. the code for Ā is 0100
> in hex.
>
> If you are simply trying to produce the single, precomposed character
> in a program, the information is given in the table headed 'Encodings'
> in the pages you referenced. It may be worth also giving the
> information for the plain letter 'A' at
> http://www.fileformat.info/info/unicode/char/0041/index.htm so that the
> coder may understand the information better. UTF-8 is the encoding
> which for most purposes can work on Unix in exactly the same fashion as
> 8-bit codes (ASCII, ISO-8859, ISCII, TSCII), though multibyte EUC
> encodings are a better analogy. (If the coder doesn't understand EUC,
> it's not worth explaining.)
>
> For example, when I run a terminal window using the locale en_GB.utf8,
> I can have the letter printed to the terminal by a bash script using
> the command
> % printf "\xc4\x80" # Use UTF-8 form explicitly
> The printf of bash version 4.1.5(1) does not understand escape codes
> using '\u'.
>
> On the other hand, /usr/bin/printf on the Linux system I'm using does,
> and I could achieve the same effect using
> % /usr/bin/printf "\u0100" # What happens in non-UTF-8 locales?
>
> If you want the codes for the diacritics themselves, so that the
> letters you listed may be entered as plain Roman letter plus diacritic
> mark, the information you need
> is in http://www.unicode.org/Public/UNIDATA/UnicodeData.txt , with an
> explanation in http://www.unicode.org/reports/tr44/#UnicodeData.txt .
> As an example, consider the line for U+0100:
>
> 0100;LATIN CAPITAL LETTER A WITH MACRON;Lu;0;L;0041 0304;;;;N;LATIN
> CAPITAL LETTER A MACRON;;;0101;
>
> The data items are separated by semicolons. The first field is the
> codepoint, the number for the character, expressed in hexadecimal
> notation. The second field gives the character name. The interesting
> field for you may be the sixth field, which, unless it starts with
> '<', gives another way of expressing the same character - in this case
> as the sequence of <U+0041 LATIN CAPITAL LETTER A, U+0304
> COMBINING MACRON>.
>
> If you want to write the diacritics themselves without attaching them
> to a letter, there are two or three methods. Firstly, you can
> write them on a hardspace, e.g. <U+00A0 NO-BREAK SPACE, U+0304>. This
> will not always work; using the spacing modifier letter is the safe way
> of writing it. For this you need to look at their code chart. For the
> macron, you will use <U+02C9 MODIFIER LETTER MACRON>. The third
> method is to use the ISO-8859 characters, in this case <U+00AF
> MACRON>. The drawback with the third method is that this is a symbol,
> not a letter, and you may encounter bad line-breaking or the macron may
> be combined with a preceding letter.
>
> Richard.
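
For the coder's benefit, here is a rough Python sketch of the two lookups Richard describes: turning a codepoint such as U+0100 into the UTF-8 escapes that bash's printf accepts, and reading the sixth field of a UnicodeData.txt line to get the letter-plus-diacritic sequence. The helper names utf8_escapes and decomposition_field are made up for this illustration (they are not part of any library), and the sample line is the U+0100 entry quoted in Richard's mail. It assumes Python 3 with only the standard unicodedata module.

```python
import unicodedata


def utf8_escapes(ch):
    """Return the UTF-8 bytes of ch written as printf-style \\xNN escapes.

    Hypothetical helper for illustration only.
    """
    return "".join(f"\\x{b:02x}" for b in ch.encode("utf-8"))


def decomposition_field(line):
    """Parse one UnicodeData.txt line and return the sixth field as a list
    of integer codepoints.

    Returns [] when the field is empty or starts with '<' (a compatibility
    mapping rather than an equivalent spelling of the same character).
    Hypothetical helper for illustration only.
    """
    fields = line.split(";")
    decomp = fields[5]  # sixth field, counting from one
    if not decomp or decomp.startswith("<"):
        return []
    return [int(cp, 16) for cp in decomp.split()]


# The U+0100 entry as quoted in the mail, joined onto one line.
LINE_0100 = (
    "0100;LATIN CAPITAL LETTER A WITH MACRON;Lu;0;L;0041 0304;;;;N;"
    "LATIN CAPITAL LETTER A MACRON;;;0101;"
)

a_macron = chr(0x0100)

# Bytes to hand to: printf "<escapes>"
print(utf8_escapes(a_macron))

# Base letter plus combining mark, with their character names.
for cp in decomposition_field(LINE_0100):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")

# Composing the sequence again (NFC) gives back the precomposed letter.
sequence = "".join(chr(cp) for cp in decomposition_field(LINE_0100))
print(unicodedata.normalize("NFC", sequence) == a_macron)
```

The first print should give the same \xc4\x80 escapes used in the bash example, and the loop should list the plain letter and the combining macron. The same two functions can be pointed at the other letters in the original question by changing the codepoint and the UnicodeData.txt line.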