David Starner wrote:

> I have a copy of Shellbear's Practical Malay Grammar that I'm preparing
> to transcribe for Project Gutenberg. Unfortunately, he represents the
> Malaysian alphabet in a Latin transliteration that includes ng as a
> single ligatured form, and I don't know how to transcribe in Unicode.

Could you perhaps post or point to a picture of what it looks like?  I  
suppose it's an "N" with a loopy tail of some type.

The character you are looking for is probably U+014B (LATIN SMALL LETTER
ENG) in lowercase or U+014A (LATIN CAPITAL LETTER ENG) in uppercase.  I
would be rather surprised if that's not it.

Another way to approach this would be to put your edition out in pure
ASCII, with /ng/ for the ligatured form and a note that it's equivalent
to U+014B.  You could then put a Perl script in the Gutenberg edition
header info so that users who wanted to do so could extract the script,
run it, and convert the file into UTF-8 with the real character.
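
As a rough illustration of what such a conversion script might do, here
is a minimal sketch, written in Python rather than Perl.  The marker
spellings are assumptions: the lowercase "/ng/" comes from the suggestion
above, while "/NG/" and "/Ng/" for the uppercase form, and the function
names, are invented for the example.

```python
# Sketch only: turn ASCII markers back into the eng character and
# write the result as UTF-8.
# Assumed markers (hypothetical beyond "/ng/"): "/ng/" for lowercase,
# "/NG/" or "/Ng/" for uppercase.
ENG_MARKERS = {
    "/ng/": "\u014b",  # U+014B LATIN SMALL LETTER ENG
    "/NG/": "\u014a",  # U+014A LATIN CAPITAL LETTER ENG
    "/Ng/": "\u014a",
}


def restore_eng(text):
    """Replace every assumed ASCII marker with the matching eng character."""
    for marker, char in ENG_MARKERS.items():
        text = text.replace(marker, char)
    return text


def convert_file(src_path, dst_path):
    """Read a pure-ASCII edition and write a UTF-8 edition with real engs."""
    with open(src_path, encoding="ascii") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(restore_eng(text))
```

Calling convert_file("grammar.txt", "grammar-utf8.txt") (both file names
are placeholders) would produce the UTF-8 edition.  The sketch assumes
the three-character marker never occurs in the text for any other
reason; if it might, a less common marker would be safer.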

BTW, a bit off topic here but: I think it's high time that Project  
Gutenberg adopted some very clear character encoding guidelines now that  
they're expanding so widely.  Or have they already adopted them and I've  
just missed the policy statement...?  They're in for a real mess if they  
don't specify character encodings in a very controlled way.

        Rick
