Re: Questions about Unicode-aware C programs under Linux

2007-04-16 Thread Rich Felker
On Tue, Apr 17, 2007 at 10:46:44AM +0430, Ali Majdzadeh wrote:
> Hello Rich
> Thanks for your response.
> About your question, I should say "yes", I need some text processing
> capabilities.

OK.

> Do you mean that I should use common stdio functions? (like, fgets(), ...)

Yes, they'll work fine.
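As a rough illustration of the point that the ordinary stdio functions work on UTF-8 text, here is a minimal sketch that reads lines with fgets() and searches them with strstr(). The helper name count_matching_lines and the 1024-byte buffer are assumptions made for this example; they do not come from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines of fp that contain needle.
 * Both the file contents and needle are treated as plain bytes, which is
 * all that is needed when both are valid UTF-8.
 * Assumption: each line, including its newline, fits in the buffer.
 * Longer lines are read in pieces, and a piece boundary could fall in the
 * middle of a multibyte character or of a match. */
size_t count_matching_lines(FILE *fp, const char *needle)
{
    char line[1024];
    size_t hits = 0;

    assert(fp != NULL && needle != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, needle) != NULL)
            hits++;
    }
    return hits;
}
```

fgets() only looks for the newline byte 0x0A, which never occurs inside a multibyte UTF-8 sequence, so lines that fit in the buffer are never split in the middle of a character.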

Re: Questions about Unicode-aware C programs under Linux

2007-04-16 Thread Ali Majdzadeh
Hello Rich Thanks for your response. About your question, I should say "yes", I need some text processing capabilities. Do you mean that I should use common stdio functions? (like, fgets(), ...) And what about UTF-8 strings? Do you mean that these strings should be stored in common char* variables

Re: wcwidth and locale

2007-04-16 Thread Rich Felker
On Tue, Apr 17, 2007 at 02:04:32AM +0800, Abel Cheung wrote:
> >not all we like, but can you come up with things that should
> >legitimately be wide (i.e. ideographs) which have no chance to enter
> >Unicode?
>
> Certain there are, say some belonging to Taiwan CNS11643, which
> is regarded as vari

terminal status [Re: wcwidth and locale]

2007-04-16 Thread Rich Felker
On Tue, Apr 17, 2007 at 02:04:32AM +0800, Abel Cheung wrote:
> >This is only an issue on character-cell devices which use wcwidth.
>
> I'm exactly talking about those apps, like terminals.

Given how utterly abysmal current terminals' Unicode support is, this seems like a relatively minor issue. I

Re: wcwidth and locale

2007-04-16 Thread Abel Cheung
On 4/17/07, Rich Felker <[EMAIL PROTECTED]> wrote:
> It really depends on the intended audience of the fonts. The original
> intention for those double width Greek and Cyrillic characters is to
> make them align nicely with all other CJK characters. Then there are
> no such thing as wide Greek/Cy

Re: Questions about Unicode-aware C programs under Linux

2007-04-16 Thread SrinTuar
The best advice you can get is to steer clear of wide characters. You should never need to use any wide character functions. Keep the data in your program internally represented as utf-8. The standard byte-oriented "strlen", "strcpy", "strstr", "printf" etc work fine with utf-8. XML uses utf-8 by
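A minimal sketch of what keeping the data in UTF-8 and using the byte-oriented functions can look like. The helper names utf8_codepoints and utf8_find are made up for this example, and the input is assumed to be valid UTF-8. strlen() still reports bytes, so counting characters means skipping continuation bytes (those of the form 10xxxxxx).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count code points in a valid UTF-8 string by
 * counting every byte that is not a continuation byte (10xxxxxx). */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical helper: byte offset of needle in haystack, or -1 if absent.
 * Plain strstr() is enough: in valid UTF-8 a complete encoded character
 * cannot match starting in the middle of another character. */
long utf8_find(const char *haystack, const char *needle)
{
    const char *p;

    assert(haystack != NULL && needle != NULL);
    p = strstr(haystack, needle);
    return p != NULL ? (long)(p - haystack) : -1L;
}
```

For the string "na\xC3\xAFve" (the word "naïve" in UTF-8), strlen() gives 6 bytes while utf8_codepoints() gives 5, and utf8_find() locates "ve" at byte offset 4.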

Re: Questions about Unicode-aware C programs under Linux

2007-04-16 Thread Rich Felker
On Mon, Apr 16, 2007 at 11:33:26AM +0330, Ali Majdzadeh wrote:
> Hello All
> Sorry, if my questions are elementary. As I know, the size of wchar_t data
> type (glibc), is compiler and platform dependent. What is the best practice
> of writing portable Unicode-aware C programs? Is it a good practice

Re: wcwidth and locale

2007-04-16 Thread Rich Felker
On Tue, Apr 17, 2007 at 12:11:12AM +0800, Abel Cheung wrote:
> On 4/11/07, Rich Felker <[EMAIL PROTECTED]> wrote:
> >Indeed, glibc's character data is horribly outdated and incorrect.
> >There are plenty of unsupported nonspacing characters, even characters
> >that were present in Unicode 4.0. It a

Re: wcwidth and locale

2007-04-16 Thread Abel Cheung
On 4/11/07, Rich Felker <[EMAIL PROTECTED]> wrote: Indeed, glibc's character data is horribly outdated and incorrect. There are plenty of unsupported nonspacing characters, even characters that were present in Unicode 4.0. It also considers nonspacing letters to be non-alphabetic, which is a real

Questions about Unicode-aware C programs under Linux

2007-04-16 Thread Ali Majdzadeh
Hello All Sorry, if my questions are elementary. As I know, the size of wchar_t data type (glibc), is compiler and platform dependent. What is the best practice of writing portable Unicode-aware C programs? Is it a good practice to use Unicode literals directly in a C program? I have experienced s
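To make the wchar_t size concern concrete, here is a small sketch that checks at run time whether a given code point fits in wchar_t on the current platform. The function name wchar_holds_code_point is an assumption for this example. On glibc wchar_t is 32 bits wide, so every code point fits; on platforms with a 16-bit wchar_t, code points above U+FFFF do not.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: return 1 if the code point cp can be stored in a
 * single wchar_t on this platform, 0 otherwise.
 * WCHAR_MAX comes from <wchar.h> and reflects the platform's wchar_t. */
int wchar_holds_code_point(unsigned long cp)
{
    assert(cp <= 0x10FFFFUL); /* only meaningful up to U+10FFFF */
    return cp <= (unsigned long)WCHAR_MAX;
}
```

A program that must behave the same everywhere could use a check like this, or simply avoid wchar_t and keep text in UTF-8 in char arrays.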