"kefas" wrote:
> 1. U+034F CGJ, Combining Grapheme Joiner, is
> displayed as a tall rectangle in MSKLCexe-test and as
> a capital square in OutlookExpress A<U+034F>E a<U+034F>e<U+034F>a<U+034F>e. But
> CGJ "has no visible glyph"! Thus CGJ is not
> implemented correctly in Arial Unicode MS. Or are the
> editors not imp
Hello,
Can anyone please tell me how to convert from
UTF-8 to shift-JIS?
Please let me know if there is any formula to do it other than
using the ready-made functions provided by Perl, because these functions do not
provide mappings for all characters.
Warm Regards, Pragati Desai.
Cybag
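There is no arithmetic formula relating UTF-8 to Shift-JIS: the Japanese character sets are not laid out in Unicode order, so the conversion is table-driven. The following is only a sketch, in Python rather than Perl, using Python's built-in `shift_jis` and `cp932` codecs. The function name `utf8_to_sjis` and the `?` substitution are assumptions made for illustration. It reports which characters the chosen table cannot map, which is usually the first thing to check when "not all characters" convert.

```python
def utf8_to_sjis(data: bytes, codec: str = "shift_jis"):
    """Decode UTF-8 bytes and re-encode them with a Shift-JIS variant.

    Returns (encoded_bytes, unmapped). `unmapped` lists every character
    the codec's table has no entry for; each is replaced by b"?".
    (Hypothetical helper, not part of any library.)
    """
    text = data.decode("utf-8")
    out = bytearray()
    unmapped = []
    for ch in text:
        try:
            out += ch.encode(codec)
        except UnicodeEncodeError:
            # No mapping in this table: record it and substitute.
            unmapped.append(ch)
            out += b"?"
    return bytes(out), unmapped


sjis, missing = utf8_to_sjis("日本語".encode("utf-8"))
```

If characters are reported as unmapped, trying a vendor variant such as `cp932` may help, since it carries extensions that the plain `shift_jis` table lacks. Characters outside every Shift-JIS variant cannot be converted by any method.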
The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:
http://www.unicode.org/review/
Review period for the new item closes on January 31, 2005.
Please see the page for links to discussion and relevant documents.
Brief
On 24/11/2004 22:23, Peter Kirk wrote:
On 24/11/2004 22:00, Asmus Freytag wrote:
...
The sequence SPACE NBSP *does* not allow a break after the SPACE
under the line breaking rules we publish in UAX#14.
The common usage in HTML is to use one or more NBSP followed by
SPACE to mark a wider space,
At 04:53 PM 11/24/2004, Peter Kirk wrote:
On 24/11/2004 22:23, Peter Kirk wrote:
On 24/11/2004 22:00, Asmus Freytag wrote:
...
The sequence SPACE NBSP *does* not allow a break after the SPACE under
the line breaking rules we publish in UAX#14.
I tried to change does not into *does* and missed dele
There have been a number of updates to Public Review Issues on the Unicode
web site.
The comment periods for Public Review Issues 51, 53, 54, and 56 have been
extended to January 31, 2005. During the review period, new drafts may be
issued, and if so, they will be announced at the time.
On 24/11/2004 22:00, Asmus Freytag wrote:
...
The sequence SPACE NBSP *does* not allow a break after the SPACE under
the line breaking rules we publish in UAX#14.
The common usage in HTML is to use one or more NBSP followed by SPACE
to mark a wider space, that allows a break at the end. NBSPs a
On 24/11/2004 20:22, Jony Rosenne wrote:
Ketiv and Qere, where two different words are written together, are not plain
text and are thus out of scope for Unicode.
For Unicode, one could either choose one version or the other or write them
both separately.
The forms I refer to are the ones print
Jony Rosenne wrote at 10:22 PM on Wednesday, November 24, 2004:
>Ketiv and Qere, where two different words are written together, are not plain
>text and are thus out of scope for Unicode.
Actually, it's the vowels of one word written with the consonants of
another (or just written by themselves w
Jony Rosenne wrote:
This isn't what I said. I said it isn't a Unicode problem because it isn't
plain text.
And I don't understand how you are making this distinction between writing two words
separately being plain text and combining them being not plain text. In what way is it not
plain text? W
At 04:23 PM 11/23/2004, Chris Jacobs wrote:
Now, this implies that UTF-8 does interpret U+ as an ASCII NULL
control char.
This is incompatible with using it as a string terminator.
Except that it's up to you how to interpret the C0 control codes in Unicode.
You can do it according to ISO 6429
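The conflict described above can be seen directly: UTF-8 encodes U+0000 as the single byte 0x00, so a reader that treats 0x00 as a terminator cannot see anything after it. This is a minimal Python sketch; `c_string_view` is a hypothetical helper that mimics what a NUL-terminated reader would return.

```python
def c_string_view(data: bytes) -> bytes:
    """Return the bytes a NUL-terminated reader would see (up to first 0x00)."""
    return data.split(b"\x00", 1)[0]


text = "ab\u0000cd"
encoded = text.encode("utf-8")      # U+0000 becomes one 0x00 byte
visible = c_string_view(encoded)    # everything after the NUL is lost
```

Carrying an explicit length alongside the buffer avoids the ambiguity, at the cost of not being able to pass the data to APIs that expect terminated strings.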
At 04:36 AM 11/24/2004, Peter Kirk wrote:
I understand that the proposed INVISIBLE CHARACTER was rejected at the
recent UTC meeting. I presume that the intention is that NBSP should be
used instead.
At the moment, NBSP is the only sanctioned base character without 'ink'.
There are cases of words
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of John Hudson
> Sent: Wednesday, November 24, 2004 11:01 PM
> To: 'Unicode List'
> Subject: Re: No Invisible Character - NBSP at the start of a word
>
>
> Jony Rosenne wrote:
>
> > Ketiv and Qere, we
Jony Rosenne wrote:
Ketiv and Qere, where two different words are written together, are not plain
text and are thus out of scope for Unicode.
Writing them in a combined way results in some sequences of characters that are very
problematic from a rendering perspective, but there is a long standing
Ketiv and Qere, where two different words are written together, are not plain
text and are thus out of scope for Unicode.
For Unicode, one could either choose one version or the other or write them
both separately.
Jony
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PRO
Tim Greenwood asked:
> > All of the spacing combining marks (general category Mc) except
> > musical symbols have a canonical combining class of 0. So, for example
> >
> > 0B95 (TAMIL LETTER KA) 0BC7 (TAMIL VOWEL SIGN EE - stands to the left
> > of the consonant) 0BBE (TAMIL VOWEL SIGN AA - on th
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> Of Tim Greenwood
> All of the spacing combining marks (general category Mc) except
> musical symbols have a canonical combining class of 0. So, for example
>
> 0B95 (TAMIL LETTER KA) 0BC7 (TAMIL VOWEL SIGN EE - stands to the left
>
I understand that the proposed INVISIBLE CHARACTER was rejected at the
recent UTC meeting. I presume that the intention is that NBSP should be
used instead.
There are cases of words which start with spacing combining marks, for
which there are no separate Unicode characters. For example, there
All of the spacing combining marks (general category Mc) except
musical symbols have a canonical combining class of 0. So, for example
0B95 (TAMIL LETTER KA) 0BC7 (TAMIL VOWEL SIGN EE - stands to the left
of the consonant) 0BBE (TAMIL VOWEL SIGN AA - on the right) is
canonically distinct from 0B95
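The properties mentioned here can be checked with Python's standard `unicodedata` module. This is only a sketch: the helper `canonically_equivalent` is a hypothetical name, and it compares NFD forms, which is one common way to test canonical equivalence. Because both vowel signs have combining class 0, normalization never reorders them, so the two orders stay distinct.

```python
import unicodedata

KA = "\u0b95"  # TAMIL LETTER KA
EE = "\u0bc7"  # TAMIL VOWEL SIGN EE (displayed to the left)
AA = "\u0bbe"  # TAMIL VOWEL SIGN AA (displayed to the right)


def canonically_equivalent(a: str, b: str) -> bool:
    """Compare two strings by their canonical decompositions (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# General category and canonical combining class of each vowel sign.
props = {ch: (unicodedata.category(ch), unicodedata.combining(ch))
         for ch in (EE, AA)}

ee_then_aa = KA + EE + AA
aa_then_ee = KA + AA + EE
same = canonically_equivalent(ee_then_aa, aa_then_ee)
```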
1. U+034F CGJ, Combining Grapheme Joiner, is
displayed as a tall rectangle in MSKLCexe-test and as
a capital square in OutlookExpress AÍE aÍeÍaÍe. But
CGJ "has no visible glyph"! Thus CGJ is not
implemented correctly in Arial Unicode MS. Or are the
editors not implemented correctly? Should A+
On Wednesday, November 24th, 2004 04:02Z Harshal Trivedi wrote:
> How can I determine the end of a UCS-2/UCS-4 string while encoding it in a C
> program?
It depends how you are storing and more importantly managing it.
If you consider it as mere arrays of uint16_t/uint32_t, with your own
function
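One convention for the array approach is to terminate with a full-width zero code unit rather than a zero byte. The sketch below models that in Python with `struct`; it assumes little-endian storage, and `code_units_until_nul` is a hypothetical name. It mirrors a C loop that walks a `uint16_t` or `uint32_t` array until it reads 0.

```python
import struct


def code_units_until_nul(buf: bytes, width: int) -> int:
    """Count code units before the first all-zero unit.

    `width` is 2 for UCS-2 or 4 for UCS-4; little-endian is assumed.
    If no terminator is found, the number of whole units is returned.
    """
    fmt = {2: "<H", 4: "<I"}[width]
    count = 0
    for offset in range(0, len(buf) - width + 1, width):
        (unit,) = struct.unpack_from(fmt, buf, offset)
        if unit == 0:
            return count
        count += 1
    return count


ucs2 = "AB".encode("utf-16-le") + b"\x00\x00" + b"junk"
ucs4 = "AB".encode("utf-32-le") + b"\x00\x00\x00\x00"
n2 = code_units_until_nul(ucs2, 2)
n4 = code_units_until_nul(ucs4, 4)
first_zero_byte = ucs2.index(b"\x00")  # where a byte-wise scan would stop
```

A byte-oriented scan stops inside the first character, because ASCII-range code units contain zero bytes. That is why byte-wise `strlen`-style functions cannot be used on such buffers.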