Doug Ewell posted:
The use of NULL to terminate strings is a basic part of the Standard C
library, not just certain APIs. As such, it doesn't seem right to call
this a "misuse" of the character.
But ISO 646, in defining ASCII, states as the definition of the control
character NULL:
"A control
> Subject: Re: discovering code points with embedded nulls
>
>
> What is that strange file (winmail.dat) attached to
> your mail? I really
> hope that it isn't a virus.
>
> Stefan
>
> Kent Karlsson wrote:
Stefan Persson wrote:
> What is that strange file (winmail.dat) attached to your
> mail? I really hope that it isn't a virus.
http://support.microsoft.com/default.aspx?scid=KB;en-us;q241538
(Whether MS Outlook is a virus or not, is still a debated issue. :-)
_ Marco
Doug Ewell wrote:
> Kent Karlsson wrote:
>
> >> From what I'm hearing from you all is that a null in UTF-8 is
> >> for termination and termination only.
> >> Is this correct?
> >
> > No, NULL is a character (actually a control character) among many
> > others. However, many C/C++ APIs (mis)use NU
What is that strange file (winmail.dat) attached to your mail? I really
hope that it isn't a virus.
Stefan
Kent Karlsson wrote:
From what I'm hearing from you all is that a null in UTF-8 is
for termination and termination only.
Is this correct?
No, NULL is a character (actually a contro
Kent Karlsson wrote:
>> From what I'm hearing from you all is that a null in UTF-8 is
>> for termination and termination only.
>> Is this correct?
>
> No, NULL is a character (actually a control character) among many
> others. However, many C/C++ APIs (mis)use NULL as a string terminator
> since
> From what I'm hearing from you all is that a null in UTF-8 is
> for termination and termination only.
> Is this correct?
No, NULL is a character (actually a control character) among many
others. However, many C/C++ APIs (mis)use NULL as a string terminator
since NULL isn't very useful for othe
Erik followed up:
> From what I'm hearing from you all is that a null
> in UTF-8 is for termination and termination only.
> Is this correct?
Not quite. A null byte (0x00) in UTF-8 is only a
representation of the NULL character (U+0000). It can
be present in UTF-8 for whatever purposes one might
I'm replying to myself, here.
Thank you all for so many quick and helpful responses.
As most of you pointed out, I misread the documentation, which covers multi-byte
strings only (and not wide strings).
So I was brain dead when I asked about encodings other than UTF-8.
The doc states (in
[EMAIL PROTECTED] wrote:
I'm dealing with an API that claims it doesn't support unicode characters with embedded nulls.
...
Test all constituent bytes for 0x00.
This depends on the encoding form you are using (and the API is expecting):
- UTF-8 encodes a Unicode string into a sequence of by
Erik Ostermueller wrote:
> I'm dealing with an API that claims it doesn't support
> unicode characters with embedded nulls.
> I'm trying to figure out how much of a liability this is.
If by "embedded nulls" they mean bytes of value zero, that library can
*only* work with UTF-8. The other two UTF'
Are you sure the API doesn't support Unicode _characters_ with embedded
NULs? Or does it fail to support Unicode _strings_ with embedded NULs?
If it really is the former, no character in UTF-8 (except, of course,
U+0000) will include a NUL byte. In UTF-16, it will be any character of the
form U+00
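
The byte-level point made in the replies above can be checked with a short C sketch. It is only an illustration under stated assumptions: utf8_encode, utf16be_encode and has_zero_byte are hypothetical helper names (they are not part of the API under discussion), the input is assumed to be a valid Unicode scalar value, and UTF-16 is shown in big-endian byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: encode one code point as UTF-8.
   Assumes cp is a valid scalar value; returns the byte count (1 to 4). */
size_t utf8_encode(uint32_t cp, unsigned char out[4]) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical helper: encode one code point as big-endian UTF-16.
   Returns the byte count (2, or 4 for a surrogate pair). */
size_t utf16be_encode(uint32_t cp, unsigned char out[4]) {
    if (cp < 0x10000) {
        out[0] = (unsigned char)(cp >> 8);
        out[1] = (unsigned char)(cp & 0xFF);
        return 2;
    }
    cp -= 0x10000;
    uint32_t hi = 0xD800 | (cp >> 10);   /* high surrogate */
    uint32_t lo = 0xDC00 | (cp & 0x3FF); /* low surrogate */
    out[0] = (unsigned char)(hi >> 8);
    out[1] = (unsigned char)(hi & 0xFF);
    out[2] = (unsigned char)(lo >> 8);
    out[3] = (unsigned char)(lo & 0xFF);
    return 4;
}

/* Returns nonzero if any of the n bytes in buf is 0x00. */
int has_zero_byte(const unsigned char *buf, size_t n) {
    return memchr(buf, 0, n) != NULL;
}

/* A few spot checks of the claim, not an exhaustive proof. */
void demo_checks(void) {
    unsigned char b[4];
    size_t n;

    n = utf8_encode(0x0041, b);     /* 'A' in UTF-8: 41 */
    assert(!has_zero_byte(b, n));

    n = utf16be_encode(0x0041, b);  /* 'A' in UTF-16BE: 00 41 */
    assert(has_zero_byte(b, n));

    n = utf8_encode(0x0000, b);     /* only U+0000 yields a zero byte */
    assert(has_zero_byte(b, n));
}
```

In UTF-8 every lead byte of a multi-byte sequence is at least 0xC0 and every continuation byte is at least 0x80, so a 0x00 byte can only come from U+0000 itself. A NUL-terminated byte API therefore remains usable with UTF-8 text, whereas in UTF-16 even plain ASCII letters carry a zero byte.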