Title: RE: My Query

> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]] On Behalf Of Harshal Trivedi
> Sent: Tuesday, November 23, 2004 3:42 AM

> How can I make sure that a UTF-8 string has terminated
> while encoding it, as compared to a C program string, which ends
> with the '\0' (NUL) character?
>
> -> Is there any special symbol or procedure to determine the end of a UTF-8
> string, or is the ASCII NUL '\0' used as is to indicate that?

        You can use the method used by C (often called "C strings" or "null-terminated strings"), in which a byte with value 0 signals the end of the string.  However, as recently (and vigorously) discussed here, this transfer encoding syntax has the potentially problematic property of prohibiting use of the character at code point 0, NUL, inside the string.  This does not tend to be a problem for most uses, since UTF-8 never produces a 0 byte for any other character, but one should still be aware of it.
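
        As a minimal sketch of the null-terminated approach, the C function below walks a UTF-8 string up to the first 0 byte and counts code points by skipping continuation bytes (those of the form 10xxxxxx).  The function name is made up for illustration, and the code assumes the input is well-formed UTF-8; it does no validation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the code points in a NUL-terminated UTF-8
   string.  Assumes well-formed UTF-8; no validation is performed. */
size_t utf8_cstr_code_points(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* Every byte except a continuation byte (10xxxxxx) starts a
           new code point. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

        For the string "h\xC3\xA9llo" ("héllo"), strlen() reports 6 bytes while the function above reports 5 code points.  For "ab\0cd", both stop after "ab", which is exactly the NUL limitation described above.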

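        Another method is length-prefixed strings, the general approach taken by Java strings and MFC's CString class, where the length of the string data is recorded separately from the data (typically ahead of it) and the bytes are handled opaquely.  For UTF-8 you would record the length in bytes, and a 0 byte inside the data then carries no special meaning.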
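
        A length-prefixed layout might look like the sketch below.  The 4-byte big-endian prefix and the names lp_write and lp_read are assumptions chosen for illustration; this is not how Java or CString store their data internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 4-byte big-endian byte count, then the UTF-8 bytes.
   Returns the number of bytes written, or 0 if the buffer is too small. */
size_t lp_write(unsigned char *out, size_t cap,
                const unsigned char *data, uint32_t len)
{
    assert(out != NULL);
    if (cap < 4 || cap - 4 < len)
        return 0;
    out[0] = (unsigned char)(len >> 24);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;
    if (len > 0)
        memcpy(out + 4, data, len);
    return 4 + (size_t)len;
}

/* Reads the prefix and points *payload at the string bytes.
   Returns 0 on success, -1 if the input is shorter than the prefix claims. */
int lp_read(const unsigned char *in, size_t avail,
            const unsigned char **payload, uint32_t *len)
{
    uint32_t n;

    assert(in != NULL && payload != NULL && len != NULL);
    if (avail < 4)
        return -1;
    n = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
        ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    if (avail - 4 < n)
        return -1;
    *payload = in + 4;
    *len = n;
    return 0;
}
```

        Because the reader trusts the stored count rather than scanning for a terminator, a payload such as the three bytes 'a', 0x00, 'b' survives intact.  The price is that writer and reader must agree on the prefix width and byte order.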

        Either method will do, as there is no TES or data structure explicitly assigned to UTF-8.  Use the one best suited for your application.  You may want to read UTR 17, "Character Encoding Model", at http://www.unicode.org/reports/tr17/.


        HTH,

/|/|ike