John Cowan <jcowan at reutershealth dot com> wrote:

> Most languages other than C define a string as a sequence of
> characters rather than a sequence of non-null characters. The
> repertoire of characters that can exist in strings usually has a lower
> bound, but its full magnitude is implementation-specific. In Java,
> exceptionally, the repertoire is defined by the standard rather than
> the implementation, and it includes U+0000. In any case, I can think
> of no language other than C which does not support strings containing
> U+0000 in most implementations.

In Pascal, which I learned before C, strings were implemented as a count of characters followed by the characters themselves. Unfortunately, the count was a single byte, and the resulting maximum string length of 255 was a much greater inconvenience in real life than C's prohibition against a string containing 0x00. I don't know whether modern Pascal implementations still work this way.

A 32-bit length count, followed by an array of N arbitrary Unicode characters, would probably be the best implementation today.

I'd still like to know what practical, real-world TEXT-related benefits would derive from allowing U+0000 in strings of TEXT in a C program.

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/
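
P.S. For concreteness, here is a minimal sketch in C of the counted layout described above: a 32-bit length followed by N characters. It is only an illustration, not any existing library's API. The names CountedString, cs_new and cs_count are hypothetical, and storing each character as a 32-bit code point is my assumption about what "arbitrary Unicode characters" would mean in practice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted-string type: a 32-bit length followed by the
   characters themselves. Assumes one 32-bit code point per character.
   U+0000 is an ordinary element here, not a terminator. */
typedef struct {
    uint32_t length;   /* number of code points, not bytes */
    uint32_t chars[];  /* flexible array member (C99) */
} CountedString;

/* Build a counted string from n code points.
   Returns NULL if the size would overflow or allocation fails. */
static CountedString *cs_new(const uint32_t *src, uint32_t n)
{
    CountedString *s;

    if (n > (SIZE_MAX - sizeof *s) / sizeof s->chars[0])
        return NULL;
    s = malloc(sizeof *s + (size_t)n * sizeof s->chars[0]);
    if (s == NULL)
        return NULL;
    s->length = n;
    if (n > 0)
        memcpy(s->chars, src, (size_t)n * sizeof s->chars[0]);
    return s;
}

/* Count occurrences of one code point. The loop is bounded by the
   stored length, so an embedded U+0000 does not end the scan. */
static uint32_t cs_count(const CountedString *s, uint32_t cp)
{
    assert(s != NULL);
    uint32_t hits = 0;
    for (uint32_t i = 0; i < s->length; i++)
        if (s->chars[i] == cp)
            hits++;
    return hits;
}
```

With this layout the limit on string length comes from the width of the count field rather than from a reserved character value, which is the trade-off between the Pascal and C approaches discussed above. The caller releases the string with free().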