Freddie Unpenstein wrote:
> From: "Nikolai Weibull", 09/10/2008 02:01:
>> On Wed, Oct 8, 2008 at 13:20, Havoc Pennington <[EMAIL PROTECTED]> wrote:
>>> Another way to put it, I don't think nul bytes are a user-explainable
>>> concept. If anybody who isn't a programmer sees (how? what's the
>>> glyph?) a nul byte in a _text_ file, that's just bizarre.
>> How is "oh, you can't open /that/ file in a text editor because it has
>> a character in it that isn't a user-explainable concept" (I'm not
>> trying to make a straw man argument) better than simply opening the
>> file, displaying the NUL as a box with 0000 in it (like Pango does for
>> other characters it can't render) and be done with it? I don't see
>> how it's the program's responsibility to state what can and what cannot
>> be in a file the user wants to open, as long as the file is valid in
>> the chosen encoding.
>
> Why not just adopt the old thing of encoding NULLs and other non-UTF-8
> characters as safe UTF-8 equivalents...?

Because they are not valid UTF-8? And the moment we give up dealing
with valid UTF-8 a whole other can of worms opens up.

behdad

> I've seen the practice of
> representing \0 as \UC080 (or however it's specified) recommended in a
> secure programming document as a measure for avoiding accidents
> (especially when you're using someone else's libraries), and plenty of
> other software and toolkits do it. C's use of NULLs is an
> implementation detail of C, it shouldn't be inflicted on everything else.
>
> There's no need for every API function taking a text string (as opposed
> to GLib functions that may well be storing binary strings) to also have
> a version that takes a length, and for every string value throughout GTK
> to carry around a length value and all the extra work needed to work
> with length/buffer pairs over simple NULL-terminated strings. Especially
> when most of them don't handle binary anyhow.
>
> Still doesn't answer the rendering issue, but personally, a NULL
> shouldn't have any special meaning in a string to be displayed. Whether
> it gets rendered as a box with 0's, or a zero width solid space, or
> whatever else, is another issue entirely. But it shouldn't require extra
> effort to handle it... Simply label it a binary character, and encode it
> up in the binary-to-UTF-8 functions. It can then be displayed however
> someone else decides, and be converted back into the original NUL by a
> UTF-8-to-binary function later on.
>
>
> Fredderic
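
For reference, a minimal sketch in C of the round trip proposed in the quoted text, and of the objection to it. This is not code from either poster or from GLib: the function names `nul_escape`, `nul_unescape` and `utf8_is_strict` are hypothetical, and the sketch assumes that "\UC080" means the two-byte overlong sequence 0xC0 0x80 and that the input does not already contain that sequence (true for any valid UTF-8 input, where 0xC0 never appears).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copy `len` bytes from `in` to `out`, writing every
 * 0x00 byte as the overlong pair 0xC0 0x80. `out` must have room for
 * 2 * len + 1 bytes. Returns the escaped length; a terminating NUL is
 * appended but not counted, so the result works as a plain C string. */
size_t nul_escape(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == 0x00) {
            out[n++] = 0xC0;
            out[n++] = 0x80;
        } else {
            out[n++] = in[i];
        }
    }
    out[n] = 0x00;
    return n;
}

/* Hypothetical inverse: turn each 0xC0 0x80 pair back into 0x00.
 * `out` must have room for `len` bytes. Returns the decoded length. */
size_t nul_unescape(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == 0xC0 && i + 1 < len && in[i + 1] == 0x80) {
            out[n++] = 0x00;
            i++; /* skip the continuation byte */
        } else {
            out[n++] = in[i];
        }
    }
    return n;
}

/* Strict UTF-8 check in the spirit of RFC 3629: rejects overlong forms,
 * surrogates, and code points above U+10FFFF. */
bool utf8_is_strict(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t need;            /* number of continuation bytes expected */
        unsigned long cp, min;  /* decoded value, smallest legal value */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return false;      /* stray continuation or invalid lead byte */

        if (len - i - 1 < need) return false;   /* truncated sequence */
        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;       /* overlong, out of range, or surrogate */
        i += need + 1;
    }
    return true;
}
```

Escaping the three bytes 'a', 0x00, 'b' gives the four bytes 61 C0 80 62, which can be passed around as an ordinary NUL-terminated string and unescaped back to the original. The same four bytes fail `utf8_is_strict`, because 0xC0 0x80 decodes to 0, below the minimum of 0x80 for a two-byte sequence. Any strict validator along the pipeline would therefore have to be taught about the exception.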
_______________________________________________
gtk-devel-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gtk-devel-list
