On Tuesday 20 March 2007 21:17, Carsten Haese wrote:
> On Tue, 2007-03-20 at 20:26 -0400, jim-on-linux wrote:
> > I have been getting the same thing using
> > SQLite3 when extracting data from an SQLite3
> > database.
>
> Many APIs that exchange data choose to exchange
> text in Unicode because that eliminates
> encoding uncertainty. Whether an API uses
> Unicode would probably be noted somewhere in
> its documentation.
>
> > I take the database info which is in a list
> > and do
> >
> > name = str.record[0]
>
> You probably mean str(record[0]).

Yes, > > > rather than > > name = record[0] > > > > So far, I havn't had any problems. > > For some reason the unicode u is removed. > > I havn't wanted to spend the time to figure > > out why. > > As a software engineer, I'd get worried if I > didn't know why the code I wrote works. Maybe > that's just me. I don't disagree, but sometime depending on the situation, time to investigate is a luxury. However, ( If you don't have the time to do it right the first time when will you have the time to fix it.) > > Unicode is not rocket science. I suggest you > read http://www.amk.ca/python/howto/unicode to > demystify what Unicode objects are and do. > > With str(), you're asking the Unicode object > for its byte string interpretation, which > causes the Unicode object to give you its > encoding in the system default encoding. The > default encoding is normally ascii. That can be > tweaked for your particular Python > installation, but if you need an encoding other > than ascii it's recommended that you explicitly > encode and decode from and to Unicode, lest you > risk writing non-portable code. > > Using str() coercion of Unicode objects will > work well enough until you run into a string > that contains characters that can't be > represented in the default encoding. Right, even though None or null are not strings they are common enough to cause a problem. Try to run a loop through a list with None or null in it. Example, x = str(list[2]) when list[2] = null or None, problems. Easy to fix but more work. I'll check the web site out. Thanks for the update, Jim-on-linux > Once that > happens, you're better off explicitly encoding > the Unicode object into a well-defined encoding > on input, or, even better, just work with > Unicode objects internally and only encode to > byte strings when absolutely necessary, such as > when outputting to a file or to the console. > > Hope this helps, > > Carsten. -- http://mail.python.org/mailman/listinfo/python-list