Hey Brian, I am trying to deal with a similar situation in my app. I would love it if someone offered a good general solution for dealing with unexpected non-UTF-8 data in a string. There must be a way because my web browsers can display the data without crashing! =)
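One candidate for a "best effort" approach, as a minimal sketch only: decode the incoming bytes as UTF-8 and let Python substitute U+FFFD for anything that is not valid. This assumes the value reaches your handler as a raw byte string, which I have not confirmed for every code path. decode_best_effort is a made-up helper name, not an App Engine API.

```python
def decode_best_effort(raw):
    """Return text for raw, never raising on invalid UTF-8.

    Hypothetical helper: assumes raw is either a byte string or
    text that has already been decoded.
    """
    if not isinstance(raw, bytes):
        # Already text, so there is nothing to decode.
        return raw
    # "replace" swaps each undecodable byte sequence for U+FFFD
    # instead of raising UnicodeDecodeError.
    return raw.decode("utf-8", "replace")


# Valid UTF-8 passes through unchanged.
clean = decode_best_effort(b"ma\xc3\xb1ana")

# A Latin-1 encoded n-with-tilde (0xF1) is not valid UTF-8 here,
# so it becomes a single replacement character.
damaged = decode_best_effort(b"ma\xf1ana")
```

The original bytes are lost once they are replaced, so this only suits cases where an occasional garbage character is acceptable.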
The admittedly poor workaround I have been using is a "CheckStringProperty" that I use in place of Google's db.StringProperty:

    import logging
    from google.appengine.ext import db

    class CheckStringProperty(db.StringProperty):
        def validate(self, value):
            try:
                value = unicode(value)
            except UnicodeDecodeError:
                # The byte string contains non-ASCII bytes, so escape them (e.g. '\xe9').
                logging.warning("Encoding bad string with escapes: %r", value)
                value = unicode(str(value).encode('string_escape'))
            # Run the normal StringProperty checks on the cleaned value.
            return super(CheckStringProperty, self).validate(value)

Lenza
blog.lenza.org

On Feb 25, 1:17 pm, Brian <bsmcconn...@gmail.com> wrote:
> Hi. I have observed a sporadic Unicode related bug that appears to be
> browser specific. It causes a db.put() to fail. I am not doing
> anything unusual with the incoming text except to put it in a
> variable, and then store in a record. I have determined that this
> issue is browser specific, and probably also related to the user's
> configuration for language preferences etc.
>
> I have run out of ideas for tracking this down. My understanding of
> the CGI interface is that it is supposed to force everything to UTF-8
> by default. It would be nice to be able to set some global options to
> manage encodings on incoming queries. I suspect what is going on is
> the texts are being sent in something besides UTF-8 but Python thinks
> they are Unicode.
>
> Unfortunately, Python crashes when in a situation like this. It would
> be far better if the CGI interface would make a best effort to deal
> with the incoming text, inserting garbage characters where necessary.
> That is "less damaging" than a outright failure, as most people can
> live with an occasional [] in place of a tilde, etc.
>
> On this subject, there really needs to be better documentation on
> Unicode, encoding conversions, etc. It is poorly documented in Python
> also, and I am sure a lot of people are making the same mistakes in
> trying to figure out what works and what breaks.
>
> Thanks,
>
> Brian McConnell