Re: textile mysql unicode and newforms
Hi Waylan

I tried the Unicode branch on my local machine and it didn't make any difference. I'm still keeping it on my local box though.

So I tweaked the code a bit. The following now sits at the top of the save override rather than at the bottom, and it's working as expected.

    if self.summary:
        self.summary = self.summary.encode('ascii', 'ignore')
        self.summary_html = textile.textile(self.summary)

That way we simply don't pass the pasted apostrophes to textile at all, and textile works, which is what I was after in the first place.

This works on the production server, which isn't running the Unicode branch yet. Soon though, when we get some time to test.

I think I might have some trouble when my boss tells me they want to use the same code base to do a Welsh site though; lots of funny characters in there. And the country :-)

On 19/06/07, Waylan Limberg <[EMAIL PROTECTED]> wrote:
> I suspect the Unicode branch [1] addresses the issues you are having.
> A few weeks back there was a call for testers as it is now feature
> complete. I'd suggest giving that a try.
>
> [1]: http://code.djangoproject.com/wiki/UnicodeBranch

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en
-~--~~~~---
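The cost of the `encode('ascii', 'ignore')` workaround in the message above can be sketched as follows. This is a minimal illustration written for Python 3 so that it runs today; the thread itself uses Python 2.5, where `unicode.encode()` handles these error modes the same way. The helper names and sample strings are hypothetical, and the thread does not establish whether PyTextile 2.0.10 passes character references through unchanged.

```python
# Python 3 sketch; helper names and sample strings are illustrative only.

def to_ascii_dropping(text):
    """Mirror the workaround: silently drop every non-ASCII character."""
    return text.encode('ascii', 'ignore')


def to_ascii_entities(text):
    """Alternative: keep each non-ASCII character as a numeric reference."""
    return text.encode('ascii', 'xmlcharrefreplace')


pasted = "It\u2019s here"      # curly apostrophe (U+2019) as pasted from Word
welsh = "t\u0177 d\u0175r"     # the Welsh words "tŷ" and "dŵr"

print(to_ascii_dropping(pasted))   # b'Its here'
print(to_ascii_dropping(welsh))    # b't dr'
print(to_ascii_entities(pasted))   # b'It&#8217;s here'
print(to_ascii_entities(welsh))    # b't&#375; d&#373;r'
```

Dropping characters is tolerable for a stray apostrophe but destroys Welsh text, which is the concern raised at the end of the message.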
Re: textile mysql unicode and newforms
I suspect the Unicode branch [1] addresses the issues you are having. A few weeks back there was a call for testers as it is now feature complete. I'd suggest giving that a try.

[1]: http://code.djangoproject.com/wiki/UnicodeBranch

--
Waylan Limberg
[EMAIL PROTECTED]
Re: textile mysql unicode and newforms
Here's the add_event in my views which uses the newforms and calls the save.

http://pastie.textmate.org/71706

It will probably help :-)
textile mysql unicode and newforms
Hello everyone

I'm running Django from the trunk, so using the most up-to-date version, Python 2.5 with PyTextile 2.0.10, and MySQL 5.0.2 with all settings set to utf-8. The Django content type is utf-8 as well.

I'm overriding the save method on events using newforms; we're textiling the input for an HTML field. Here's what I mean.

    def save(self):
        import textile
        if self.body:
            self.body_html = textile.textile(self.body)
        super(Event, self).save()

It fails with this error:

    Exception Value: 'ascii' codec can't decode byte 0xb4 in position 0: ordinal not in range(128)
    Exception Location: /usr/local/lib/python2.5/site-packages/textile.py in glyphs, line 2418

My textile settings are:

    # Set your encoding here.
    ENCODING = 'utf8'

    # Output? Non-ASCII characters will be automatically
    # converted to XML entities if you choose ASCII.
    OUTPUT = 'utf8'

I tried changing OUTPUT to ascii in textile but got the same error, so to me it looks like the form is sending unicode to textile, which it can't understand.

One way around this is to manipulate self.body prior to passing it to textile, like this.

    self.body = self.body.decode('utf-8')
    self.body = self.body.encode('ascii', 'ignore')

This forces ASCII to be passed to textile, which it likes a lot, and it works.

But if a user now copies and pastes the dreaded apostrophe from Word, or another special character unique to Word, it fails with this error.

    Exception Value: 'ascii' codec can't encode character u'\u2019' in position 5: ordinal not in range(128)
    Exception Location: /usr/local/lib/python2.5/encodings/utf_8.py in decode, line 16

If I run the super save earlier in the save definition, after removing the textiling of the body section, then read the data back out of the database further down in the save definition and save it again, like this

    e = Event.objects.get(id=new_id)
    if e.body:
        e.body_html = textile.textile(e.body)
    super(Event, e).save()

it all works fine, with no encoding or decoding needed for pasted apostrophes or anything.

Here's the paste of the relevant part of the form, with certain sections commented out so you can see what I mean.

http://pastie.textmate.org/71702

I found this on the Google group, from Ivan Sagalaev:

    To summarize: your storage (a database) and your input/output (the web) really should use utf-8 to avoid problems with "strange" characters. If you deal internally with unicode (which newforms produce for you) then for now you should explicitly encode from it to utf-8 until Django starts doing it automatically.

I've also been reading this thread on the Google developers group, and I'm now completely confused as to what is going on.

unicode issues in multiple tickets

If anyone can tell me if there is some current status on this, or how it works right now, I'd be really grateful. If I have to encode and decode then I don't mind, not much anyway :-)
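Both tracebacks in the message above are consistent with Python 2 converting between byte strings and unicode through the ASCII codec behind the scenes. The sketch below is an assumption about the cause, not something the thread confirms. It is written for Python 3, which refuses to convert implicitly, so the hidden steps are spelled out by hand; the function names are hypothetical.

```python
# Python 3 sketch of the two failures; function names are illustrative only.

def implicit_decode(byte_string):
    """What Python 2 does when a byte string is combined with unicode:
    it decodes the byte string using the ASCII codec."""
    return byte_string.decode('ascii')


def decode_called_on_unicode(text):
    """What Python 2 does for some_unicode.decode('utf-8'): it first
    encodes the unicode object to ASCII, then decodes the result."""
    return text.encode('ascii').decode('utf-8')


try:
    implicit_decode(b'\xb4')             # a non-ASCII byte, as in textile.py
except UnicodeDecodeError as exc:
    print('decode failed:', exc)

try:
    decode_called_on_unicode('don\u2019t')   # Word's curly apostrophe
except UnicodeEncodeError as exc:
    print('encode failed:', exc)
```

The second case would explain why an *encode* error is reported from inside `utf_8.py`'s `decode`: newforms already hands over unicode, so calling `.decode('utf-8')` on it triggers a hidden ASCII encode that only succeeds for plain ASCII input.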
Re: UNICODE and newforms
On 1/11/07, Adrian Holovaty <[EMAIL PROTECTED]> wrote:
>
> On 1/10/07, Honza Král <[EMAIL PROTECTED]> wrote:
> > 1) migrate django to unicode aka big bang approach
> > + cleanest
> > - most work
> > - may break some code
> >
> > 2) migrate newforms to normal byte strings
> > + doesn't break anything
> > - waste of good work
> > - not elegant (never give in, never surrender ;) )
> >
> > 3) update the rest of django to correctly detect and de/encode the
> > strings using settings.DEFAULT_ENCODING where needed
> > - lot of work that will be thrown away later
> > - messy
>
> I'm a huge proponent of moving all of Django to use Unicode strings
> internally. The newforms library is my first experiment in making
> everything Unicode, in (to my knowledge) the cleanest way possible: It
> accepts bytestrings *or* Unicode objects for all input, and it always
> outputs Unicode.
>
> My goal is to see how the everything-is-Unicode approach goes, and it
> has been pretty good so far. As encoding/decoding bugs have been
> detected, we've fixed them pretty promptly.
>
> If the newforms library does not handle "funky character" input
> correctly in *any* case, please report that as a bug immediately, and
> I or one of the many great patch contributors will fix it. We've
> already done this in several cases (tickets #3266, #3153, #3008), and
> the library already is quite resilient when it comes to "funky" input.

Yes, but the rest of Django is not as resilient to funky input, so you need an extra layer between newforms and the rest so that you can deal with the funky characters yourself. Template filters, for example, don't handle unicode well...

> Regarding moving *all* of Django to use Unicode strings internally,
> that's a separate issue. For now, let's focus on making newforms as
> good as possible, learning from whatever mistakes we make. Once
> newforms is "done," we'll be in good shape to move the rest of Django
> to Unicode.

As I mentioned earlier, newforms is fine; it's about the rest of the system. We have to modify the rest of Django to cope with the unicode output by the newforms library, and if we do that, full unicode support seems only a step away...

> Finally, there's one more force at work here, of which people should
> be aware. We should release Django 1.0 sooner rather than later, and
> while making big changes (such as moving the whole framework to
> Unicode) is certainly worthwhile, each big change like this pushes
> back 1.0. Let's balance the perfectionism with the deadlines.

No argument here.

> Adrian
>
> --
> Adrian Holovaty
> holovaty.com | djangoproject.com

--
Honza Král
E-Mail: [EMAIL PROTECTED]
ICQ#: 107471613
Phone: +420 606 678585
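The "extra layer" between newforms and the rest of the framework could look roughly like the sketch below: a function that takes a dictionary of cleaned form values and encodes every text value to a UTF-8 byte string before handing it on. The function name, the sample data, and the choice of UTF-8 are all assumptions for illustration; this is not code from Django. It is written for Python 3, where `str` plays the role of Python 2's `unicode`.

```python
# Python 3 sketch of a hypothetical boundary layer; not Django code.

def encode_values(data, encoding='utf-8'):
    """Return a copy of `data` with every text value encoded to bytes.

    Non-text values (numbers, dates, None) are passed through untouched.
    """
    encoded = {}
    for key, value in data.items():
        if isinstance(value, str):
            value = value.encode(encoding)
        encoded[key] = value
    return encoded


cleaned = {'title': 'Caf\u00e9', 'seats': 12}   # made-up form output
print(encode_values(cleaned))   # {'title': b'Caf\xc3\xa9', 'seats': 12}
```

Doing the conversion in one place keeps the byte-string assumptions of older code intact, at the price of having to remember the layer on every path out of the form.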
Re: UNICODE and newforms
On 1/10/07, Honza Král <[EMAIL PROTECTED]> wrote:
> 1) migrate django to unicode aka big bang approach
> + cleanest
> - most work
> - may break some code
>
> 2) migrate newforms to normal byte strings
> + doesn't break anything
> - waste of good work
> - not elegant (never give in, never surrender ;) )
>
> 3) update the rest of django to correctly detect and de/encode the
> strings using settings.DEFAULT_ENCODING where needed
> - lot of work that will be thrown away later
> - messy

I'm a huge proponent of moving all of Django to use Unicode strings internally. The newforms library is my first experiment in making everything Unicode, in (to my knowledge) the cleanest way possible: It accepts bytestrings *or* Unicode objects for all input, and it always outputs Unicode.

My goal is to see how the everything-is-Unicode approach goes, and it has been pretty good so far. As encoding/decoding bugs have been detected, we've fixed them pretty promptly.

If the newforms library does not handle "funky character" input correctly in *any* case, please report that as a bug immediately, and I or one of the many great patch contributors will fix it. We've already done this in several cases (tickets #3266, #3153, #3008), and the library already is quite resilient when it comes to "funky" input.

Regarding moving *all* of Django to use Unicode strings internally, that's a separate issue. For now, let's focus on making newforms as good as possible, learning from whatever mistakes we make. Once newforms is "done," we'll be in good shape to move the rest of Django to Unicode.

Finally, there's one more force at work here, of which people should be aware. We should release Django 1.0 sooner rather than later, and while making big changes (such as moving the whole framework to Unicode) is certainly worthwhile, each big change like this pushes back 1.0. Let's balance the perfectionism with the deadlines.

Adrian

--
Adrian Holovaty
holovaty.com | djangoproject.com
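The contract described above, accepting byte strings or Unicode on input and always producing Unicode on output, can be sketched in a few lines. The helper name and the UTF-8 default are assumptions for illustration and are not taken from the newforms source. The sketch targets Python 3, where `bytes` and `str` correspond to Python 2's `str` and `unicode`.

```python
# Python 3 sketch of an "accept either, return text" helper; hypothetical.

def to_text(value, encoding='utf-8'):
    """Return `value` as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


print(to_text(b'na\xc3\xafve'))   # naïve  (decoded from UTF-8 bytes)
print(to_text('na\u00efve'))      # naïve  (already text, returned as is)
print(to_text(42))                # 42     (non-string input is stringified)
```

Normalizing once at the point of entry means the rest of the code only ever sees one string type, which is what makes the everything-is-Unicode approach tractable.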
Re: UNICODE and newforms
> +1 for unicode in django

+1

Regards,
Arthur.
Re: UNICODE and newforms
On 11.01.2007 at 09:39, Gábor Farkas wrote:

> but that all happened approx. 5 months ago, and the branches are still
> not merged (because they are not finished yet), and it does not seem
> that they will be merged in the near future...

+1 for unicode in django, including http://code.djangoproject.com/ticket/952 (and others?)

Is it just a decision between unicodification and other branches for now?
Re: UNICODE and newforms
Honza Král wrote:
> Hi there,
>
> I have been playing around with newforms for some time. since I don't
> come from an English speaking country, I did also put the unicode
> stuff to the test.
>
> It has bitten me on several occasions. I agree that I am not used to
> python unicode strings, but that doesn't make it smaller problem:
> I was bitten by unicode in views, backend, form definitions and even
> templates.
> for example try:
>
> {{ form.field.errors|join:"" }}
>
> when using a non-ascii characters...
>
> It simply doesn't work well, and I doubt it ever will work 100% -
> mixing unicode and bytestrings.
>
> Please do not take me wrong - I still love newforms, I think its the
> right way to go.
>
> I see several ways how to solve this problem:
>
> 1) migrate django to unicode aka big bang approach
> + cleanest
> - most work
> - may break some code

There was a plan to migrate Django to unicode (I once created some simple patches...), but then it was decided that there are currently too many branches, so some branches should first be closed (merged back into the trunk), and then a unicode branch can be created.

But that all happened approx. 5 months ago, and the branches are still not merged (because they are not finished yet), and it does not seem that they will be merged in the near future...

gabor
Re: UNICODE and newforms
+1 from me.

I'm German-speaking, having ümläüts all over the place. I stumbled over the following post, which refers to a few tickets whose patches proved helpful:

http://maurus.net/weblog/2006/08/08/utf8-encoded-unicode-support/

greets
Philipp

On Wed, Jan 10, 2007 at 11:56:52PM -0800, Ville Säävuori wrote:
> > I see several ways how to solve this problem:
> >
> > 1) migrate django to unicode aka big bang approach
>
> +1 from me.
>
> I'm also working with Django with a language that has many "funky
> characters". I, too, am not very experienced with Python and unicode,
> but it's still a major pain sometimes. It would be _very_ nice to have
> unicode throughout the framework.
Re: UNICODE and newforms
> I see several ways how to solve this problem:
>
> 1) migrate django to unicode aka big bang approach

+1 from me.

I'm also working with Django with a language that has many "funky characters". I, too, am not very experienced with Python and unicode, but it's still a major pain sometimes. It would be _very_ nice to have unicode throughout the framework.
UNICODE and newforms
Hi there,

disclaimer: I am not very familiar with Python's unicode implementation, and it may be that I am overreacting, but solving one unicode issue after another for the past week, when I desperately need to finish my project, hasn't exactly improved my patience. So if anybody sees a solution I missed, PLEASE let me know.

I have been playing around with newforms for some time. Since I don't come from an English-speaking country, I also put the unicode stuff to the test.

It has bitten me on several occasions. I agree that I am not used to Python unicode strings, but that doesn't make it a smaller problem: I was bitten by unicode in views, the backend, form definitions and even templates. For example, try:

    {{ form.field.errors|join:"" }}

when using non-ASCII characters...

Mixing unicode and bytestrings simply doesn't work well, and I doubt it will ever work 100%.

Please do not take me wrong: I still love newforms, and I think it's the right way to go.

I see several ways to solve this problem:

1) migrate django to unicode aka big bang approach
   + cleanest
   - most work
   - may break some code

2) migrate newforms to normal byte strings
   + doesn't break anything
   - waste of good work
   - not elegant (never give in, never surrender ;) )

3) update the rest of django to correctly detect and de/encode the strings using settings.DEFAULT_ENCODING where needed
   - lot of work that will be thrown away later
   - messy

I am willing to help.

thanks for your comments/experiences

--
Honza Král
E-Mail: [EMAIL PROTECTED]
ICQ#: 107471613
Phone: +420 606 678585
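The `join` failure mentioned in the message comes down to joining a list that mixes byte strings with unicode. A rough sketch of the kind of fix option 3 implies is shown below: decode any byte strings with a configured encoding before joining. `DEFAULT_ENCODING` here is a plain module constant standing in for the setting named in the message, `join_errors` is a hypothetical helper rather than Django's template filter, and the error messages are made up. The code is Python 3, where mixing the two types raises a `TypeError` instead of Python 2's ASCII decoding error.

```python
# Python 3 sketch; DEFAULT_ENCODING and join_errors are hypothetical.

DEFAULT_ENCODING = 'utf-8'   # stand-in for the setting named in option 3


def join_errors(errors, separator=''):
    """Join error messages, decoding any byte strings to text first."""
    texts = [
        error.decode(DEFAULT_ENCODING) if isinstance(error, bytes) else error
        for error in errors
    ]
    return separator.join(texts)


# One message is text, the other a UTF-8 byte string (made-up examples).
errors = ['Pole je povinn\u00e9.', 'Zadejte \u010d\u00edslo.'.encode('utf-8')]

try:
    ' '.join(errors)              # naive join of mixed types fails
except TypeError as exc:
    print('naive join failed:', exc)

print(join_errors(errors, ' '))
```

Every filter and helper that touches strings would need the same treatment, which is why the message calls this option messy compared with moving the whole framework to unicode.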