Re: urlencode in template gives unexpected result (for me :-)
On Thu, Aug 25, 2011 at 7:23 AM, Michel30 wrote:
> Thanks Tom that clarifies a lot, learning every day.
>
> My filesystem is ext4, encoding is irrelevant here right?
> So, I guess the best thing to do is to convert my database into utf-8
> using a method as described here:
> http://www.bothernomore.com/2008/12/16/character-encoding-hell/
>
> That way I'm consistently using utf-8.
> Would this also be backwards compatible with my legacy app? I don't
> see it using any encoding specific.
>
> Thanks,
> Michel
>

Encoding is always relevant. Your filesystem treats the filename as just a series of bytes, but what those bytes are depends on the character encoding used by the application that created the files.

I'm not sure how this will be displayed via email, but here is an example of a file created with a latin1 name, followed by an attempt to open it with the equivalent UTF-8 name:

>>> import os
>>> filename=u'£££'
>>> fp=open(filename.encode('latin1'), 'w+')
>>> fp.close()
>>> fp=open(filename.encode('utf-8'), 'r')
Traceback (most recent call last):
  File "", line 1, in
IOError: [Errno 2] No such file or directory: '\xc2\xa3\xc2\xa3\xc2\xa3'
>>> os.listdir('.')
['\xa3\xa3\xa3']

\xa3 is the encoding of the '£' symbol in latin1, \xc2\xa3 is the encoding of the same symbol in UTF-8.

Cheers

Tom

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.
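For readers on Python 3, the byte-level point in the message above can be sketched without touching the filesystem. This is only an illustration under stated assumptions: the helper name decode_filename and the choice of cp1252 as the fallback are hypothetical and do not come from the thread, and "try UTF-8 first, then fall back" is a heuristic, not a reliable detector.

```python
# Python 3 sketch; NAME and decode_filename are illustrative names only.
NAME = "£££"

latin1_bytes = NAME.encode("latin-1")  # one byte per character
utf8_bytes = NAME.encode("utf-8")      # two bytes per character for '£'


def decode_filename(raw: bytes, legacy_encoding: str = "cp1252") -> str:
    """Decode a raw filename: try UTF-8 first, then the assumed legacy encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was written by the
        # legacy application in `legacy_encoding`.
        return raw.decode(legacy_encoding)


print(latin1_bytes)
print(utf8_bytes)
# Both byte strings name the same text, yet they are different filenames on disk.
print(decode_filename(latin1_bytes) == decode_filename(utf8_bytes))
```

The two byte strings differ, which is why opening the UTF-8 form fails when the file was created with the latin1 form. Note that a few byte values are undefined in Python's cp1252 codec, so the fallback branch can still raise for unusual input.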
Re: urlencode in template gives unexpected result (for me :-)
Thanks Tom, that clarifies a lot, learning every day.

My filesystem is ext4, encoding is irrelevant here, right?
So, I guess the best thing to do is to convert my database into utf-8 using a method as described here:
http://www.bothernomore.com/2008/12/16/character-encoding-hell/

That way I'm consistently using utf-8.
Would this also be backwards compatible with my legacy app? I don't see it using anything encoding-specific.

Thanks,
Michel
Re: urlencode in template gives unexpected result (for me :-)
On Wed, Aug 24, 2011 at 2:50 PM, Michel30 wrote:
>
>
> On Aug 24, 3:22 pm, Tom Evans wrote:
>> On Wed, Aug 24, 2011 at 1:47 PM, Michel30 wrote:
>> > Hi all,
>>
>> > I have written an application using Django 1.3 , apache2 and a mysql
>> > db.
>> > I'm using the db to store filepaths and filenames for legacy purposes
>> > while serving them to users with apache.
>>
>> > Now mysql is using latin-1 (with the filenames most likely stored in
>> > CP-1252) while Django uses utf-8.
>>
>> That is not going to fly. You will likely need to ensure you have a
>> consistent character encoding across your website, database and file
>> system.
>>
>> Cheers
>>
>> Tom
>
> Tom,
>
> that looks like it would be best, yes (this is my first exposure to
> encoding problems)
>
> I cannot change the filesystem or mysql encoding since the legacy
> application is still using it. I assumed that with utf-8 I would be
> good as it covers all(?) and I understood mysql translates itself from
> latin-1 to utf-8 and vice versa.
>
> As far as I can see this only hurts my hyperlinks, more specifically
> only file.filename so wouldn't translating only these work?
>

Trusting mysql to DTRT with character encoding does not work well in my experience.

For starters, if your database is latin1, there is a huge range of Unicode characters that cannot be encoded in latin1. If your website is presented in UTF-8, as is the default for Django, then input submitted by your users will be in UTF-8 as well, and can easily contain characters that cannot be stored in the database. Many browsers will submit \u2019 (’) instead of a simple ' character, which will not fit in latin1.

When it comes to serving your files, Apache URL-decodes the request. After that it assumes nothing about the character encoding of the resulting bytes and simply opens that file system path. If your files are stored in the file system with latin1 names, the requested file name must be encoded in latin1.

So sure, you could latin1-encode each filename, and then urlencode the result.

You are opening yourself up to a world of pain by not using consistent character encodings. It will hurt you eventually.

Cheers

Tom
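The "encode each filename in the legacy encoding, then urlencode the result" suggestion above can be sketched in Python 3 with the standard library. The function name legacy_url_path is hypothetical, and cp1252 is assumed as the on-disk encoding only because the original post says the filenames are most likely stored that way. This is a sketch of the idea, not how Django's urlencode filter works internally.

```python
from urllib.parse import quote


def legacy_url_path(filename: str, encoding: str = "cp1252") -> str:
    """Percent-encode the bytes `filename` has on disk under `encoding`.

    Raises UnicodeEncodeError if the name has a character that the
    assumed legacy encoding cannot represent.
    """
    return quote(filename.encode(encoding))


name = "File\u2026.pdf"        # "File….pdf"
print(quote(name))             # escapes the UTF-8 bytes (default for str input)
print(legacy_url_path(name))   # escapes the CP-1252 bytes instead
```

With these assumptions the first call yields File%E2%80%A6.pdf and the second File%85.pdf. Only the second matches a file whose name was written to disk as CP-1252 bytes. In a Django project this logic could live in a custom template filter, which is left out here to keep the sketch self-contained.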
Re: urlencode in template gives unexpected result (for me :-)
On Aug 24, 3:22 pm, Tom Evans wrote:
> On Wed, Aug 24, 2011 at 1:47 PM, Michel30 wrote:
> > Hi all,
>
> > I have written an application using Django 1.3 , apache2 and a mysql
> > db.
> > I'm using the db to store filepaths and filenames for legacy purposes
> > while serving them to users with apache.
>
> > Now mysql is using latin-1 (with the filenames most likely stored in
> > CP-1252) while Django uses utf-8.
>
> That is not going to fly. You will likely need to ensure you have a
> consistent character encoding across your website, database and file
> system.
>
> Cheers
>
> Tom

Tom,

that looks like it would be best, yes (this is my first exposure to encoding problems).

I cannot change the filesystem or mysql encoding since the legacy application is still using it. I assumed that with utf-8 I would be good as it covers all(?) and I understood mysql translates itself from latin-1 to utf-8 and vice versa.

As far as I can see this only hurts my hyperlinks, more specifically only file.filename, so wouldn't translating only these work?
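On the assumption in the message above that latin-1 and UTF-8 convert into each other freely: the conversion is only safe in one direction. A small Python 3 check makes this visible. The helper name fits and the sample characters are illustrative choices, not taken from the thread. As far as I know, MySQL's latin1 character set corresponds to cp1252 and not to strict ISO-8859-1, so both are shown.

```python
def fits(text: str, encoding: str) -> bool:
    """Return True if every character of `text` can be encoded in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Illustrative sample characters (hypothetical selection).
samples = {
    "pound sign": "\u00a3",
    "right single quote": "\u2019",
    "ellipsis": "\u2026",
    "CJK character": "\u65e5",
}

for label, ch in samples.items():
    print(label, fits(ch, "latin-1"), fits(ch, "cp1252"), fits(ch, "utf-8"))
```

Everything fits in UTF-8. Strict latin-1 rejects the curly quote and the ellipsis, cp1252 accepts those two, and neither legacy encoding can hold the CJK character. Text that arrives as UTF-8 from a browser can therefore fail to store, or be silently altered, in a latin1 column.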
Re: urlencode in template gives unexpected result (for me :-)
On Wed, Aug 24, 2011 at 1:47 PM, Michel30 wrote:
> Hi all,
>
> I have written an application using Django 1.3 , apache2 and a mysql
> db.
> I'm using the db to store filepaths and filenames for legacy purposes
> while serving them to users with apache.
>
> Now mysql is using latin-1 (with the filenames most likely stored in
> CP-1252) while Django uses utf-8.
>

That is not going to fly. You will likely need to ensure you have a consistent character encoding across your website, database and file system.

Cheers

Tom
urlencode in template gives unexpected result (for me :-)
Hi all,

I have written an application using Django 1.3, apache2 and a mysql db.
I'm using the db to store filepaths and filenames for legacy purposes while serving them to users with apache.

Now mysql is using latin-1 (with the filenames most likely stored in CP-1252) while Django uses utf-8.

I generate the links to the files thusly in my template:

{{ file.filename }}

This works until I have a funky character, let's say File….pdf

Then my hyperlink reads:

File….pdf

While Apache throws a 404 with:

NotFound /path/File….pdf

Obviously because it expects this link:

File….pdf

Any ideas how to fix this in the template?

Thanks