Hi -

I'm scraping some information from a website, but I'm having trouble
with special characters such as é. I'm using BeautifulSoup for the
scraping, and would like Django to display my strings correctly
(in templates, in the shell, and in the admin).

The way I go about it is:

>>> from BeautifulSoup import BeautifulSoup, BeautifulStoneSoup
>>> html = "André goes to town"
>>> soup = BeautifulSoup(html)
>>> soup
Andr‚ goes to town
>>> soup = BeautifulSoup(html, convertEntities=BeautifulStoneSoup.HTML_ENTITIES)
>>> soup
Traceback (most recent call last):
  File "<console>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201a' in position 4: ordinal not in range(128)
>>> soup.contents
[u'Andr\u201a goes to town']
>>> soup.contents[0]
u'Andr\u201a goes to town'

>>> from myapp.events.models import Event
>>> e = Event(title=soup.contents[0])
>>> e.save()
>>> e.title
u'Andr\u201a goes to town'

But, as you can see, the é is never recovered: it comes out as
u'\u201a' (a low quotation mark) rather than u'\xe9'. What steps
should I take to make sure my strings are saved (and later
displayed) correctly?
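
One possible explanation, offered as a guess rather than a confirmed
diagnosis: if the shell runs in a Windows console using code page 850
(an assumption, nothing above states it), then é is entered as the byte
0x82, and reading that byte as windows-1252 gives U+201A, which matches
the output above. The sketch below only reproduces that byte-level
mismatch in plain Python 3. It does not use BeautifulSoup or Django, and
the variable names are illustrative.

```python
# Assumption: the console encodes typed input as cp850
# (common on Western European Windows consoles).
text = "André goes to town"
raw = text.encode("cp850")       # é becomes the single byte 0x82

# Decoding those bytes with the wrong codec turns 0x82 into U+201A.
wrong = raw.decode("cp1252")

# Decoding with the codec that actually produced the bytes restores é.
right = raw.decode("cp850")

print(ascii(wrong))
print(ascii(right))
```

If that is what is happening, telling BeautifulSoup the real source
encoding (BeautifulSoup 3 accepts a fromEncoding argument), or decoding
the bytes to unicode before parsing, should give u'Andr\xe9 goes to town'.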

Many thanks in advance,

Mathieu
