Mike wrote:

> Hi, I am using Python to scrape web pages and I do not have a problem 
> unless I run into a site that is utf-8.  It seems & is changed to 
> &amp; when the site is utf-8.
>       [...]

> Any ideas?

How about using the universal feedparser from feedparser.org to fetch 
and parse the RSS from Reuters?  That's what I do and it works like a 
charm:


>>> import feedparser
>>> rss = feedparser.parse('http://today.reuters.com/rss/topNews')
>>> for what in ('link', 'title', 'summary'):
...     print(rss.entries[0][what])
...     print()
...

Top court seems closely divided on suicide law

During arguments, the justices sharply questioned both sides on whether 
then-Attorney General John Ashcroft had the power under federal law in 2001 to 
bar distribution of controlled drugs to assist suicides, regardless of state 
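
If pulling in feedparser is not an option, the doubled-up escaping 
described in the question can also be undone by hand.  What follows is 
only a minimal sketch, assuming Python 3 and its standard html module; 
the helper name unescape_fully and the sample string are made up for 
illustration, and this is not a description of what feedparser does 
internally.

```python
import html


def unescape_fully(text, max_rounds=5):
    """Decode HTML entities repeatedly until the text stops changing.

    Hypothetical helper: max_rounds is an arbitrary safety limit.
    """
    for _ in range(max_rounds):
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
    return text


# Made-up input in which the ampersand has been escaped twice.
sample = "Tom &amp;amp; Jerry"
print(unescape_fully(sample))
```

Note that decoding more than once is only safe when the text really was 
escaped more than once; a page that legitimately discusses entities 
would be altered too.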



Klaus Alexander Seistrup
Magnetic Ink, Copenhagen, Denmark
