On Jan 21, 11:25 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote:
> On Mon, 21 Jan 2008 18:38:48 -0200, Arne <[EMAIL PROTECTED]> wrote:
>
> > On 21 Jan, 19:15, Bruno Desthuilliers <bruno.[EMAIL PROTECTED]> wrote:
> >
> >> This should not prevent you from learning how to properly parse XML
> >> (hint: with an XML parser). XML is *not* a line-oriented format, so you
> >> just can't get anywhere trying to parse it this way.
> >>
> >> HTH
> >
> > Do you think I should use xml.dom.minidom for this? I've never used
> > it, and I don't know how to use it, but I've heard it's useful.
> >
> > So, I shouldn't use this techinicke (probably wrong spelled) trying to
> > parse XML? Should I rather use minidom?
> >
> > Thank you for answering, I've learnt a lot from both of you,
> > Desthuilliers and Genellina! :)
>
> Try ElementTree instead; there is an implementation included with Python
> 2.5, documentation at http://effbot.org/zone/element.htm and another
> implementation available at http://codespeak.net/lxml/
>
> import xml.etree.cElementTree as ET
> import urllib2
>
> rssurl = 'http://www.jabber.org/news/rss.xml'
> rssdata = urllib2.urlopen(rssurl).read()
> rssdata = rssdata.replace('&', '&amp;') # ouch!
>
> tree = ET.fromstring(rssdata)
> for item in tree.getiterator('item'):
>     print item.find('link').text
>     print item.find('title').text
>     print item.find('description').text
>     print
>
> Note that this particular RSS feed is NOT a well formed XML document - I
> had to replace the & with &amp; to make the parser happy.
>
> --
> Gabriel Genellina

This looks very interesting! But it seems that no document is well-formed! I've tried several RSS feeds, but they all fail with either "undefined entity" or "not well-formed". This is not how it should be, right? :)
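
To see where the parser gives up on a feed, something like the following might help. It is only a rough sketch, not the approach from the quoted message: it assumes a newer Python in which ElementTree raises ET.ParseError (Python 2.5 raised an expat error instead), the helper names escape_bare_ampersands and try_parse are made up for illustration, and the sample string is invented rather than taken from a real feed. Instead of replacing every '&', it escapes only the ones that do not already start an entity or character reference.

```python
import re
import xml.etree.ElementTree as ET

# Matches an '&' that does NOT start something shaped like an entity or
# character reference (&name; or &#38; or &#x26;).
BARE_AMP = re.compile(r'&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)')


def escape_bare_ampersands(text):
    """Replace only stray '&' characters with '&amp;' (hypothetical helper)."""
    return BARE_AMP.sub('&amp;', text)


def try_parse(text):
    """Return (root, None) on success, or (None, message) if parsing fails."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the failure.
        line, column = err.position
        return None, 'line %d, column %d: %s' % (line, column, err)


# Invented sample data with a stray '&' in the title.
sample = '<rss><channel><item><title>Q & A</title></item></channel></rss>'

root, problem = try_parse(sample)
print(problem)  # reports where the parser stopped

root, problem = try_parse(escape_bare_ampersands(sample))
print(root.find('channel/item/title').text)
```

This only covers the stray-ampersand case. A feed that uses an HTML entity such as &nbsp; without declaring it is left unchanged by the helper and would still be rejected with "undefined entity", so that kind of feed needs a different fix.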