On 17/09/11 13:08, lists wrote:
I have been trying to learn how to parse XML with Python and learn how
to use xml.etree. Lots of the tutorials seem to be very long winded.
I'm trying to access a UK postcode API at www.uk-postcodes.com to take
a UK postcode and return the lat/lng of the postcode. This is what the
XML looks like: http://www.uk-postcodes.com/postcode/HU11AA.xml
The function below returns a dict with the xml tag as a key and the
text as a value. Is this a correct way to use xml.etree?
Define correct. Does it give the desired result? Then I would say yes, it
is correct. There may be alternative ways to get the same result, though.
def ukpostcodesapi(postcode):
    import urllib
Why do the import here, for speed? You are reading an XML file from the
internet, guess where most of the time is spent in your function ;-).
    import xml.etree.ElementTree as etree
    baseURL='http://www.uk-postcodes.com/'
    geocodeRequest='postcode/'+postcode+'.xml'
You could use string formatting here.
url = 'http://www.uk-postcodes.com/postcode/%s.xml' % postcode
Also, what would happen if the postcode includes a space?
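On the space question, here is a minimal sketch of one way to deal with it. It assumes Python 3, where quote() lives in urllib.parse (in Python 2 it is urllib.quote). The helper name build_postcode_url is made up for illustration, and stripping the space rather than encoding it is an assumption based on the HU11AA form used in the URL above.

```python
from urllib.parse import quote

BASE_URL = 'http://www.uk-postcodes.com/postcode/%s.xml'

def build_postcode_url(postcode):
    # Hypothetical helper: drop all whitespace and upper-case the postcode,
    # so 'hu1 1aa' becomes 'HU11AA'
    compact = ''.join(postcode.split()).upper()
    # quote() percent-encodes anything still unsafe in a URL path;
    # safe='' means even '/' gets encoded
    return BASE_URL % quote(compact, safe='')

print(build_postcode_url('hu1 1aa'))
# http://www.uk-postcodes.com/postcode/HU11AA.xml
```

Without something like this, a raw space ends up in the URL and the request may fail or hit the wrong resource.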
    #grab the xml
    tree=etree.parse(urllib.urlopen(baseURL+geocodeRequest))
What happens if you get an error (a 404 error perhaps)? You might want
to add a try/except block around reading the xml from the internet.
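A rough sketch of what such a try/except could look like, assuming Python 3 (urllib.request and urllib.error replace urllib and urllib2 there). The function name fetch_tree and its opener parameter are made up for illustration; the parameter is only there so the function can be tried without touching the network.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

def fetch_tree(url, opener=urllib.request.urlopen):
    """Return the root element, or None if fetching or parsing fails."""
    try:
        response = opener(url)
        try:
            return ET.parse(response).getroot()
        finally:
            response.close()
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        # HTTPError is a subclass of URLError, so it must be caught first.
        print('HTTP error %s for %s' % (err.code, url))
    except urllib.error.URLError as err:
        # No usable answer at all: DNS failure, refused connection, ...
        print('Could not reach %s: %s' % (url, err.reason))
    except ET.ParseError as err:
        # We got a response, but it was not well-formed XML
        print('Bad XML from %s: %s' % (url, err))
    return None
```

Returning None keeps the sketch short; depending on the caller you might prefer to let the exception propagate instead.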
    root=tree.getroot()
    results={}
    for child in root[1]: #here's the geo tag
        results.update({child.tag:child.text}) #build a dict containing the geocode data
    return results
As you only get one set of lat/lng tags in the XML, you could use find().
See the example below.
from xml.etree import ElementTree as ET
import urllib2
url = 'http://www.uk-postcodes.com/postcode/HU11AA.xml'
xml = urllib2.urlopen(url).read()
tree = ET.XML(xml)
geo = {}
for leaf in tree.find('geo'):
    geo[leaf.tag] = leaf.text
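If you only need the two coordinates, findtext() with a path gets you there without a loop. The sketch below parses a made-up string instead of the live URL so it runs offline. The element names follow what is described in this thread (a geo tag with lat and lng inside), but the exact layout and the values are assumptions, not the real API response.

```python
from xml.etree import ElementTree as ET

# Made-up stand-in for the API response; the real layout may differ
SAMPLE = """<result>
  <postcode>HU1 1AA</postcode>
  <geo>
    <lat>53.7</lat>
    <lng>-0.3</lng>
  </geo>
</result>"""

tree = ET.XML(SAMPLE)

# findtext() returns the element's text, or None if the path is not found
lat = tree.findtext('geo/lat')
lng = tree.findtext('geo/lng')

# find() returns None when the tag is absent, so check before looping
geo = {}
geo_node = tree.find('geo')
if geo_node is not None:
    for leaf in geo_node:
        geo[leaf.tag] = leaf.text

print(lat, lng, geo)
```

Note that the values come back as strings, so wrap them in float() if you want to do arithmetic with them.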
Greets
Sander
_______________________________________________
Tutor maillist - Tutor@python.org