On 2014-01-05 08:02, eryksun wrote:
On Sun, Jan 5, 2014 at 2:57 AM, Alex Kleider <aklei...@sonic.net> wrote:
def ip_info(ip_address):

    response =  urllib2.urlopen(url_format_str %\
                                   (ip_address, ))
    encoding = response.headers.getparam('charset')
    print "'encoding' is '%s'." % (encoding, )
    info = unicode(response.read().decode(encoding))

decode() already returns a unicode object, so the unicode() call around it is redundant.

    n = info.find('\n')
    print "location of first newline is %s." % (n, )
    xml = info[n+1:]
    print "'xml' is '%s'." % (xml, )

    tree = ET.fromstring(xml)
    root = tree.getroot()   # Here's where it blows up!!!
    print "'root' is '%s', with the following children:" % (root, )
    for child in root:
        print child.tag, child.attrib
    print "END of CHILDREN"
    return info

Danny walked you through the XML. Note that he didn't decode the
response. It includes an encoding on the first line:

    <?xml version="1.0" encoding="ISO-8859-1" ?>

Leave it to ElementTree. Here's something to get you started:

    import urllib2
    import xml.etree.ElementTree as ET
    import collections

    url_format_str = 'http://api.hostip.info/?ip=%s&position=true'
    GML = 'http://www.opengis.net/gml'
    IPInfo = collections.namedtuple('IPInfo', '''
        ip
        city
        country
        latitude
        longitude
    ''')

    def ip_info(ip_address):
        response = urllib2.urlopen(url_format_str % ip_address)
        tree = ET.fromstring(response.read())
        hostip = tree.find('{%s}featureMember/Hostip' % GML)
        ip = hostip.find('ip').text
        city = hostip.find('{%s}name' % GML).text
        country = hostip.find('countryName').text
        coord = hostip.find('.//{%s}coordinates' % GML).text
        lon, lat = coord.split(',')
        return IPInfo(ip, city, country, lat, lon)

    >>> info = ip_info('')
    >>> info.ip
    >>> info.city, info.country
    (u'Bogot\xe1', 'COLOMBIA')
    >>> info.latitude, info.longitude
    ('10.4', '-75.2833')
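
To illustrate why the raw response can be handed straight to ElementTree, here is a minimal sketch. The sample bytes are made up for the illustration (they only mimic the declaration line of the real response), and the snippet is written to run on Python 2.7 as well as Python 3. Given bytes, the parser reads the encoding from the XML declaration and returns decoded text:

```python
import xml.etree.ElementTree as ET

# Made-up sample in the style of the service's reply, encoded the way
# the declaration on its first line says it is.
raw = (u'<?xml version="1.0" encoding="ISO-8859-1" ?>\n'
       u'<Hostip><name>Bogot\xe1</name></Hostip>').encode('iso-8859-1')

# No manual decode(): the parser picks up ISO-8859-1 from the declaration.
root = ET.fromstring(raw)
city = root.find('name').text
print(repr(city))
```

Decoding by hand first and then passing text that still carries the declaration is what gets the two steps out of sync.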

This assumes everything works perfectly. You have to decide how to fail
gracefully for the service being unavailable or malformed XML
(incomplete or corrupted response, etc).
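
One possible way to handle the malformed-XML half of that is sketched below. The helper name parse_city and the choice to return None are assumptions made for the illustration, not part of the code above. The network half is not shown; the urllib2.urlopen call would need its own try/except around urllib2.URLError.

```python
import xml.etree.ElementTree as ET

GML = 'http://www.opengis.net/gml'

def parse_city(data):
    """Hypothetical helper: return the city name, or None if the
    response cannot be parsed or lacks the expected elements."""
    try:
        tree = ET.fromstring(data)
    except ET.ParseError:
        # Incomplete or corrupted response.
        return None
    node = tree.find('{%s}featureMember/Hostip/{%s}name' % (GML, GML))
    if node is None:
        # Well-formed XML, but not the structure we expected.
        return None
    return node.text
```

Returning None keeps the caller simple. Raising a custom exception would be just as reasonable if the caller needs to tell the two failure modes apart.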

Thanks again for the input.
You're using some ET syntax there that would probably make my code much more readable but will require a bit more study on my part.
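
For anyone else studying that syntax, here is a small sketch of how the '{namespace}tag' form works. The miniature document is invented for the illustration; only the GML namespace URI comes from the code above. ElementTree stores a namespaced tag as '{uri}localname', so the search paths have to spell it the same way, while tags without a prefix are matched by their plain name:

```python
import xml.etree.ElementTree as ET

GML = 'http://www.opengis.net/gml'

# Invented miniature document, not the real response.
doc = b'''<Result xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <Hostip>
      <gml:name>Sample City</gml:name>
      <countryName>SAMPLE</countryName>
    </Hostip>
  </gml:featureMember>
</Result>'''

root = ET.fromstring(doc)
print(root[0].tag)    # the prefix gml: has been expanded to {uri}

# A path is a series of steps separated by '/'.
hostip = root.find('{%s}featureMember/Hostip' % GML)
city = hostip.find('{%s}name' % GML).text
country = hostip.find('countryName').text

# './/' searches all descendants rather than direct children only.
deep = root.find('.//countryName').text
```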

I was up all night trying to get this sorted out and was finally successful.
(Re-) Reading 'joelonsoftware' and some of the Python docs helped.
Here's what I came up with (still needs modification to return a dictionary, but that'll be trivial.)

alex@x301:~/Python/Parse$ cat ip_xml.py
#!/usr/bin/env python
# vim: set fileencoding=utf-8 :
# -*- coding : utf-8 -*-
# file: 'ip_xml.py'

import urllib2
import xml.etree.ElementTree as ET

url_format_str = \

def ip_info(ip_address):
    response =  urllib2.urlopen(url_format_str %\
                                   (ip_address, ))
    encoding = response.headers.getparam('charset')
    info = response.read().decode(encoding)
    # <info> comes in as <type 'unicode'>.
    n = info.find('\n')
    xml = info[n+1:]  # Get rid of a header line.
    # root = ET.fromstring(xml) # This causes error:
    # UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1'
    # in position 456: ordinal not in range(128)
    root = ET.fromstring(xml.encode("utf-8"))
    # This is the part I still don't fully understand but would
    # probably have to look at the library source to do so.
    info = []
    for i in range(4):

    return info

if __name__ == "__main__":
    info = ip_info("")
    print info
    print info[1]

alex@x301:~/Python/Parse$ ./ip_xml.py
['', u'Bogot\xe1', 'COLOMBIA', 'CO', '-75.2833,10.4']
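
On the part I said I didn't fully understand, my best guess (an assumption, I have not read the library source) is this: the Python 2 parser wants a byte string, so a unicode object gets implicitly encoded as ASCII, which fails on u'\xe1'. With the declaration line stripped, the XML carries no encoding information and is treated as UTF-8 by default, which is why encoding to UTF-8 works. A small sketch with made-up sample text, runnable on Python 2.7 and 3:

```python
import xml.etree.ElementTree as ET

# Made-up sample standing in for the decoded response.
text = (u'<?xml version="1.0" encoding="ISO-8859-1" ?>\n'
        u'<city>Bogot\xe1</city>')

# Drop the declaration line, as ip_xml.py does.
body = text[text.find(u'\n') + 1:]

# No declaration left, so the parser assumes UTF-8.
name = ET.fromstring(body.encode('utf-8')).text

# Keeping the declaration means the bytes must match what it declares.
same = ET.fromstring(text.encode('iso-8859-1')).text
print(name == same)
```

Either route gives the same result, but passing the undecoded response.read() straight to ET.fromstring avoids both the decode and the re-encode.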

Thanks to all who helped.
Tutor maillist  -  Tutor@python.org