Hi,

I'm hoping some of you won't mind taking a peek at my code and sharing your
thoughts.  I just started using the elementtree module yesterday to work
with XML files.  Here's an example of some XML I might be parsing:

============================================================
<data>
        <fonts>
                <fontData embed="true" name="Times" />
                <fontData embed="true" name="Arial" />
        </fonts>
        <color>
        </color>
        <template>
                <fonts>
                        <fontData>
                                <fontData embed="true" name="Courier">text</fontData>
                        </fontData>
                        <fontData embed="true" name="Helvetica" />
                </fonts>
                <width>
                </width>
        </template>
</data>
============================================================

What I'd like to do is get the 'name' attribute from the second set of 
'fontData' tags, so I'd end up with ['Courier', 'Helvetica'].  Here are the 
first lines of code I tested:

============================================================
from xml.etree import ElementTree as ET

tree = ET.parse('/Users/jay/Desktop/test.txt')

root = tree.getroot()

fntList = []
f = root.getiterator('fonts')
n = f[-1].getiterator('fontData')
for i in n:
    i = i.get('name')
    if i != None: fntList.append(i)

print fntList
============================================================

This gives me ['Courier', 'Helvetica'], which is what I want.  If I'm 
understanding this correctly, getiterator('fonts') gets both <fonts> 
sections.  Since I only want the second section, which is the last, I look at 
f[-1] and call getiterator('fontData') on it to search through all the 
appropriate tags.  It looks like getiterator also finds nested tags, as seen 
above when it grabbed both font names I wanted.
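
For anyone who wants to reproduce this without the file on my Desktop, here is a minimal sketch of the same traversal.  It assumes the sample XML above is pasted into a string (the name XML_SAMPLE is made up for this sketch) and it uses iter() in place of getiterator(), since newer Python versions no longer have getiterator().  Treat it as one way to write it, not the definitive one.

```python
from xml.etree import ElementTree as ET

# Hypothetical stand-in for the test file: the sample XML from above.
XML_SAMPLE = """<data>
    <fonts>
        <fontData embed="true" name="Times" />
        <fontData embed="true" name="Arial" />
    </fonts>
    <color>
    </color>
    <template>
        <fonts>
            <fontData>
                <fontData embed="true" name="Courier">text</fontData>
            </fontData>
            <fontData embed="true" name="Helvetica" />
        </fonts>
        <width>
        </width>
    </template>
</data>"""

root = ET.fromstring(XML_SAMPLE)

# iter() walks the whole subtree in document order, so this finds
# every <fonts> element no matter how deeply it is nested.
fonts_sections = list(root.iter('fonts'))
last_fonts = fonts_sections[-1]

# Collect the 'name' attribute of every <fontData> under the last
# <fonts> section, skipping elements that have no 'name' attribute.
font_names = [
    el.get('name')
    for el in last_fonts.iter('fontData')
    if el.get('name') is not None
]

print(font_names)
```

The list comprehension does the same job as the for loop above, without reusing the loop variable for two different things.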

So I continued to experiment a bit and came up with this next:

============================================================
from xml.etree import ElementTree as ET

tree = ET.parse('/Users/jay/Desktop/test.txt')

root = tree.getroot()

fntList = []
f = root.getiterator('fonts')
n = f[-1].find('fontData')
for i in n:
    fntList.append(i.get('name'))

print fntList
============================================================

This code gave me just ['Courier'].  If I change 'find' to 'findall', I get 
[None, 'Helvetica'] instead.  I'm not sure exactly what that's doing.  It 
seems 'findall' only searches the tags that aren't nested, while 'find' found 
the first nested 'name'.
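
Here is a small sketch that puts the three lookups side by side.  It assumes only the second <fonts> section, pasted into a string (FONTS_XML is a made-up name), and again uses iter() in place of getiterator().  As far as I can tell, find() returns only the first direct child that matches, and looping over that element walks its children, while findall() returns every direct child that matches without descending any further.

```python
from xml.etree import ElementTree as ET

# Hypothetical stand-in: just the second <fonts> section from above.
FONTS_XML = """<fonts>
    <fontData>
        <fontData embed="true" name="Courier">text</fontData>
    </fontData>
    <fontData embed="true" name="Helvetica" />
</fonts>"""

fonts = ET.fromstring(FONTS_XML)

# find() returns the first direct child named 'fontData'.  That is the
# outer wrapper, and looping over an element yields its children.
first_child = fonts.find('fontData')
names_via_find = [child.get('name') for child in first_child]

# findall() returns all direct children named 'fontData' and does not
# descend, so the nested Courier element is never visited.
names_via_findall = [el.get('name') for el in fonts.findall('fontData')]

# iter() visits every matching element in the subtree, nested or not.
names_via_iter = [el.get('name') for el in fonts.iter('fontData')]

print(names_via_find)
print(names_via_findall)
print(names_via_iter)
```

The None comes from the outer <fontData> wrapper, which has no 'name' attribute, so get('name') returns None for it.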

Anyway, I'm hoping someone might tell me if the first example of code above is 
a decent way to parse xml files.  I'm still new to Python and am looking for 
good code structure as well as accurate examples.

--

One other question I had was about rounding floats.  I was first looking at 
this syntax to round out to 6 decimal places if needed:

>>> f = '508.5'
>>> x = '%.6f' % (float(f)/72)
>>> x
'7.062500'

However, in this instance I don't want the last 2 zeroes.  So would it be 
better to do something like this:

>>> f = '508.5'
>>> x = round(float(f)/72, 6)
>>> x
7.0625

I've been reading a bit about some rounding bugs, but am really not that 
knowledgeable about the subject.  Does anyone have a preferred way to round, 
and what results do you see?
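
To make the comparison concrete, here is a short sketch of the two approaches plus a third option I considered, which is to format to 6 places and then strip the trailing zeroes from the string.  The variable names are made up for this sketch, and the stripping step is my own assumption about what "no trailing zeroes" should mean, not something taken from the docs.

```python
f = '508.5'
value = float(f) / 72

# Fixed-width formatting: always 6 decimal places, result is a string.
fixed = '%.6f' % value

# round() returns a float, so trailing zeroes are not stored at all.
rounded = round(value, 6)

# Format first, then strip trailing zeroes (and a bare trailing dot,
# which would be left behind by a whole number like '7.000000').
trimmed = fixed.rstrip('0').rstrip('.')

print(fixed)
print(rounded)
print(trimmed)
```

One difference worth noting is that round() gives back a number while the formatting approaches give back strings, so the choice may depend on whether the value is used for more math or only for display.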

Thanks for looking at my questions.

Jay


--
http://mail.python.org/mailman/listinfo/python-list
