I'm dealing with XML files in which there are lots of tags of the following form:

    <a><b>x</b><c>y</c></a>

(All of these letters are being used as 'metalinguistic variables'.) Not all of the tags in the file are of that form, but that's the only type of tag I'm interested in. (For the insatiably curious, I'm talking about a conversation log from MSN Messenger.)

What I need to do is to pull out all the x's and y's in a form I can use. In other words, from...

    .
    .
    <a><b>x1</b><c>y1</c></a>
    .
    .
    <a><b>x2</b><c>y2</c></a>
    .
    .
    <a><b>x3</b><c>y3</c></a>
    .
    .

...I would like to produce, for example,...

    [ (x1,y1), (x2,y2), (x3,y3) ]

Now, I'm aware that there are extensive libraries for dealing with marked-up text, but here's the thing: I think I have a reasonable understanding of Python, but I use it in a Lisp-like way, and in particular I only know the rudiments of how classes work.

So here's what I'm asking for: Can anybody give me a rough idea how to come to grips with the problem described above? Or even (dare to dream) example code?

Any help will be very much appreciated.

Peace,
STM
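
For illustration, here is a rough sketch of one way the extraction described above might be done with the standard library's xml.etree.ElementTree module, using only plain functions and no user-defined classes. The function name extract_pairs, the sample string, and the wrapping <log> root element are made up for this example; a, b and c are the same placeholders as in the post, not the real tag names of an MSN Messenger log.

```python
import xml.etree.ElementTree as ET


def extract_pairs(xml_text, outer="a", first="b", second="c"):
    """Return a list of (x, y) text pairs, one per matching <outer> element.

    The tag names are placeholders (assumptions); substitute the real ones.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    # iter() walks the whole tree in document order, at any nesting depth.
    for elem in root.iter(outer):
        x = elem.find(first)    # first direct child named `first`, or None
        y = elem.find(second)   # first direct child named `second`, or None
        # Skip <outer> elements that do not have both children.
        if x is not None and y is not None:
            pairs.append((x.text, y.text))
    return pairs


# Hypothetical sample input; a real log would be read from a file instead.
sample = """<log>
  <other>ignored</other>
  <a><b>x1</b><c>y1</c></a>
  <a><b>x2</b><c>y2</c></a>
  <wrapper><a><b>x3</b><c>y3</c></a></wrapper>
</log>"""

print(extract_pairs(sample))
# [('x1', 'y1'), ('x2', 'y2'), ('x3', 'y3')]
```

To read from a file rather than a string, ET.parse(filename).getroot() could replace the ET.fromstring call. Elements of other shapes are simply skipped, which matches the requirement that only the <a><b>x</b><c>y</c></a> form is of interest.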