> > Since you mentioned SimpleXML, Kyle, I assume you're using PHP?
Actually, I'm using Perl. For reasons not related to XML parsing, it is the preferred (but not mandatory) language.

Based on a few tests and manual inspection, it looks like the ticket for me is going to be a two-stage process, where the first stage converts the file to valid XML and the second cuts through it with SAX. Originally, I was trying to avoid SAX, but the process has been prettier than expected so far.

The XML has not been prettier than expected -- it contains a number of issues, including outright invalid XML, invalid characters, and hand-coded HTML within some elements (i.e. string data not encoded as such).

Gotta love library data. But screwed up stuff is employment security. If things actually worked, I'd be redundant...

kyle
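
For anyone following the thread, here is a rough sketch of the two-stage shape described above: clean the text first, then stream it through SAX. It is written in Python rather than Perl purely for illustration, and it is not the actual code in use. The element name "title", the helper names (clean, TitleCollector, extract_titles), and the sample record are all made up. The clean-up stage only handles two of the simpler problems (disallowed control characters and bare ampersands); the hand-coded HTML inside elements would need its own treatment.

```python
import re
import xml.sax

# Control characters that XML 1.0 does not allow (everything below 0x20
# except tab, line feed, and carriage return).
INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# An ampersand that does not begin an entity or character reference.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def clean(raw):
    """Stage 1: repair two simple problems so a parser will accept the text."""
    text = INVALID_XML_CHARS.sub("", raw)
    return BARE_AMPERSAND.sub("&amp;", text)


class TitleCollector(xml.sax.ContentHandler):
    """Stage 2: SAX handler that gathers the text of every <title> element.

    "title" is a hypothetical element name chosen for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # None while we are outside a <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title" and self._buffer is not None:
            self.titles.append("".join(self._buffer))
            self._buffer = None


def extract_titles(raw):
    """Run both stages and return the collected title strings."""
    handler = TitleCollector()
    xml.sax.parseString(clean(raw).encode("utf-8"), handler)
    return handler.titles


if __name__ == "__main__":
    sample = (
        "<records><record><title>Cats \x07& Dogs</title></record>"
        "<record><title>R&amp;D</title></record></records>"
    )
    print(extract_titles(sample))
```

Keeping the clean-up separate from the parsing means each newly discovered kind of breakage only touches stage 1, and the SAX handler can assume well-formed input. Note that the ampersand pattern leaves anything that looks like a named reference alone, so HTML-only names such as &nbsp; would still need handling before the parser sees them.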