On Fri, 21 Feb 2003 14:16:11 -0000, "Vincent O' Keeffe" <[EMAIL PROTECTED]> wrote:
> Hi there,
>
> I'm downloading xml files from a mail server using POP3Client. I'm saving these
> mails as files on a directory. I also parse out certain details from the XML file to
> include in the filename.
>
> The only problem is that, when I grab the body of the message using POP3Client's
> method, it includes mail-related information above and below the actual XML tags.
>
> ------=_Part_15_1895070.1044374870502
> Content-Type: text/plain
> Content-Transfer-Encoding: 7bit
>
> <?xml version="1.0"?>
> ...
>
> </Order>
> ------=_Part_15_1895070.1044374870502--
>
>
> So, I need to remove everything before the opening <?xml string and, again,
> everything after the closing </Order> tag. I thought about stripping out the first 4
> and last 4 lines of the file but the messages sometimes arrive clean, and sometimes
> with this extra info.
>
> I've trawled newsgroups and the web and haven't been able to come up with any
> answers.
>
> Does anyone have any idea of how to go about this if I assign the body to a variable
> like so?
>
> $msgbody = $pop->Body($i) # $pop being the instantiated POP connection object
>

The extra lines are MIME multipart boundaries and headers. So you could do the typical thing: check the Content-Type of the main message; if it is multipart, strip the boundary and keep only what sits between the boundaries after the first blank line; for non-multiparts, do your regular parsing... ick.

You might look into the various MIME modules, consider a POP3 client module that handles multiparts, or check whether POP3Client will do so itself. One set of modules that does is Mail::Box, but it is very complex and thorough, so it may be overkill for you.

The other option is to scrap the mail handling completely and just look for the start of the XML, since it is *supposed* to be proper.
In other words, you should be able to look for a doctype or the first element to know where to start processing, and once you have the name of that first element, its closing tag tells you where to stop.

http://danconia.org
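For what it's worth, here is a rough sketch of that second option (skip the mail handling and cut out the XML directly). The sub name extract_xml is made up for illustration, and it assumes the body is available as a single string, the root element has an explicit closing tag, and no comment ahead of the root element contains something that looks like a tag. It does no MIME decoding, so it would not help with base64 or quoted-printable parts.

```perl
use strict;
use warnings;

# Hypothetical helper: return the XML document embedded in a message body
# (given as one string), or undef if no root element can be found.
sub extract_xml {
    my ($body) = @_;

    # Start at the XML declaration if there is one, otherwise at the
    # first thing that looks like an element tag.
    my $start = index $body, '<?xml';
    if ($start < 0) {
        return unless $body =~ /<[A-Za-z_]/;
        $start = $-[0];    # offset where that match began
    }
    my $xml = substr $body, $start;

    # The root element is the first tag that starts with a name character,
    # which skips over <?xml ...?> and <!DOCTYPE ...>.
    my ($root) = $xml =~ /<([A-Za-z_][\w:.-]*)/ or return;

    # Stop right after the last closing tag of the root element.
    my $close = "</$root>";
    my $end   = rindex $xml, $close;
    return if $end < 0;

    return substr $xml, 0, $end + length $close;
}
```

With the variable from the original question this would be called as `my $xml = extract_xml($msgbody);`. Messages that arrive clean pass through unchanged apart from trailing whitespace, since nothing precedes the first tag or follows the closing one.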