Thanks for the responses. Yes, the XML file is valid; it's just that some of the tags have HTML embedded within them. For example:

<entry><p>This is text.</p></entry>

So Digester sees this as entry/p rather than just entry.

I imagine I could just download the XML documents and, knowing the structure, search for the entry fields and cut out the text, then store that separately. I was just hoping there was a way to list tags to ignore, for example <p>, <br>, etc.

Thanks anyway,

On 7/27/06, rjn <[EMAIL PROTECTED]> wrote:
Hi Everyone,

I'm trying to write a syndication feed parser using Digester, but I'm running into a stumbling block. Many feeds have HTML in the entries, such as <a>, <br>, etc. Digester tries to parse these as XML tags, which leads to blanks in the data I pull out. I was wondering if there was a way to set Digester to ignore specific tags (in this case, the HTML tags)?

Thanks,
RJ
-- em: [EMAIL PROTECTED]
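For the workaround mentioned in the reply (finding the entry fields and cutting out the text), here is a minimal sketch that uses the JDK's own SAX parser rather than Digester. It is not a Digester feature and not necessarily how the poster solved it. The class name EntryTextExtractor and the method extract are made up for illustration. It assumes the feed is well-formed XML, so an unclosed <br> would still fail to parse, and that the element of interest is literally named entry with no namespace prefix.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the text of each <entry>, ignoring any tags nested inside it.
class EntryTextExtractor extends DefaultHandler {
    private final List<String> entries = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private int depth = 0; // 0 = outside <entry>; >0 = nesting level inside it

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth > 0) {
            depth++;                 // nested markup such as <p> or <br/>: skip the tag itself
        } else if ("entry".equals(qName)) {
            depth = 1;               // entering an entry: start a fresh buffer
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) {
            text.append(ch, start, length); // keep text at any depth inside the entry
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth == 0) {
            return;                  // element outside any entry
        }
        depth--;
        if (depth == 0) {
            entries.add(text.toString().trim()); // the entry itself just closed
        }
    }

    // Parses the XML string and returns one flattened text value per <entry>.
    static List<String> extract(String xml) throws Exception {
        EntryTextExtractor handler = new EntryTextExtractor();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.entries;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed><entry><p>This is text.</p></entry></feed>";
        System.out.println(extract(xml));
    }
}
```

Note that the nested tags are dropped without inserting any whitespace, so text on either side of a <br/> ends up joined together. If the line breaks matter, appending a space or newline in startElement would be one way to handle it.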