Thanks for the responses.  Yeah, so the XML file is valid, it's just
that some of the tags have HTML embedded within them.  For example:

<entry><p>This is text.</p></entry>

So Digester sees this as:
entry/p

Rather than just entry.  I imagine I could just download the XML
documents and, knowing the structure, search for the entry fields and
then cut out the text, storing it separately.  I was just hoping there
was a way to list tags to ignore, for example <p>, <br>, etc.
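
To make the fallback concrete, here is a rough sketch of the "cut out the
text" idea using only the JDK's DOM parser rather than Digester.  It is
not a Digester feature, just a workaround, and it assumes the element
really is named "entry" and that dropping the nested markup (rather than
preserving it) is acceptable.  The class and method names are made up
for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Digester.
public class EntryText {

    // Returns the flattened text of every <entry> element, ignoring any
    // nested markup such as <p> or <br/>.
    static List<String> entryTexts(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Assumption: the feed's entries use the element name "entry".
        NodeList entries = doc.getElementsByTagName("entry");
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < entries.getLength(); i++) {
            // getTextContent() concatenates the text of all descendants,
            // so <entry><p>This is text.</p></entry> yields "This is text."
            texts.add(entries.item(i).getTextContent().trim());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed><entry><p>This is text.</p></entry></feed>";
        System.out.println(entryTexts(xml)); // prints [This is text.]
    }
}
```

The downside is that the HTML tags are lost entirely, so if the markup
itself needs to be kept, this would not be enough.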

Thanks anyway,

On 7/27/06, rjn <[EMAIL PROTECTED]> wrote:
Hi Everyone,

I'm trying to write a Syndication Feed parser using Digester, however
I'm running into a stumbling block.  Many feeds have HTML in the
entries such as <a>, <br>, etc.   Digester tries to parse these as XML
tags, thus leading to blanks in the data I pull out.  I was wondering
if there was a way to set Digester to ignore specific tags (in this
case, the HTML tags)?

Thanks,
RJ

--
em: [EMAIL PROTECTED]



