Abdera chokes parsing content with type of html

Dan Beaulieu Fri, 26 Mar 2010 10:10:43 -0700

Hi all, I am evaluating a few Java Atom parsers for a project. I am trying to 
parse a sample, seen here -> http://pastebin.org/124779 , that is pulled from 
the WordPress stream. As you can see, the content tag has a type attribute with 
the value html, but the HTML isn't encoded. Abdera doesn't like this. It fails 
with this error:

 

com.ctc.wstx.exc.WstxParsingException: Unexpected close tag </content>; expected </BR>.
 at [row,col {unknown-source}]: [24,528]

 

Is there any way to make Abdera lenient when it comes to invalid XML? While I 
appreciate standards, I am in no position to change the WordPress stream. 
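
To make the problem concrete without depending on Abdera, here is a rough 
sketch using only the JDK's own XML parser. It shows that a raw <BR> inside 
content with type html is rejected as malformed, and that wrapping the body in 
CDATA before parsing gets past that. The class name, the helper names, and the 
tiny sample entry are made up for illustration (they are not from the pastebin 
sample), and the regex approach assumes the feed has no nested content 
elements. Whether the same pre-processing is the best route with Abdera is an 
open question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of Abdera or WordPress.
class HtmlContentWorkaround {

    // Captures the opening <content type="html"> tag, its body, and the closing tag.
    private static final Pattern HTML_CONTENT = Pattern.compile(
        "(<content\\b[^>]*\\btype=[\"']html[\"'][^>]*>)(.*?)(</content>)",
        Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Wraps the body in CDATA, but only when it contains raw markup.
    // Bodies that are already escaped (&lt;p&gt;) or already CDATA are left alone.
    static String wrapHtmlContentInCdata(String xml) {
        Matcher m = HTML_CONTENT.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(2);
            boolean alreadyCdata = body.trim().startsWith("<![CDATA[");
            if (!alreadyCdata && body.contains("<")) {
                // Split any literal "]]>" so it cannot terminate the CDATA section early.
                body = "<![CDATA[" + body.replace("]]>", "]]]]><![CDATA[>") + "]]>";
            }
            m.appendReplacement(out,
                Matcher.quoteReplacement(m.group(1) + body + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns true if the JDK parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Made-up sample with the same defect: unescaped <BR> inside html content.
        String broken = "<entry xmlns=\"http://www.w3.org/2005/Atom\">"
            + "<title>t</title>"
            + "<content type=\"html\">Line one<BR>line two</content>"
            + "</entry>";
        String fixed = wrapHtmlContentInCdata(broken);
        System.out.println("raw well-formed:     " + isWellFormed(broken));
        System.out.println("wrapped well-formed: " + isWellFormed(fixed));
    }
}
```

The wrapped string could then be handed to the Atom parser as a stream in place 
of the original bytes. The catch is that this rewrites the feed text before 
parsing, so it only helps when the raw markup sits inside content elements 
that are explicitly marked type html.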

 

To replicate with a simple test, here is all I am doing:

 

// Create the Abdera instance (abdera) and an InputStream (is) over the sample above.
Document<Entry> doc = abdera.getParser().parse(is);
Entry entry = doc.getRoot();
System.out.println(entry.getContent()); // <-- It fails here.
