I was making the output available as both a Node and a String.

It defaults to a Node.

So, to get a String: <wellFormedHtml dataObjectType="String.class"/>
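
As a rough illustration of the two output types (this is not the Camel code itself): a DOM Node can be wrapped in a DOMSource, so it can go wherever a Source is accepted, and it can be serialized to a String with the JDK's identity Transformer. The class and method names below are made up for the example, and the document is built by hand to stand in for what TagSoup would produce.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class NodeOutputSketch {

    // Hypothetical helper: serialize a DOM Node to a String using the
    // JDK identity transformer (no stylesheet involved).
    static String nodeToString(Node node) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Force plain XML output without declaration or indentation.
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.INDENT, "no");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            // DOMSource is a javax.xml.transform.Source wrapping the Node.
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hand-built stand-in for the well-formed tree TagSoup would return.
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element p = doc.createElement("p");
            p.setTextContent("something");
            body.appendChild(p);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nodeToString(sampleDocument()));
    }
}
```

Keeping Node as the default means downstream XSLT or XPath steps can use the tree directly, and the String form is only produced when asked for.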


@XmlRootElement(name = "wellFormedHtml")
@XmlAccessorType(XmlAccessType.FIELD)
public class WellFormedHtmlDataFormat extends DataFormatType {

    // Type of object the data format returns: String or org.w3c.dom.Node
    @XmlAttribute(required = false)
    private Class<?> dataObjectType;

    public WellFormedHtmlDataFormat(Class<?> dataObjectType) {
        super("org.apache.camel.dataformat.tagsoup.WellFormedHtmlDataFormat");
        assert dataObjectType.isAssignableFrom(String.class)
                || dataObjectType.isAssignableFrom(Node.class)
                : "WellFormedHtmlDataFormat only supports returning a String or an "
                + "org.w3c.dom.Node object";
        this.dataObjectType = dataObjectType;
    }
}
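
A side note on the assert in the constructor above: Class.isAssignableFrom is true when the receiver is the same type as, or a supertype of, the argument, so that check also lets Object.class through and rejects Node subtypes such as Document.class. Also, assert statements only run when the JVM is started with -ea. The sketch below is standalone (plain JDK, hypothetical class and method names) and compares the check as written with a stricter exact-match variant, which is only one possible alternative.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class TypeCheckSketch {

    // Same condition as the assert in the constructor.
    static boolean acceptedByAssert(Class<?> type) {
        return type.isAssignableFrom(String.class)
                || type.isAssignableFrom(Node.class);
    }

    // Hypothetical stricter variant: only the two exact types pass.
    static boolean isExactlyStringOrNode(Class<?> type) {
        return type == String.class || type == Node.class;
    }

    public static void main(String[] args) {
        System.out.println(acceptedByAssert(String.class));      // true
        System.out.println(acceptedByAssert(Node.class));        // true
        System.out.println(acceptedByAssert(Object.class));      // true (supertype of String)
        System.out.println(acceptedByAssert(Document.class));    // false (subtype of Node)
        System.out.println(isExactlyStringOrNode(Object.class)); // false
        System.out.println(isExactlyStringOrNode(Node.class));   // true
    }
}
```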


On Wed, Dec 10, 2008 at 21:12, James Strachan <[EMAIL PROTECTED]> wrote:

> 2008/12/10 Ramon Buckland <[EMAIL PROTECTED]>:
> > Hi Peoples,
> >
> > I have just about finished the proof of concept of using TagSoup as a
> > DataFormat and as a component.
> >
> > For those not familiar with TagSoup, it is a Java library (Apache 2.0
> > License) which converts poorly formatted HTML
> >
> > <html> <p> something
> >
> > into well-formed (XML) HTML (not XHTML).
> >
> > i.e.:
> >
> > <html>
> >    <body>
> >            <p>something</p>
> >    </body>
> > </html>
> >
> > This is very helpful for the following reason.
> >
> >  <camelContext xmlns="http://activemq.apache.org/camel/schema/spring">
> >  <route>
> >    <from uri="direct:start"/>
> >    <to uri="http://myserver.com/somequery?foo=1"/>
> >    <unmarshal><wellFormedHtml/></unmarshal>
> >    <to uri="xslt:file:///foo/bar.xsl"/>
> >    <to .../>
> >  </route>
> > </camelContext>
> >
> >
> > Questions:
> >    Is this component helpful? (Should I finish? I have not seen anything
> > like it in the toolkit yet.)
>
> Definitely! Being able to format HTML nicely as XML so you can do
> XPath and whatnot is *very* useful!
>
>
> >    If continuing is a good idea, what should the "dataFormat" be called,
> > i.e. the <wellFormedHtml/>?
>
> Oooh, that's a tricky one - naming is so hard! Maybe <tagSoup/> ? We
> might one day have a few different mechanisms? (e.g. jtidy?).
>
> Though maybe tagSoup is a bit vague :). How about tidyHtml or tidyMarkup?
>
>
> >    Am I unmarshalling or marshalling? (We of course won't support going
> > the other way, as good to bad HTML is just hard(er).)
> >    I figured it is unmarshalling, as the <csv/> dataformat is similar:
> > CSV --> List<..> is unmarshalling.
>
> Yeah. What's the output btw - is it a DOM? Or can it be converted to a
> Source so the endpoint could take DOM/SAX/StAX etc?
>
>
> --
> James
> -------
> http://macstrac.blogspot.com/
>
> Open Source Integration
> http://fusesource.com/
>
