Thank you Al and Abhishake!

The JTidy library is what I was looking for. I wanted to convert an external HTML page to an org.w3c.dom.Document object, and JTidy does exactly that.

Now I want to put some kind of wrapper around the Document object so I can work with it in minilang.
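
Here is a rough sketch of the kind of wrapper I have in mind, not a finished design. HtmlDocumentWrapper and its method names are hypothetical and are not part of OFBiz or JTidy. It uses only the JDK's XPath support and returns plain String and List values, which should be easier to handle from minilang than raw DOM nodes. The fromXhtml method is only a stand-in so the example runs without JTidy; in practice the Document would come from JTidy, and I am assuming the JDK XPath engine behaves well enough on the DOM that JTidy produces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/**
 * Hypothetical helper (not an OFBiz or JTidy class) that hides the DOM
 * behind simple String/List accessors. For real use it would be declared
 * public in its own source file.
 */
class HtmlDocumentWrapper {
    private final Document document;
    private final XPath xpath = XPathFactory.newInstance().newXPath();

    HtmlDocumentWrapper(Document document) {
        this.document = document;
    }

    /** Text of the first node matching the XPath expression, or "" if nothing matches. */
    String getText(String expression) throws Exception {
        return xpath.evaluate(expression, document).trim();
    }

    /** Trimmed text of every node matching the XPath expression, in document order. */
    List<String> getTextList(String expression) throws Exception {
        NodeList nodes = (NodeList) xpath.evaluate(expression, document, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    /**
     * Stand-in for the JTidy step: parses already well-formed XHTML with the
     * JDK parser so that the sketch has no external dependencies.
     */
    static HtmlDocumentWrapper fromXhtml(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        return new HtmlDocumentWrapper(doc);
    }
}
```

With something like this, a call such as getTextList("//li") would hand back the list items as strings, and getText("/html/head/title") the page title.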

-Adrian

Abhishake Agarwal wrote:
Hello Adrian,

I don't know whether OFBiz has this, but I have done a similar thing using an
API called HTML Parser. You can search for it on Google.

Regards,
Abhishake

On Tue, Jul 8, 2008 at 10:50 PM, Al Byers <[EMAIL PROTECTED]>
wrote:

Adrian,

In the past I have used JTidy to make sure the page is in XHTML and then
written Freemarker scripts to process the markup. I find FM easier to use
than XSLT because it has a loop index variable and is easier to connect
with Java classes that you may wish to write to help in the processing.

-Al

On Tue, Jul 8, 2008 at 10:53 AM, Adrian Crum <[EMAIL PROTECTED]> wrote:

I need OFBiz to gather data from external websites - so that data can be
extracted from the HTML. Is there anything like that in OFBiz? Has anyone
else done something similar?

-Adrian

