Absolutely, Markus.

Maybe you would consider using the parse-tika plugin within the
application. Since the Nutch core does not do any parsing itself and
leaves it to the parse plugins, working with parse-tika will also give
you a good understanding of how the parse plugins fit into the various
core classes.
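
On the original question of running one specific class with the relevant
classpath outside Eclipse, here is a minimal sketch of one way to assemble
the command. It only builds and prints a "java -cp ..." line from the jars
it finds. The directory layout (runtime/local/lib, runtime/local/plugins,
runtime/local/conf) and the fully qualified class name are assumptions on
my part and are not taken from your setup, and ClasspathLauncher is a
made-up helper, not part of Nutch.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Nutch: builds a "java -cp ..." command.
public class ClasspathLauncher {

    /** Collects every *.jar found under the given directories (sorted per directory). */
    public static List<String> collectJars(List<Path> dirs) throws IOException {
        List<String> jars = new ArrayList<>();
        for (Path dir : dirs) {
            if (!Files.isDirectory(dir)) {
                continue; // skip directories that do not exist in this checkout
            }
            try (Stream<Path> walk = Files.walk(dir)) {
                walk.filter(p -> p.toString().endsWith(".jar"))
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted()
                    .forEach(jars::add);
            }
        }
        return jars;
    }

    /** Builds the command: java -cp <confDir + jars> <mainClass> <args...>. */
    public static List<String> buildCommand(String confDir, List<String> jars,
                                            String mainClass, List<String> args) {
        List<String> entries = new ArrayList<>();
        entries.add(confDir); // assumption: config files are read from the classpath
        entries.addAll(jars);

        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(String.join(File.pathSeparator, entries));
        cmd.add(mainClass);
        cmd.addAll(args);
        return cmd;
    }

    public static void main(String[] argv) throws IOException {
        // Assumed layout of a local Nutch build; adjust to your checkout.
        List<Path> dirs = List.of(
                Path.of("runtime/local/lib"),
                Path.of("runtime/local/plugins"));
        List<String> jars = collectJars(dirs);

        // Assumed class name for the HtmlParser main class mentioned in the thread.
        List<String> cmd = buildCommand("runtime/local/conf", jars,
                "org.apache.nutch.parse.html.HtmlParser", List.of(argv));

        // Print the command; it could also be handed to ProcessBuilder to run it.
        System.out.println(String.join(" ", cmd));
    }
}
```

Printing the command rather than executing it keeps the sketch easy to
inspect: you can compare the result against the classpath Eclipse uses in
its run configuration before relying on it.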

Thanks

On Fri, Dec 23, 2011 at 10:09 AM, Markus Jelsma
<markus.jel...@openindex.io> wrote:
> I would recommend using Tika for parsing. It does much more and is being
> maintained as well.
>
> http://tika.apache.org/
>
> On Thursday 22 December 2011 13:41:20 jepse wrote:
>> Hi,
>>
>> My goal is to use the Nutch HtmlParser as a standalone application.
>> Therefore I followed the instructions for RunNutchInEclipse. Now I have a
>> working Eclipse project, which I can use to start my claimed plugin in a
>> standalone application (running the main class in HtmlParser.java). Now I
>> need to extract this runtime configuration for a standalone app. Is there a
>> way to execute a specific class with the relevant classpath?
>>
>> Cheers, Philippe
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/HtmlParser-parse-html-plugin-tp3606486p
>> 3606486.html Sent from the Nutch - User mailing list archive at Nabble.com.
>
> --
> Markus Jelsma - CTO - Openindex



-- 
Lewis
