Manish, you're in luck: Nutch 1.12 has been released and includes Boilerpipe
support, which strips boilerplate such as headers and footers from the
extracted text. See:
https://issues.apache.org/jira/browse/NUTCH-961
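
For reference, a minimal sketch of how this might be switched on in
conf/nutch-site.xml. The property names are as I recall them from
nutch-default.xml in 1.12, so please verify them against your release. The
choice of ArticleExtractor is only an assumption for illustration. Boilerpipe
runs inside the Tika parser, so HTML would also need to be routed to parse-tika
instead of parse-html (plugin.includes and parse-plugins.xml), which is not
shown here.

```xml
<!-- conf/nutch-site.xml: sketch only, check names against nutch-default.xml -->
<property>
  <name>tika.extractor</name>
  <!-- "none" leaves extraction unchanged; "boilerpipe" enables it -->
  <value>boilerpipe</value>
</property>
<property>
  <name>tika.extractor.boilerpipe.algorithm</name>
  <!-- assumed choice; other documented values include DefaultExtractor
       and CanolaExtractor -->
  <value>ArticleExtractor</value>
</property>
```

After changing the configuration, already fetched pages have to be parsed again
before the stripped text shows up in the index.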

Markus

 
 
-----Original message-----
> From: Manish Verma <m_ve...@apple.com>
> Sent: Tuesday 28th June 2016 23:46
> To: user@nutch.apache.org
> Subject: Remove Header from content
> 
> Hi,
> 
> I don’t want to index the header and footer of the content. I know we can
> make changes in HtmlParser.java, but I don’t want to change Nutch core code.
> Is there any other way (e.g. a plugin) to eliminate the header div from the
> content?
> 
> Thanks MV
> 
> 
