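Manish - you're in luck. Nutch 1.12 has been released and includes Boilerpipe support, which can strip boilerplate such as headers and footers from the extracted text without touching core code. See: https://issues.apache.org/jira/browse/NUTCH-961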
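As a rough sketch, Boilerpipe is switched on through configuration overrides in conf/nutch-site.xml. The property names and values below are written from memory of nutch-default.xml in 1.12 and should be treated as assumptions; please check them against the nutch-default.xml shipped with your release. Note also that Boilerpipe is exposed through the Tika parser, so HTML probably has to be handled by parse-tika rather than parse-html (via plugin.includes and parse-plugins.xml) for this to take effect.

```xml
<!-- conf/nutch-site.xml: overrides for nutch-default.xml -->
<!-- Assumption: property names as recalled from Nutch 1.12; verify locally. -->
<property>
  <name>tika.extractor</name>
  <!-- "boilerpipe" enables boilerplate removal; the default is "none" -->
  <value>boilerpipe</value>
</property>

<property>
  <name>tika.extractor.boilerpipe.algorithm</name>
  <!-- Which Boilerpipe extractor to use; ArticleExtractor targets article-like pages -->
  <value>ArticleExtractor</value>
</property>
```

After changing the configuration, already-fetched pages need to be re-parsed and re-indexed before the header and footer text disappears from the index.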
Markus

-----Original message-----
> From: Manish Verma <m_ve...@apple.com>
> Sent: Tuesday 28th June 2016 23:46
> To: user@nutch.apache.org
> Subject: Remove Header from content
>
> Hi,
>
> I don’t want to index the header and footer of the content. I know we can make
> changes in HtmlParser.java, but I don’t want to change the Nutch core code. Is
> there any other way (a plugin) to eliminate the header div from the content?
>
> Thanks
> MV