I wonder if Boilerpipe could help here? It is now integrated into Tika.
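
If the footers are literally identical across documents, a simple pre-processing pass before sending the text to Solr might also be enough. The following is only a rough sketch, independent of Tika and not a built-in Solr filter; the function name `strip_repeated_footers` and the parameters `tail_lines` and `min_share` are made up for illustration. It treats a trailing line as footer text when it shows up at the end of most documents in a batch.

```python
from collections import Counter


def strip_repeated_footers(docs, tail_lines=3, min_share=0.8):
    """Drop trailing lines that recur at the end of most documents.

    Hypothetical helper: `tail_lines` is how many trailing lines to inspect,
    `min_share` is the fraction of documents a line must appear in
    before it is considered a footer.
    """
    # Count, per document, which non-empty lines occur in its tail.
    counts = Counter()
    for text in docs:
        tail = text.splitlines()[-tail_lines:]
        counts.update({line.strip() for line in tail if line.strip()})

    # Require at least two occurrences so a single document is never stripped.
    threshold = max(2, min_share * len(docs))
    footer = {line for line, n in counts.items() if n >= threshold}

    cleaned = []
    for text in docs:
        lines = text.splitlines()
        # Remove trailing lines that are blank or recognised footer lines,
        # stopping at the first line of real content.
        while lines and (not lines[-1].strip() or lines[-1].strip() in footer):
            lines.pop()
        cleaned.append("\n".join(lines))
    return cleaned
```

The cleaned strings would then be indexed in place of the raw text. Footers that vary from document to document (dates, page numbers) would need pattern matching instead of exact comparison.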

Otis
----
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

>________________________________
> From: "Mark , N" <nipen.m...@gmail.com>
>To: solr-user@lucene.apache.org
>Sent: Thursday, May 24, 2012 1:39 AM
>Subject: filtering footer information
>
>Is it possible to filter certain repeated footer information from text
>documents while indexing to solr ?
>
>Are there any built-in filters similar to stop word filters ?
>
>--
>Thanks,
>
>*Nipen Mark *