You only need an IndexingFilter if you did not put the logic in the
ParseFilter, or if you want to do something with metadata added by two or
more different ParseFilters.

You can use multiple IndexingFilters or ParseFilters; that is not a problem.
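
To make the data flow concrete, here is a minimal sketch of how several parse-time filters and index-time filters could hand data to each other. It is only a model under stated assumptions: the interfaces SketchParseFilter and SketchIndexingFilter, the class FilterChainSketch, and the "product", "price" and "label" keys are all hypothetical stand-ins, not the Nutch API, and the real plugin signatures differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the Nutch ParseFilter interface:
// a parse-time filter that adds metadata to the parsed record.
interface SketchParseFilter {
    void filter(String rawContent, Map<String, String> parseMetadata);
}

// Hypothetical stand-in, NOT the Nutch IndexingFilter interface:
// an index-time filter that reads parse metadata and writes document fields.
interface SketchIndexingFilter {
    void filter(Map<String, String> parseMetadata, Map<String, String> indexDoc);
}

class FilterChainSketch {
    static Map<String, String> run(String rawContent) {
        // Two independent parse filters, each adding its own metadata key.
        List<SketchParseFilter> parseFilters = new ArrayList<>();
        parseFilters.add((content, md) -> md.put("product", content.split(";")[0].trim()));
        parseFilters.add((content, md) -> md.put("price", content.split(";")[1].trim()));

        List<SketchIndexingFilter> indexingFilters = new ArrayList<>();
        // Pass-through: copies one metadata key into the document unchanged.
        indexingFilters.add((md, doc) -> doc.put("product", md.get("product")));
        // Extra logic: combines metadata written by two different parse filters.
        indexingFilters.add((md, doc) ->
                doc.put("label", md.get("product") + " @ " + md.get("price")));

        // Parse phase: every filter runs in order on the same metadata map.
        Map<String, String> parseMetadata = new LinkedHashMap<>();
        for (SketchParseFilter f : parseFilters) {
            f.filter(rawContent, parseMetadata);
        }

        // Index phase: every filter runs in order on the same document.
        Map<String, String> indexDoc = new LinkedHashMap<>();
        for (SketchIndexingFilter f : indexingFilters) {
            f.filter(parseMetadata, indexDoc);
        }
        return indexDoc;
    }

    public static void main(String[] args) {
        // prints {product=Widget, label=Widget @ 9.99}
        System.out.println(run("Widget; 9.99"));
    }
}
```

In this model the pass-through filter corresponds to the case where indexing the metadata directly is enough, and the "label" filter corresponds to the case where an IndexingFilter is needed because it combines values from two different ParseFilters.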

 
-----Original message-----
> From:Michael Chen <yiningchen2...@u.northwestern.edu>
> Sent: Wednesday 2nd August 2017 21:23
> To: user@nutch.apache.org
> Subject: Re: ParseFilter and IndexingFilter
> 
> 
> Hi Markus,
> 
> Thanks for the quick response! Please let me know at any point if I
> should just read some part of the code. Judging from the data stored
> in HBase (with Nutch 2.x), I'm guessing that "parse" changed the
> "Document" (in my case, it cleaned up the HTML tags in "content").
> 
> Do you mean that parse only adds meta-data somewhere waiting for
> indexing filters to index it into HBase? Maybe I'm not understanding
> "indexing" correctly.
> 
> I'm trying to use the new jsoup-extractor to parse (and index) certain
> fields with CSS selectors. I also want to keep the indexing by
> index-basic and index-anchor, and preferably the raw html/data as well.
> Am I on the right track?
> 
> Thank you!
> 
> Michael
> 
> 
> On 08/02/2017 12:06 PM, Markus Jelsma wrote:
> > Hi,
> >
> > ParseFilter can add metadata to parsed records. IndexingFilter can access 
> > that data and do something with it prior to indexing the metadata fields 
> > added earlier by the ParseFilter.
> >
> > If you just want to index the values added by the ParseFilter, you can just 
> > use index-metadata to index it directly. Only use an IndexingFilter if you 
> > need additional logic.
> >
> > Regards,
> > Markus
> >
> >   
> >   
> > -----Original message-----
> >> From:Michael Chen <yiningchen2...@u.northwestern.edu>
> >> Sent: Wednesday 2nd August 2017 20:58
> >> To: user@nutch.apache.org
> >> Subject: ParseFilter and IndexingFilter
> >>
> >> Hi,
> >>
> >> Does anyone know how multiple ParseFilters and IndexingFilters work
> >> together, e.g. does the first parse affect the second, or does one
> >> index operation affect the next? Given that the factories generate
> >> multiple filters in the first place... I couldn't find a definitive
> >> answer in the docs, and it would be great if someone could help answer
> >> this question. Thanks in advance.
> >>
> >> Best regards,
> >>
> >> Michael
> >>
> >>
> >>
> 
> 
