Re: Implementing a negative keyword filter in index

Markus Jelsma Tue, 01 Feb 2011 18:01:03 -0800

Hi,

It would be better if you open a separate thread on the JUnit question.


About the filter issue: are you using Nutch's search or Solr? Both use Lucene
and support query operators that prohibit a term. If you're using Solr, please
consult the appropriate docs, wiki and mailing lists on how to proceed. I have
no experience with Nutch's search capability, but as it also uses Lucene I
would expect it to allow these operators as well.

Using these operators (a leading "-" or the NOT keyword) you can exclude
documents containing certain terms from your search results. If you filter
those documents out at indexing time instead, you cannot query for them later.

Check this for information on the LuceneQParser:
http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
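
As a rough illustration of the query-time approach, here is a minimal sketch
that appends prohibit clauses to whatever the user typed. The class and method
names are hypothetical (they are not part of Nutch, Solr or Lucene), and it
assumes the keywords contain no Lucene special characters apart from spaces:

```java
import java.util.Arrays;
import java.util.List;

/**
 * Sketch: build a Lucene query string that prohibits negative keywords.
 * Hypothetical helper, not an API of Nutch, Solr or Lucene.
 */
final class NegativeKeywordQuery {

    /**
     * Wraps the user's query as a required clause and adds one "-" clause
     * per negative keyword. Assumes keywords need no escaping.
     */
    static String withProhibited(String userQuery, List<String> negativeKeywords) {
        StringBuilder sb = new StringBuilder("+(").append(userQuery).append(")");
        for (String keyword : negativeKeywords) {
            String k = keyword.trim();
            if (k.isEmpty()) {
                continue; // skip blank entries
            }
            sb.append(" -");
            if (k.indexOf(' ') >= 0) {
                // a multi-word keyword becomes a prohibited phrase
                sb.append('"').append(k).append('"');
            } else {
                sb.append(k);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: +(nutch crawler) -spam -"buy now"
        System.out.println(withProhibited("nutch crawler", Arrays.asList("spam", "buy now")));
    }
}
```

The resulting string can be handed to the query parser in place of the raw
user query. The documents stay in the index, so the keyword list can change
later without re-indexing.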

Cheers,

> Hi folks,
> 
>  I am sorry for adding another question to the same mail. I am also writing
> a plug-in extending HtmlParser. How do I test it with JUnit?
> 
>  I see the "filter" method takes Content content, ParseResult
> parseResult, HTMLMetaTags metaTags, DocumentFragment doc as arguments. How
> can I generate these parameters for test purposes?
> 
> Thanks,
> Abi
> 
> On Tue, Feb 1, 2011 at 12:10 PM, .: Abhishek :. <[email protected]> wrote:
> > Hi all,
> > 
> >  I am planning to implement a negative keyword indexer such that if a
> > negative keyword appears in a segment I should never show it during
> > the search. I have the following steps in mind, please let me know if
> > it's right.
> > 
> >    - Writing a plug-in
> >    
> >       - Extend the IndexingFilter.
> >       - Do a NutchDocument.removeField for the negative keyword.
> >       - return the doc
> >   
> >   Now the questions are,
> >   
> >    - The NutchDocument is always mapped as an HTML page, so if I am doing
> >    the thing above, am I really removing the segment from getting indexed
> >    or am I preventing the page from being indexed?
> >  
> >  Also, please let me know what I am intending to do is right? Thanks
> >  again all for your time.
> > 
> > Cheers,
> > Abhi
