>> Could someone give me a hint about conducting a boolean search by 
>> using the Lucene/Nutch APIs directly? Just some starting points to 
>> look at.
> Nutch does not support boolean queries the way Lucene does. Only the 
> minus operator (-) is supported, to exclude words from the search. If 
> you want boolean query support, you will have to modify Query.java. 
> This class has nested classes called Clause, Term and Phrase. A 
> clause is either a term or a phrase. A Query is constructed by the 
> Query.parse() method, which delegates to 
> NutchAnalysis.parseQuery(). NutchAnalysis is generated from 
> NutchAnalysis.jj, the JavaCC file that defines the lexical analysis 
> and parsing. Finally, the QueryFilters.filter method runs all query 
> filters over the Query, and these filters convert the Nutch Query to 
> a Lucene BooleanQuery. You should definitely check query-basic for this.
>
> To add boolean query support (esp. OR) you need to modify all the 
> above classes in some way : )
>
> Alternatively, you can just construct the BooleanQuery and then pass 
> it to the index servers, bypassing the Nutch Query class.

I did something different. The "OR" searches I needed were all related 
to a custom indexed field I had created with a plugin, called zip3. 
So I modified my custom zip3 QueryFilter instead. I include the code 
here in case somebody finds the idea useful. The filter transforms a 
term like zip3:AAA-BBB-CCC into a search that could have been written 
as zip3:AAA OR zip3:BBB OR zip3:CCC.
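
To make the transformation concrete, here is a minimal standalone sketch in plain Java (no Nutch or Lucene needed). The class name Zip3Expansion and the expand helper are hypothetical, made up for illustration only. Skipping empty segments is an extra precaution that the filter itself does not take.

```java
import java.util.ArrayList;
import java.util.List;

public class Zip3Expansion
{
    // Hypothetical helper, not part of Nutch: rewrites a hyphen-separated
    // value such as "AAA-BBB-CCC" as the equivalent OR expression in text.
    public static String expand(String field, String value)
    {
        List<String> parts = new ArrayList<>();
        for (String z3 : value.split("-"))
        {
            if (z3.isEmpty())
                continue; // skip empty segments, e.g. from "AAA--CCC"
            parts.add(field + ":" + z3);
        }
        return String.join(" OR ", parts);
    }

    public static void main(String[] args)
    {
        // prints: zip3:AAA OR zip3:BBB OR zip3:CCC
        System.out.println(expand("zip3", "AAA-BBB-CCC"));
    }
}
```

The real filter builds Lucene TermQuery objects instead of a string, but the splitting logic is the same.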

Here's my code:

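// Imports assume the Nutch 0.8/0.9 package layout (with Hadoop
// Configuration) and Lucene 2.x; adjust them to your versions.
import org.apache.hadoop.conf.Configuration;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;
import org.apache.nutch.searcher.Query;
import org.apache.nutch.searcher.Query.Clause;
import org.apache.nutch.searcher.QueryException;
import org.apache.nutch.searcher.RawFieldQueryFilter;

/**
 * Query filter for the custom "zip3" field. A clause such as
 * zip3:AAA-BBB-CCC is expanded to (zip3:AAA OR zip3:BBB OR zip3:CCC).
 */
public class Zip3QueryFilter extends RawFieldQueryFilter
{
    private Configuration conf;
    // Local copy of the boost, applied to the nested OR query in filter().
    private float myBoost = 0f;

    public Zip3QueryFilter()
    {
        super("zip3");
    }

    public void setConf(Configuration conf)
    {
        this.conf = conf;
        setBoost(conf.getFloat("query.zip3.boost", 0.0f));
    }

    public Configuration getConf()
    {
        return this.conf;
    }

    @Override
    public BooleanQuery filter(Query input, BooleanQuery output)
            throws QueryException
    {
        for (Clause c : input.getClauses())
        {
            // Only handle clauses on the zip3 field.
            if (!c.getField().equals("zip3"))
                continue;

            // Assumes the clause is a single term, not a phrase.
            String value = c.getTerm().toString();
            BooleanQuery bq = new BooleanQuery();

            // One optional (SHOULD) term per hyphen-separated value,
            // which gives the OR semantics.
            for (String z3 : value.split("-"))
            {
                TermQuery clause = new TermQuery(new Term("zip3", z3));
                bq.add(clause, BooleanClause.Occur.SHOULD);
            }

            bq.setBoost(myBoost);

            // Keep the original clause's prohibited/required flags
            // on the nested query as a whole.
            output.add(bq,
                    c.isProhibited() ? BooleanClause.Occur.MUST_NOT
                            : (c.isRequired() ? BooleanClause.Occur.MUST
                                    : BooleanClause.Occur.SHOULD));
        }
        return output;
    }

    @Override
    protected void setBoost(float boost)
    {
        super.setBoost(boost);
        myBoost = boost;
    }
}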
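
The nested conditional at the end of filter() decides how the whole OR group is attached to the outer query. A small standalone sketch of that decision follows. The OccurMapping class and its Occur enum are hypothetical stand-ins for Lucene's BooleanClause.Occur, used here only so the example runs without Lucene on the classpath.

```java
public class OccurMapping
{
    // Stand-in for Lucene's BooleanClause.Occur (assumption: same three values).
    public enum Occur { MUST, SHOULD, MUST_NOT }

    // Mirrors the conditional in filter(): prohibited is checked first,
    // then required, and everything else is optional.
    public static Occur occurFor(boolean prohibited, boolean required)
    {
        if (prohibited)
            return Occur.MUST_NOT;
        if (required)
            return Occur.MUST;
        return Occur.SHOULD;
    }

    public static void main(String[] args)
    {
        System.out.println(occurFor(true, false));  // prints MUST_NOT
        System.out.println(occurFor(false, true));  // prints MUST
        System.out.println(occurFor(false, false)); // prints SHOULD
    }
}
```

So a prohibited clause like -zip3:AAA-BBB excludes documents matching any of the listed values, because the flag applies to the group and not to each value.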


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general