Erik Hatcher wrote:
> Rest assured that human-readable query expressions aren't going away
> at all. I don't think Mark even implied that.

That's right. The proposal is *not* to replace what is already there -
QueryParser will always have a useful role to play supporting the
"Google-like" query syntax familiar to millions.
I'd just like to see another full-featured query representation for the
reasons already outlined.
Picking up on some points raised:
Re: MoreLikeThis queries.
Yes, they can be usefully wrapped as queries (see attached simple
example). In fact it was my attempts at bastardising QueryParser to
support them that brought home its limitations. I ended up with a
subclass hack that (mis)used the field name to parse a query string
"like:123" where 123 was a doc id. With the QueryParser syntax I was not
able to pass the other parameters which MoreLikeThis could usefully use
to control the behaviour of this query type, e.g. the choice of
fieldname(s) used, the max number of terms generated, the minimum number
of "should" terms to match, etc.
This is not unusual: each query type potentially has multiple optional
parameters that tweak its behaviour. If I don't have a query language
that names the parameters explicitly (say, XML) I end up having to
define what looks like a function with a long list of parameters:
"like(123,,,4,,,)". Ack.
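
To make the positional problem concrete, here is a minimal sketch in plain Java (no Lucene involved). The slot order and the parameter names are hypothetical, invented purely for illustration; nothing like this exists in QueryParser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PositionalLikeArgs {
    // Hypothetical slot order for "like(...)"; an assumption for illustration only.
    static final String[] SLOTS = {
        "docId", "fieldNames", "maxQueryTerms", "minTermFreq",
        "minDocFreq", "minWordLen", "percentTermsToMatch"
    };

    /** Maps a positional argument string onto named slots, skipping empty ones. */
    static Map<String, String> parse(String args) {
        // A limit of -1 keeps trailing empty strings, so slot positions survive.
        String[] parts = args.split(",", -1);
        if (parts.length > SLOTS.length) {
            throw new IllegalArgumentException("Too many arguments: " + parts.length);
        }
        Map<String, String> named = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            String value = parts[i].trim();
            if (!value.isEmpty()) {
                named.put(SLOTS[i], value);
            }
        }
        return named;
    }

    public static void main(String[] args) {
        // Only the slot table tells the reader what the "4" means.
        System.out.println(parse("123,,,4,,,")); // prints {docId=123, minTermFreq=4}
    }
}
```

The meaning of each value lives entirely in the slot table, which is exactly the information a query format with named parameters would carry in the query itself.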
Here's a pseudo-code example that throws together some of the more
obscure parts of Lucene not represented in the existing QueryParser as
an illustration of how this could look in a more wide-reaching parser.
Imagine the user has selected an example doc #44 as something they are
interested in, on the subject of "hockey", but they prefer to see
documents that don't talk about ice hockey:
<BoostingQuery>
  <MatchQuery>
    <MoreLikeThisQuery percentTermsToMatch="0.25f" docId="44">
      <CompareField name="contents"/>
      <CompareField name="title"/>
    </MoreLikeThisQuery>
  </MatchQuery>
  <DowngradeQuery demoteValue="0.5">
    <SimpleQuery defaultField="contents">
      <queryText>"ice hockey" OR puck OR rink</queryText>
    </SimpleQuery>
  </DowngradeQuery>
</BoostingQuery>
BoostingQuery is a class that uses a second query to demote those
results of a first query that the second query also matches (see here:
http://wiki.apache.org/jakarta-lucene/CommunityContributions).
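
As a rough sketch of the intended effect only (this is not the BoostingQuery implementation, and the class and method names are made up for illustration): documents hit by the downgrade query keep their place in the result set but have their score multiplied by the demote value.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class DemoteSketch {
    /**
     * Returns a copy of matchScores in which every document that also
     * appears in downgradeHits has its score multiplied by demoteValue.
     */
    static Map<Integer, Float> demote(Map<Integer, Float> matchScores,
                                      Set<Integer> downgradeHits,
                                      float demoteValue) {
        Map<Integer, Float> result = new TreeMap<>();
        for (Map.Entry<Integer, Float> entry : matchScores.entrySet()) {
            boolean demoted = downgradeHits.contains(entry.getKey());
            float score = demoted ? entry.getValue() * demoteValue : entry.getValue();
            result.put(entry.getKey(), score);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Float> scores = new TreeMap<>();
        scores.put(1, 0.8f); // a "hockey" doc that also mentions ice hockey
        scores.put(2, 0.6f); // a "hockey" doc that does not
        Set<Integer> iceHockeyDocs = Collections.singleton(1);
        System.out.println(demote(scores, iceHockeyDocs, 0.5f)); // prints {1=0.4, 2=0.6}
    }
}
```

Note that doc 1 is demoted below doc 2 rather than removed, which is the difference from simply excluding the unwanted terms with NOT.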
For this and other forms of query to be able to plug into a new parser,
the Query objects just need to adhere to bean conventions so that they
can be wired in automatically, ANT/Spring style, using reflection.
For example, the implementation of BoostingQuery would need to have
getter/setter properties for "matchQuery" and "downgradeQuery".
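
A minimal sketch of that kind of wiring, using only java.lang.reflect. Everything here is hypothetical: LikeBean is a stand-in for a Query class, and the attribute map stands in for the attributes a parser would read off an XML element. Only int, float and String setters are handled.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

final class BeanWiring {

    /** Stand-in bean; a real parser would target Query subclasses instead. */
    public static class LikeBean {
        private int docId;
        private float percentTermsToMatch = 0.5f;
        public int getDocId() { return docId; }
        public void setDocId(int docId) { this.docId = docId; }
        public float getPercentTermsToMatch() { return percentTermsToMatch; }
        public void setPercentTermsToMatch(float p) { this.percentTermsToMatch = p; }
    }

    /** Calls setXxx(value) on the bean for every attribute named xxx. */
    static void apply(Object bean, Map<String, String> attributes) {
        for (Map.Entry<String, String> attr : attributes.entrySet()) {
            String name = attr.getKey();
            String setterName = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            try {
                Method setter = findSetter(bean.getClass(), setterName);
                Class<?> paramType = setter.getParameterTypes()[0];
                setter.invoke(bean, convert(attr.getValue(), paramType));
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot set property " + name, e);
            }
        }
    }

    /** Finds a public one-argument method with the given name. */
    static Method findSetter(Class<?> type, String setterName) throws NoSuchMethodException {
        for (Method method : type.getMethods()) {
            if (method.getName().equals(setterName) && method.getParameterCount() == 1) {
                return method;
            }
        }
        throw new NoSuchMethodException(type.getName() + "." + setterName);
    }

    /** Converts the attribute text to the setter's parameter type. */
    static Object convert(String raw, Class<?> type) {
        if (type == int.class) return Integer.valueOf(raw);
        if (type == float.class) return Float.valueOf(raw);
        if (type == String.class) return raw;
        throw new IllegalArgumentException("Unsupported property type: " + type);
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("docId", "44");
        attributes.put("percentTermsToMatch", "0.25");
        LikeBean bean = new LikeBean();
        apply(bean, attributes);
        System.out.println(bean.getDocId() + " " + bean.getPercentTermsToMatch()); // prints 44 0.25
    }
}
```

Nested elements such as MatchQuery would need the same treatment with a Query-typed setter, which this sketch leaves out.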
Note in this example that the existing QueryParser syntax is usefully
used in "SimpleQuery" to avoid making the XML too verbose.
There's much detail to be added in how this would work in practice but I
thought I'd post it here to show the general shape of one possible
direction.

package com.inperspective.lucene.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
// Assumption: MoreLikeThis comes from the sandbox/contrib code. Adjust this
// import to match the package of your copy.
import org.apache.lucene.search.similar.MoreLikeThis;

/**
 * A simple wrapper for MoreLikeThis for use in scenarios where a Query object
 * is required, e.g. in custom QueryParser extensions. At query.rewrite() time
 * the reader is used to construct the actual MoreLikeThis object and obtain
 * the real Query object.
 * TODO write JUnit Test
 * @author maharwood
 */
public class MoreLikeThisQuery extends Query
{
    private int docId;
    private String[] moreLikeFields;
    private Analyzer analyzer;
    private float percentTermsToMatch=0.5f;

    /**
     * @param docId id of the example document
     * @param moreLikeFields names of the fields to compare
     * @param analyzer analyzer handed on to MoreLikeThis
     */
    public MoreLikeThisQuery(int docId, String[] moreLikeFields, Analyzer analyzer)
    {
        this.docId=docId;
        this.moreLikeFields=moreLikeFields;
        this.analyzer=analyzer;
    }

    public Query rewrite(IndexReader reader) throws IOException
    {
        MoreLikeThis mlt=new MoreLikeThis(reader);
        mlt.setFieldNames(moreLikeFields);
        mlt.setAnalyzer(analyzer);
        BooleanQuery bq= (BooleanQuery) mlt.like(docId);
        BooleanClause[] clauses = bq.getClauses();
        // Require a fraction of the generated terms to match; the cast
        // truncates toward zero.
        bq.setMinimumNumberShouldMatch((int)(clauses.length*percentTermsToMatch));
        return bq;
    }

    /* (non-Javadoc)
     * @see org.apache.lucene.search.Query#toString(java.lang.String)
     */
    public String toString(String field)
    {
        return "like:"+docId;
    }

    public float getPercentTermsToMatch()
    {
        return percentTermsToMatch;
    }

    public void setPercentTermsToMatch(float percentTermsToMatch)
    {
        this.percentTermsToMatch = percentTermsToMatch;
    }
}
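
One detail of rewrite() worth spelling out is the arithmetic behind the minimum-should-match value. The snippet below is a standalone sketch of just that calculation (the class and method names are mine, not part of the attachment), showing that the cast truncates rather than rounds.

```java
final class MinShouldMatch {
    /** Same arithmetic as rewrite(): the cast truncates toward zero. */
    static int minimumShouldMatch(int clauseCount, float percentTermsToMatch) {
        return (int) (clauseCount * percentTermsToMatch);
    }

    public static void main(String[] args) {
        System.out.println(minimumShouldMatch(25, 0.5f));  // prints 12
        System.out.println(minimumShouldMatch(10, 0.25f)); // prints 2
        System.out.println(minimumShouldMatch(1, 0.5f));   // prints 0
    }
}
```

With very few generated terms the result drops to 0, so rounding up (or enforcing a floor of 1) may be preferable depending on how strict the match is meant to be.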