You don't say which analyzer you are using. By default I think
StandardAnalyzer is used, and I think Luke and your indexer are both
using that same analyzer. But because you are creating the query
programmatically, the content is not analyzed, so the terms you search
for are different from the terms stored in the index.

You could use Luke to see which terms you have indexed, and how the
query is rewritten once it has passed through the analyzer.
For example, your query:
+url:"http://some.website.com/" +contentlength:1234
is rewritten by StandardAnalyzer to (notice the lack of "://"):
+url:"http some.website.com" +contentlength:1234
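
If it helps, here is a rough sketch of one way to get the same terms in
code: pass the query text through a QueryParser built with the analyzer
used at index time. I am assuming StandardAnalyzer, and I am assuming
the field is named "contentlength" as in your Luke query (your posted
code uses "length"). The IndexChecker class name is only for
illustration. Untested against your index, so treat it as a starting
point rather than a definitive fix.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical wrapper class, named here only for the example.
public class IndexChecker
{
    private readonly IndexSearcher searcher;

    // Assumption: the index was built with StandardAnalyzer.
    // Use whatever analyzer your indexer actually uses.
    private readonly Analyzer analyzer = new StandardAnalyzer();

    public IndexChecker(IndexSearcher searcher)
    {
        this.searcher = searcher;
    }

    public bool IsIndexed(Uri url, int contentLength)
    {
        // Same query text that finds the document in Luke.
        // Assumption: the length field is called "contentlength".
        string queryText = "+url:\"" + url.AbsoluteUri + "\""
                         + " +contentlength:" + contentLength;

        // The parser runs the analyzer over the query text, so the
        // url is split into the same terms that were indexed.
        QueryParser parser = new QueryParser("url", analyzer);
        Query query = parser.Parse(queryText);

        Hits hits = searcher.Search(query);
        return hits.Length() > 0;
    }
}
```

Also note that TermQuery takes the term text literally, so the extra
quote characters added around the url in your code become part of the
term itself. If you would rather keep using TermQuery, another option
may be to index the url field untokenized and search for the bare url
without the added quotes.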

On Mon, Jan 12, 2009 at 6:57 AM, Mark Cottman-Fields <mar...@qpac.com.au> wrote:
> Hi All
>
> I'm currently testing a lucene.net index for a website.
>
> The Luke search utility and my C# code give me different results for
> searches. There's probably a simple explanation, but I can't find it.
>
> The idea here is to check for changes using the url and the content
> length.
>
> C# code:
>
> (searcher is Lucene.Net.Search.IndexSearcher instance)
>
> public bool IsIndexed(Uri url, int stringcontent) {
> int foundCount = 0;
> Lucene.Net.Search.BooleanQuery bq = new
> Lucene.Net.Search.BooleanQuery();
> bq.Add(new Lucene.Net.Search.TermQuery(new Lucene.Net.Index.Term("url",
> "\"" + url.AbsoluteUri + "\"")),
> Lucene.Net.Search.BooleanClause.Occur.MUST);
> bq.Add(new Lucene.Net.Search.TermQuery(new
> Lucene.Net.Index.Term("length", stringcontent.ToString())),
> Lucene.Net.Search.BooleanClause.Occur.MUST);
> Lucene.Net.Search.Hits hits = searcher.Search(bq);
> foundCount = hits.Length();
> return foundCount > 0;
> }
>
> A search like:
> +url:"http://some.website.com/" +contentlength:1234
>
> Finds exactly one item in Luke (the expected behaviour, if the url is
> indexed and has that length exactly), but returns nothing in my code
> above. They are both using the same index.
>
> Any ideas?
>
> BTW, great project, thank you!
>
> Mark
>
