Hi Prashant,

I agree with Shai that using Luke, and printing out what the Document looks like before it goes into the index, is going to be your best bet for debugging this problem.
The problem you're having is that StandardAnalyzer does not break up the hostname into separate terms, as it has a special case for hostnames and acronyms. The whole hostname is indexed as one term, so a search for "wikipedia" alone finds nothing. This should work:

    +title:"rahul dravid" +url:"en.wikipedia.org"

Thanks,
Phil

On Sun, Aug 2, 2009 at 10:14 AM, prashant ullegaddi<[email protected]> wrote:
> Yes, I'm sure that title:"Rahul Dravid" is extracted properly, and there is
> a document relevant to this query as well.
> The following query and its results proves it:
>
> Enter query:
> Searching for: +title:"rahul dravid" +url:wiki
> 4 total matching documents
> trec-id: clueweb09-enwp02-13-14368, URL:
> http://en.wikipedia.org/wiki/Rahul_Dravid
> trec-id: clueweb09-enwp01-83-11378, URL:
> http://en.wikipedia.org/wiki/Rahul_S_Dravid
> trec-id: clueweb09-en0011-08-22737, URL:
> http://www.reference.com/browse/wiki/Rahul_Dravid
> trec-id: clueweb09-enwp01-69-13556, URL:
> http://en.wikipedia.org/wiki/Rahul_Sharad_Dravid
> Press (q)uit or enter number to jump to a page.
>
> But see following query:
>
> Enter query:
> +title:"rahul dravid" +url:"wikipedia"
> Searching for: +title:"rahul dravid" +url:wikipedia
> 0 total matching documents
> Press (q)uit or enter number to jump to a page.
>
> Isn't it weird?
>
> -- Prashant.
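
P.S. For anyone reading this in the archive, here is a rough sketch of why url:wiki matches while url:wikipedia does not. It is a toy model in plain Java, not Lucene's StandardAnalyzer: the class HostnameTermSketch and its methods tokenize and termMatches are made up for illustration, and the splitting rule (keep dots, split on everything else) is only an assumption that imitates the hostname special case. Running the real analyzer over the URL, or looking at the field in Luke, is still the way to see the actual terms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model only: this is NOT Lucene's StandardAnalyzer. It imitates one
// behaviour described in this thread: a dotted hostname stays a single term.
final class HostnameTermSketch {

    // Assumed rule: split on anything that is not a letter, digit, or dot,
    // then lowercase. Because dots are kept, "en.wikipedia.org" is one term.
    // How the real analyzer treats the rest of the path may differ.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.split("[^A-Za-z0-9.]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // A term query matches only if it equals an indexed term exactly;
    // it never matches a substring of a longer term.
    static boolean termMatches(List<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        List<String> terms = tokenize("http://en.wikipedia.org/wiki/Rahul_Dravid");
        System.out.println("indexed terms: " + terms);
        for (String q : new String[] {"wiki", "wikipedia", "en.wikipedia.org"}) {
            System.out.println("url:" + q + " -> " + termMatches(terms, q));
        }
    }
}
```

In this model "wiki" is a term of its own (it comes from the path), "en.wikipedia.org" is a term of its own, and "wikipedia" only ever appears inside the longer hostname term, which is why the second query comes back empty.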
