RE: solr not finding all results

2007-10-15 Thread Ben Shlomo, Yatir
Did you try adding a backslash to escape the "-" in Geckoplp4-M
(i.e. Geckoplp4\-M)?
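
For what it's worth, here is a rough sketch of doing that escaping in client code before the term goes into the q parameter. The helper names (escape_query_term, build_select_url) are made up for illustration, and the list of special characters is my assumption about the Lucene query parser syntax, so check it against the version you run. The URL and parameters are the ones from the original message.

```python
import re
from urllib.parse import urlencode

# Characters assumed to be query-parser syntax (verify for your Lucene/Solr version).
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\]|&&|\|\|)')


def escape_query_term(term: str) -> str:
    """Prefix each special character (or operator pair) with a backslash."""
    return _SPECIAL.sub(r'\\\1', term)


def build_select_url(base_url: str, term: str, fields: list[str]) -> str:
    """Build a select URL with the escaped term; urlencode percent-encodes the backslash."""
    params = {
        "q": escape_query_term(term),
        "version": "2.2",
        "start": 0,
        "rows": 10,
        "indent": "on",
        "fl": ",".join(fields),
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(escape_query_term("Geckoplp4-M"))
    print(build_select_url("http://localhost:9020/solr/select/", "Geckoplp4-M", ["comments", "id"]))
```

Escaping only changes how the query parser reads the term. It does not change how the analyzer for the field splits it afterwards.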


-----Original Message-----
From: Kevin Lewandowski [mailto:[EMAIL PROTECTED]]
Sent: Friday, October 12, 2007 9:40 PM
To: solr-user@lucene.apache.org
Subject: solr not finding all results

I've found an odd situation where solr is not returning all of the
documents that I think it should. A search for Geckoplp4-M returns 3
documents but I know that there are at least 100 documents with that
string.

Here is an example query for that phrase and the result set:
http://localhost:9020/solr/select/?q=Geckoplp4-M&version=2.2&start=0&rows=10&indent=on&fl=comments,id
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
 <lst name="params">
  <str name="rows">10</str>
  <str name="start">0</str>
  <str name="indent">on</str>
  <str name="fl">comments,id</str>
  <str name="q">Geckoplp4-M</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="3" start="0">
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816500</str>
 </doc>
 <doc>
  <str name="comments">toptrax recordings. Same tracks.
Geckoplp4-M</str>
  <str name="id">m2816544</str>
 </doc>
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2815903</str>
 </doc>
</result>
</response>

Now here's an example of a search for two documents that I know have
that string, but were not returned in the previous search:
http://localhost:9020/solr/select/?q=id%3Am2816615+OR+id%3Am2816611&version=2.2&start=0&rows=10&indent=on&fl=id,comments
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">1</int>
 <lst name="params">
  <str name="rows">10</str>
  <str name="start">0</str>
  <str name="indent">on</str>
  <str name="fl">id,comments</str>
  <str name="q">id:m2816615 OR id:m2816611</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="2" start="0">
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816611</str>
 </doc>
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816615</str>
 </doc>
</result>
</response>

Here is the definition for the comments field:
<field name="comments" type="text" indexed="true" stored="true"/>

And here is the definition for a text field:
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- in this example, we will only use synonyms at query time
  <filter class="solr.SynonymFilterFactory"
          synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
  -->
  <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1" catenateWords="1"
          catenateNumbers="1" catenateAll="0"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!--<filter class="solr.EnglishPorterFilterFactory"
          protected="protwords.txt"/>-->
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
  <analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory"
          synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1" catenateWords="0"
          catenateNumbers="0" catenateAll="0"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!--<filter class="solr.EnglishPorterFilterFactory"
          protected="protwords.txt"/>-->
  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldtype>

Any ideas? Am I doing something wrong?

thanks,
Kevin


Re: solr not finding all results

2007-10-12 Thread Kevin Lewandowski
Sorry, I've figured out my own problem. The way I create the XML
document for indexing was wrong, and as a result some of the comments
fields were not being listed correctly in the default search field,
content. A query with no field prefix searches content, which is why
those documents were missed.
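
In case it helps anyone who hits the same thing, here is a minimal sketch of building the update document with an XML library instead of by hand. It assumes the client itself fills in content (a copyField in the schema would be another way), and build_add_xml is just an illustrative name. The field names id, comments and content are the ones mentioned in this thread; nothing here is the actual indexing code.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc_id: str, comments: str) -> str:
    """Return a Solr <add><doc>...</doc></add> message as a string.

    Assumption: the comments text is also written to the default
    search field "content" by the client.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in (("id", doc_id), ("comments", comments), ("content", comments)):
        field = ET.SubElement(doc, "field", name=name)
        field.text = value  # ElementTree escapes &, < and > when serializing
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_xml("m2816615", "Geckoplp4-M"))
```

Letting the library do the serializing means characters like & or < in a comment cannot break the document structure.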
