RE: solr not finding all results
Did you try to add a backslash to escape the - in Geckoplp4-M (Geckoplp4\-M)?

-----Original Message-----
From: Kevin Lewandowski [mailto:[EMAIL PROTECTED]
Sent: Friday, October 12, 2007 9:40 PM
To: solr-user@lucene.apache.org
Subject: solr not finding all results

I've found an odd situation where solr is not returning all of the
documents that I think it should. A search for Geckoplp4-M returns 3
documents but I know that there are at least 100 documents with that
string.

Here is an example query for that phrase and the result set:

http://localhost:9020/solr/select/?q=Geckoplp4-M&version=2.2&start=0&rows=10&indent=on&fl=comments,id

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
 <lst name="params">
  <str name="rows">10</str>
  <str name="start">0</str>
  <str name="indent">on</str>
  <str name="fl">comments,id</str>
  <str name="q">Geckoplp4-M</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="3" start="0">
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816500</str>
 </doc>
 <doc>
  <str name="comments">toptrax recordings. Same tracks. Geckoplp4-M</str>
  <str name="id">m2816544</str>
 </doc>
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2815903</str>
 </doc>
</result>
</response>

Now here's an example of a search for two documents that I know have
that string, but were not returned in the previous search:

http://localhost:9020/solr/select/?q=id%3Am2816615+OR+id%3Am2816611&version=2.2&start=0&rows=10&indent=on&fl=id,comments

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">1</int>
 <lst name="params">
  <str name="rows">10</str>
  <str name="start">0</str>
  <str name="indent">on</str>
  <str name="fl">id,comments</str>
  <str name="q">id:m2816615 OR id:m2816611</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="2" start="0">
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816611</str>
 </doc>
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816615</str>
 </doc>
</result>
</response>

Here is the definition for the comments field:

<field name="comments" type="text" indexed="true" stored="true"/>

And here is the definition for a text field:

<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!--<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>-->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!--<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>-->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory" />
  </analyzer>
</fieldtype>

Any ideas? Am I doing something wrong?

thanks,
Kevin
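
For anyone who wants to try the backslash suggestion from a script, here is a minimal Python sketch. The helper names escape_term and select_url are made up for illustration and are not part of Solr or any client library. The character list follows the Lucene query syntax documentation, and whether escaping changes the result for a hyphen in the middle of a term will depend on the parser and analyzer in use.

```python
from urllib.parse import urlencode

# Single characters the Lucene query parser treats as syntax. The two-character
# operators && and || are not handled by this sketch.
LUCENE_SPECIAL = set('\\+-!(){}[]^"~*?:')


def escape_term(term: str) -> str:
    """Prefix each query-syntax character in `term` with a backslash."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in term)


def select_url(base: str, query: str, fields: str, rows: int = 10) -> str:
    """Build a select URL with '&'-separated, percent-encoded parameters."""
    params = {
        "q": query,
        "version": "2.2",
        "start": 0,
        "rows": rows,
        "indent": "on",
        "fl": fields,
    }
    return base + "?" + urlencode(params)


# Host, port and field names are the ones quoted in the thread.
escaped = escape_term("Geckoplp4-M")
print(escaped)
print(select_url("http://localhost:9020/solr/select/", escaped, "comments,id"))
```

Letting urlencode assemble the query string also keeps the parameters correctly separated and the backslash percent-encoded (%5C), which is easy to get wrong when pasting URLs by hand.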
Re: solr not finding all results
Sorry, I've figured out my own problem. A bug in the way I create the XML
document for indexing was causing some of the comments fields not to be
included correctly in the default search field, content.

On 10/12/07, Kevin Lewandowski [EMAIL PROTECTED] wrote:
> I've found an odd situation where solr is not returning all of the
> documents that I think it should. A search for Geckoplp4-M returns 3
> documents but I know that there are at least 100 documents with that
> string.
> [...]
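
One way to spot this kind of mismatch is to compare the hit count for the bare term, which runs against the default search field (content in this setup), with the count for the same term qualified by the stored field, using the standard field:term syntax. The Python sketch below only builds the two URLs and reads numFound from a response; count_url and num_found are hypothetical helper names, and the sample response is a cut-down version of the one posted in the thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE = "http://localhost:9020/solr/select/"  # host and port from the thread


def count_url(query: str) -> str:
    """URL that asks only for the hit count (rows=0 returns no documents)."""
    return BASE + "?" + urlencode({"q": query, "version": "2.2", "rows": 0})


def num_found(response_xml: str) -> int:
    """Read the numFound attribute from a Solr XML response."""
    result = ET.fromstring(response_xml).find("result")
    return int(result.get("numFound"))


# Bare term: searched in the default field. Prefixed term: searched in comments.
default_field_url = count_url("Geckoplp4-M")
comments_field_url = count_url("comments:Geckoplp4-M")
print(default_field_url)
print(comments_field_url)

# Cut-down sample of the response shown earlier in the thread.
sample = '<response><result name="response" numFound="3" start="0"/></response>'
print(num_found(sample))
```

If the two counts differ widely, the documents are probably indexed but the term never reached the default field, which points at the indexing side rather than at the query.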