Did you try adding a backslash to escape the "-" in Geckoplp4-M
(Geckoplp4\-M)?
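
In case it helps, here is a rough sketch of how the escaped query could be built from a script. The helper names (escape_query_chars, build_select_url) are made up for illustration, and the character list is my assumption based on the Lucene query syntax; the host, port and parameters are copied from your example URL.

```python
from urllib.parse import urlencode

# Characters the Lucene query parser treats as syntax.
# Assumption: this set follows the Lucene query syntax documentation.
SPECIAL_CHARS = set('+-!(){}[]^"~*?:\\&|')


def escape_query_chars(text):
    """Prefix each query-syntax character with a backslash (hypothetical helper)."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def build_select_url(term, base="http://localhost:9020/solr/select/"):
    """Build a select URL with the same parameters as the example query."""
    params = {
        "q": escape_query_chars(term),
        "version": "2.2",
        "start": 0,
        "rows": 10,
        "indent": "on",
        "fl": "comments,id",
    }
    # urlencode percent-encodes the backslash, so it survives the HTTP request.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("Geckoplp4-M"))
```

If the escaped query still returns only 3 documents, the "-" is probably not being interpreted as an operator and the cause is somewhere else.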


-----Original Message-----
From: Kevin Lewandowski [mailto:[EMAIL PROTECTED] 
Sent: Friday, October 12, 2007 9:40 PM
To: solr-user@lucene.apache.org
Subject: solr not finding all results

I've found an odd situation where solr is not returning all of the
documents that I think it should. A search for "Geckoplp4-M" returns 3
documents but I know that there are at least 100 documents with that
string.

Here is an example query for that phrase and the result set:
http://localhost:9020/solr/select/?q=Geckoplp4-M&version=2.2&start=0&rows=10&indent=on&fl=comments,id
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">0</int>
 <lst name="params">
  <str name="rows">10</str>
  <str name="start">0</str>
  <str name="indent">on</str>
  <str name="fl">comments,id</str>
  <str name="q">Geckoplp4-M</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="3" start="0">
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816500</str>
 </doc>
 <doc>
  <str name="comments">toptrax recordings. Same tracks.
Geckoplp4-M    </str>
  <str name="id">m2816544</str>
 </doc>
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2815903</str>
 </doc>
</result>
</response>

Now here's an example of a search for two documents that I know have
that string, but were not returned in the previous search:
http://localhost:9020/solr/select/?q=id%3Am2816615+OR+id%3Am2816611&version=2.2&start=0&rows=10&indent=on&fl=id,comments
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">1</int>
 <lst name="params">
  <str name="rows">10</str>
  <str name="start">0</str>
  <str name="indent">on</str>
  <str name="fl">id,comments</str>
  <str name="q">id:m2816615 OR id:m2816611</str>
  <str name="version">2.2</str>
 </lst>
</lst>
<result name="response" numFound="2" start="0">
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816611</str>
 </doc>
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816615</str>
 </doc>
</result>
</response>
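
To compare the two result sets without reading the XML by eye, something like the following could be used. This is only a sketch: summarize_response and missing_ids are hypothetical helper names, and the sample is a trimmed copy of the second response above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the id-lookup response (responseHeader omitted).
SAMPLE = """<response>
<result name="response" numFound="2" start="0">
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816611</str>
 </doc>
 <doc>
  <str name="comments">Geckoplp4-M</str>
  <str name="id">m2816615</str>
 </doc>
</result>
</response>"""


def summarize_response(xml_text):
    """Return (numFound, [ids]) from a Solr XML response (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip().encode("utf-8"))
    result = root.find("result")
    num_found = int(result.get("numFound"))
    ids = [s.text for s in result.findall("doc/str[@name='id']")]
    return num_found, ids


def missing_ids(expected, returned):
    """Ids that should have matched the search but were not returned."""
    return sorted(set(expected) - set(returned))


if __name__ == "__main__":
    found, ids = summarize_response(SAMPLE)
    print(found, ids)
    # Ids returned by the "Geckoplp4-M" search vs. ids known to contain it.
    print(missing_ids(ids, ["m2816500", "m2816544", "m2815903"]))
```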

Here is the definition for the "comments" field:
<field name="comments" type="text" indexed="true" stored="true"/>

And here is the definition for a "text" field:
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!--<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>-->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!--<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>-->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldtype>
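
Since the index and query analyzers are configured separately, it may be worth listing exactly where the two chains differ. The sketch below parses a copy of the field type (commented-out filters omitted) and reports the differences; diff_analyzers and filters_by_class are hypothetical helpers, and the script only compares configuration, it does not run Solr's actual analysis.

```python
import xml.etree.ElementTree as ET

# Copy of the "text" field type with the commented-out filters left out.
FIELDTYPE = """<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldtype>"""


def filters_by_class(fieldtype, analyzer_type):
    """Map each filter class in one analyzer chain to its attributes."""
    analyzer = fieldtype.find(f"analyzer[@type='{analyzer_type}']")
    return {f.get("class"): dict(f.attrib) for f in analyzer.findall("filter")}


def diff_analyzers(xml_text):
    """Report filters that exist on one side only or whose attributes differ."""
    root = ET.fromstring(xml_text)
    index = filters_by_class(root, "index")
    query = filters_by_class(root, "query")
    diffs = {}
    for cls in sorted(set(index) | set(query)):
        a, b = index.get(cls), query.get(cls)
        if a is None or b is None:
            # (index side, query side)
            diffs[cls] = ("missing" if a is None else "present",
                          "missing" if b is None else "present")
        elif a != b:
            keys = sorted(set(a) | set(b))
            diffs[cls] = {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
    return diffs


if __name__ == "__main__":
    for cls, diff in diff_analyzers(FIELDTYPE).items():
        print(cls, diff)
```

For this field type the differences are the catenateWords and catenateNumbers settings on the WordDelimiterFilterFactory and the synonym filter that only runs at query time.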

Any ideas? Am I doing something wrong?

thanks,
Kevin
