Sorry, I've figured out my own problem. There was a bug in the way I
create the XML document for indexing, which caused some of the
"comments" fields not to be included correctly in the default search
field, "content".
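
For anyone who hits the same symptom, one quick check is to run the
same term once against the default search field and once against
"comments" directly. If the field-scoped query finds many more
documents, the text probably never made it into the default field.
Below is a rough sketch that only builds the two URLs. The host, port
and parameters are copied from the queries quoted below; the
select_url helper is a made-up name for illustration, not part of
Solr.

```python
from urllib.parse import urlencode

# Base URL taken from the queries quoted in this thread.
SOLR_SELECT = "http://localhost:9020/solr/select/"


def select_url(term, field=None, rows=10):
    """Build a select URL (hypothetical helper, not a Solr API).

    With field=None the term is searched in the default search field;
    otherwise the query is scoped to the named field.
    """
    q = f"{field}:{term}" if field else term
    params = {
        "q": q,
        "version": "2.2",
        "start": 0,
        "rows": rows,
        "indent": "on",
        "fl": "id,comments",
    }
    return SOLR_SELECT + "?" + urlencode(params)


# Same term, two targets: default field vs. the "comments" field.
default_url = select_url("Geckoplp4-M")
comments_url = select_url("Geckoplp4-M", field="comments")

print(default_url)
print(comments_url)
```

Fetching both URLs and comparing the numFound values should show
whether the two fields disagree. It does not say why they disagree;
in my case the cause was in the indexing script, not in the schema.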

On 10/12/07, Kevin Lewandowski <[EMAIL PROTECTED]> wrote:
> I've found an odd situation where solr is not returning all of the
> documents that I think it should. A search for "Geckoplp4-M" returns 3
> documents but I know that there are at least 100 documents with that
> string.
>
> Here is an example query for that phrase and the result set:
> http://localhost:9020/solr/select/?q=Geckoplp4-M&version=2.2&start=0&rows=10&indent=on&fl=comments,id
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader">
>  <int name="status">0</int>
>  <int name="QTime">0</int>
>  <lst name="params">
>   <str name="rows">10</str>
>   <str name="start">0</str>
>   <str name="indent">on</str>
>   <str name="fl">comments,id</str>
>   <str name="q">Geckoplp4-M</str>
>   <str name="version">2.2</str>
>  </lst>
> </lst>
> <result name="response" numFound="3" start="0">
>  <doc>
>   <str name="comments">Geckoplp4-M</str>
>   <str name="id">m2816500</str>
>  </doc>
>  <doc>
>   <str name="comments">toptrax recordings. Same tracks.
> Geckoplp4-M    </str>
>   <str name="id">m2816544</str>
>  </doc>
>  <doc>
>   <str name="comments">Geckoplp4-M</str>
>   <str name="id">m2815903</str>
>  </doc>
> </result>
> </response>
>
> Now here's an example of a search for two documents that I know have
> that string, but were not returned in the previous search:
> http://localhost:9020/solr/select/?q=id%3Am2816615+OR+id%3Am2816611&version=2.2&start=0&rows=10&indent=on&fl=id,comments
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader">
>  <int name="status">0</int>
>  <int name="QTime">1</int>
>  <lst name="params">
>   <str name="rows">10</str>
>   <str name="start">0</str>
>   <str name="indent">on</str>
>   <str name="fl">id,comments</str>
>   <str name="q">id:m2816615 OR id:m2816611</str>
>   <str name="version">2.2</str>
>  </lst>
> </lst>
> <result name="response" numFound="2" start="0">
>  <doc>
>   <str name="comments">Geckoplp4-M</str>
>   <str name="id">m2816611</str>
>  </doc>
>  <doc>
>   <str name="comments">Geckoplp4-M</str>
>   <str name="id">m2816615</str>
>  </doc>
> </result>
> </response>
>
> Here is the definition for the "comments" field:
> <field name="comments" type="text" indexed="true" stored="true"/>
>
> And here is the definition for a "text" field:
> <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>           <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>           <!-- in this example, we will only use synonyms at query time
>           <filter class="solr.SynonymFilterFactory"
> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>           -->
>           <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
>           <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0"/>
>           <filter class="solr.LowerCaseFilterFactory"/>
>           <!--<filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>-->
>           <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>           <filter class="solr.ISOLatin1AccentFilterFactory" />
>       </analyzer>
>       <analyzer type="query">
>           <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>           <filter class="solr.SynonymFilterFactory"
> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>           <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
>           <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0"/>
>           <filter class="solr.LowerCaseFilterFactory"/>
>           <!--<filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>-->
>           <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>           <filter class="solr.ISOLatin1AccentFilterFactory" />
>       </analyzer>
>     </fieldtype>
>
> Any ideas? Am I doing something wrong?
>
> thanks,
> Kevin
>
