Sorry, I've figured out my own problem. The way I was building the XML document for indexing was causing some of the "comments" fields not to be included correctly in the default search field, "content". Since my query had no field prefix, it only searched "content", so those documents never matched.
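For anyone who hits the same thing, here is a rough sketch of the kind of fix I mean, not my actual indexing code. It builds a Solr `<add>` message and makes sure the text of "comments" also lands in "content". The function name `build_add_xml` and its `default_field` / `copy_fields` parameters are made up for illustration, and it assumes "content" is simply meant to hold the concatenated text of the listed fields.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs, default_field="content", copy_fields=("comments",)):
    """Build a Solr <add> message from a list of dicts (hypothetical helper).

    Every key/value pair becomes a <field>. The values of `copy_fields`
    are also joined into `default_field`, so that an unqualified query
    (no "field:" prefix) can match them.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = value
        # Collect the text that should be searchable by default.
        parts = [doc[name] for name in copy_fields if doc.get(name)]
        if parts:
            content = ET.SubElement(doc_el, "field", name=default_field)
            content.text = " ".join(parts)
    # ElementTree escapes &, < and > in field text when serializing.
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_xml([{"id": "m2816611", "comments": "Geckoplp4-M"}]))
```

A quick way to check for this kind of problem is to compare q=Geckoplp4-M with q=comments:Geckoplp4-M. If the second returns more documents, the default field is missing data.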
On 10/12/07, Kevin Lewandowski <[EMAIL PROTECTED]> wrote:
> I've found an odd situation where solr is not returning all of the
> documents that I think it should. A search for "Geckoplp4-M" returns 3
> documents but I know that there are at least 100 documents with that
> string.
>
> Here is an example query for that phrase and the result set:
> http://localhost:9020/solr/select/?q=Geckoplp4-M&version=2.2&start=0&rows=10&indent=on&fl=comments,id
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">0</int>
> <lst name="params">
> <str name="rows">10</str>
> <str name="start">0</str>
> <str name="indent">on</str>
> <str name="fl">comments,id</str>
> <str name="q">Geckoplp4-M</str>
> <str name="version">2.2</str>
> </lst>
> </lst>
> <result name="response" numFound="3" start="0">
> <doc>
> <str name="comments">Geckoplp4-M</str>
> <str name="id">m2816500</str>
> </doc>
> <doc>
> <str name="comments">toptrax recordings. Same tracks.
> Geckoplp4-M </str>
> <str name="id">m2816544</str>
> </doc>
> <doc>
> <str name="comments">Geckoplp4-M</str>
> <str name="id">m2815903</str>
> </doc>
> </result>
> </response>
>
> Now here's an example of a search for two documents that I know have
> that string, but were not returned in the previous search:
> http://localhost:9020/solr/select/?q=id%3Am2816615+OR+id%3Am2816611&version=2.2&start=0&rows=10&indent=on&fl=id,comments
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">1</int>
> <lst name="params">
> <str name="rows">10</str>
> <str name="start">0</str>
> <str name="indent">on</str>
> <str name="fl">id,comments</str>
> <str name="q">id:m2816615 OR id:m2816611</str>
> <str name="version">2.2</str>
> </lst>
> </lst>
> <result name="response" numFound="2" start="0">
> <doc>
> <str name="comments">Geckoplp4-M</str>
> <str name="id">m2816611</str>
> </doc>
> <doc>
> <str name="comments">Geckoplp4-M</str>
> <str name="id">m2816615</str>
> </doc>
> </result>
> </response>
>
> Here is the definition for the "comments" field:
> <field name="comments" type="text" indexed="true" stored="true"/>
>
> And here is the definition for a "text" field:
> <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
> <analyzer type="index">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <!-- in this example, we will only use synonyms at query time
> <filter class="solr.SynonymFilterFactory"
> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
> -->
> <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <!--<filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>-->
> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> <filter class="solr.ISOLatin1AccentFilterFactory" />
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.SynonymFilterFactory"
> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> <!--<filter class="solr.StopFilterFactory" ignoreCase="true"/>-->
> <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <!--<filter class="solr.EnglishPorterFilterFactory"
> protected="protwords.txt"/>-->
> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> <filter class="solr.ISOLatin1AccentFilterFactory" />
> </analyzer>
> </fieldtype>
>
> Any ideas? Am I doing something wrong?
>
> thanks,
> Kevin
>