Hi Ethan, You'll probably get better answers about Solr specific stuff on the solr-u...@a.l.o list.
Check out PositionFilterFactory - it may address your issue: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PositionFilterFactory Steve > -----Original Message----- > From: Ethan Collins [mailto:collins.eth...@gmail.com] > Sent: Tuesday, July 13, 2010 3:42 AM > To: java-user@lucene.apache.org > Subject: ShingleFilter failing with more terms than index phrase > > I am using lucene 2.9.3 (via Solr 1.4.1) on windows and am trying to > understand ShingleFilter. I wrote the following code and find that if I > provide more words than the actual phrase indexed in the field, then the > search on that field fails (no score found with debugQuery=true). > > Here is an example to reproduce, with field names: > Id: 1 > title_1: Nina Simone > title_2: I put a spell on you > > Query (dismax) with: > - “Nina Simone I put” <- Fails i.e. no score shown from title_1 search > (using debugQuery) > - “Nina Simone” <- SUCCESS > > But, when I used Solr’s Field Analysis with the ‘shingle’ field (given > below) and tried “Nina Simone I put”, it succeeds. It’s only during the > query that no score is provided. I also checked ‘parsedquery’ and it shows > disjunctionMaxQuery issuing the string “Nina_Simone Simone_I I_put” to the > title_1 field. > > title_1 and title_2 fields are of type ‘shingle’, defined as: > > <fieldType name="shingle" class="solr.TextField" > positionIncrementGap="100" indexed="true" stored="true"> > <analyzer type="index"> > <tokenizer class="solr.StandardTokenizerFactory"/> > <filter class="solr.LowerCaseFilterFactory"/> > <filter class="solr.ShingleFilterFactory" > maxShingleSize="2" outputUnigrams="false"/> > </analyzer> > <analyzer type="query"> > <tokenizer class="solr.StandardTokenizerFactory"/> > <filter class="solr.LowerCaseFilterFactory"/> > <filter class="solr.ShingleFilterFactory" > maxShingleSize="2" outputUnigrams="false"/> > </analyzer> > </fieldType> > > Note that I also have a catchall field which is text. I have qf set > to: 'id^2 catchall' and pf set to: 'title_1^1.5 title_2^1.2' > > If I am missing something or doing something wrong please let me know. > > -Ethan > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org