I am using lucene 2.9.3 (via Solr 1.4.1) on windows and am trying to understand ShingleFilter. I wrote the following code and find that if I provide more words than the actual phrase indexed in the field, then the search on that field fails (no score found with debugQuery=true).
Here is an example to reproduce, with field names: Id: 1 title_1: Nina Simone title_2: I put a spell on you Query (dismax) with: - “Nina Simone I put” <- Fails i.e. no score shown from title_1 search (using debugQuery) - “Nina Simone” <- SUCCESS But, when I used Solr’s Field Analysis with the ‘shingle’ field (given below) and tried “Nina Simone I put”, it succeeds. It’s only during the query that no score is provided. I also checked ‘parsedquery’ and it shows disjunctionMaxQuery issuing the string “Nina_Simone Simone_I I_put” to the title_1 field. title_1 and title_2 fields are of type ‘shingle’, defined as: <fieldType name="shingle" class="solr.TextField" positionIncrementGap="100" indexed="true" stored="true"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="false"/> </analyzer> <analyzer type="query"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="false"/> </analyzer> </fieldType> Note that I also have a catchall field which is text. I have qf set to: 'id^2 catchall' and pf set to: 'title_1^1.5 title_2^1.2' If I am missing something or doing something wrong please let me know. -Ethan --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org