Re: Help with explain query syntax
Thank you very much!

On Wed, Mar 9, 2011 at 2:01 AM, Yonik Seeley wrote:
> It's probably the WordDelimiterFilter:
>
> > org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal: 1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 catenateNumbers: 0 }
>
> Get rid of the preserveOriginal="1" in the query analyzer.
>
> -Yonik
> http://lucidimagination.com
Re: Help with explain query syntax
It's probably the WordDelimiterFilter:

> org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal: 1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 catenateNumbers: 0 }

Get rid of the preserveOriginal="1" in the query analyzer.

-Yonik
http://lucidimagination.com

On Tue, Mar 1, 2011 at 9:01 AM, Glòria Martínez wrote:
> Hello,
>
> I can't understand why this query is not matching anything. Could someone
> help me please?
>
> *Query*
> http://localhost:8894/solr/select?q=linguajob.pl&qf=company_name&wt=xml&qt=dismax&debugQuery=on&explainOther=id%3A1
>
> -
> 0
> 12
> -
> id:1
> on
> linguajob.pl
> company_name
> xml
> dismax
>
> -
> linguajob.pl
> linguajob.pl
> -
> +DisjunctionMaxQuery((company_name:"(linguajob.pl linguajob) pl")~0.01) ()
> -
> +(company_name:"(linguajob.pl linguajob) pl")~0.01 ()
>
> id:1
> -
> -
> 0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited clause(s)
>   0.0 = no match on required clause (company_name:"(linguajob.pl linguajob) pl") *<- What does this syntax (field:"(token1 token2) token3") mean?*
>     0.0 = (NON-MATCH) fieldWeight(company_name:"(linguajob.pl linguajob) pl" in 0), product of:
>       0.0 = tf(phraseFreq=0.0)
>       1.6137056 = idf(company_name:"(linguajob.pl linguajob) pl")
>       0.4375 = fieldNorm(field=company_name, doc=0)
>
> DisMaxQParser
>
> +
> ...
>
> There's only one document indexed:
>
> *Document*
> http://localhost:8894/solr/select?q=1&qf=id&wt=xml&qt=dismax
>
> -
> 0
> 2
> -
> id
> xml
> dismax
> 1
>
> -
> -
> LinguaJob.pl
> 1
> 6
> 2011-03-01T11:14:24.553Z
>
> *Solr Admin Schema*
> Field: company_name
> Field Type: text
> Properties: Indexed, Tokenized, Stored
> Schema: Indexed, Tokenized, Stored
> Index: Indexed, Tokenized, Stored
>
> Position Increment Gap: 100
>
> Index Analyzer: org.apache.solr.analysis.TokenizerChain Details
> Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:
> schema.UnicodeNormalizationFilterFactory args:{composed: false remove_modifiers: true fold: true version: java6 remove_diacritics: true }
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal: 1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 1 generateWordParts: 1 catenateAll: 0 catenateNumbers: 1 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
>
> Query Analyzer: org.apache.solr.analysis.TokenizerChain Details
> Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:
> schema.UnicodeNormalizationFilterFactory args:{composed: false remove_modifiers: true fold: true version: java6 remove_diacritics: true }
> org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt expand: true ignoreCase: true }
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt ignoreCase: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal: 1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 catenateNumbers: 0 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
>
> Docs: 1
> Distinct: 5
> Top 5 terms
> term          frequency
> lingua        1
> linguajob.pl  1
> linguajobpl   1
> pl            1
> job           1
>
> *Solr Analysis*
> Field name: company_name
> Field value (Index): LinguaJob.pl
> Field value (Query): linguajob.pl
>
> *Index Analyzer*
>
> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
> term position     1
> term text         LinguaJob.pl
> term type         word
> source start,end  0,12
> payload
>
> schema.UnicodeNormalizationFilterFactory {composed=false, remove_modifiers=true, fold=true, version=java6, remove_diacritics=true}
> term position     1
> term text         LinguaJob.pl
> term type         word
> source start,end  0,12
> payload
>
> org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt, ignoreCase=true, enablePositionIncrements=true}
> term position     1
> term text         LinguaJob.pl
> term type         word
> source start,end  0,12
> payload
>
> org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=1, splitOnCaseChange=1, generateNumberParts=1, catenateWords=1, generateWordParts=1, catenateAll=0, catenateNumbers=1}
> term position     1 2 3
> term text         LinguaJob.pl Job pl
>                   Lingua LinguaJobpl
> term type         word word word
>                   word word
> source start,end  0,12 6,9 10,12
>                   0,6 0,12
> payload
>
> org.apache.solr.analysis.LowerCaseFilterFactory {}
> term position     1 2 3
> term text         linguajob.pl job pl
>                   lingua linguajobpl
> term type         word word word
>                   word word
> source start,end  0,12 6,9 10,12
>                   0,6 0,12
> payload
>
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
> term posi
Re: Help with explain query syntax
: +DisjunctionMaxQuery((company_name:"(linguajob.pl linguajob) pl")~0.01) ()

you can see the crux of your problem in this query string

it seems you have a query time synonym in place to *expand* linguajob.pl
into [linguajob.pl] and [linguajob] [pl] but query time synonym expansion
of multiword queries doesn't work -- what it is ultimately requiring is
that a doc contain "linguajob.pl" and "linguajob" at the same term
position, followed by "pl"

this is not what you have indexed.

This type of specific example is warned against on the wiki...

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

-Hoss
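
To make the position requirement above concrete, here is a minimal Python sketch of how a phrase such as company_name:"(linguajob.pl linguajob) pl" could be checked against term positions. It is a toy in-memory model, not Solr or Lucene code: the names INDEXED_POSITIONS, QUERY_SLOTS, slot_positions and phrase_matches are hypothetical. It assumes the parenthesised terms are alternatives sharing one query position (the way Lucene's MultiPhraseQuery treats them); under the stricter reading where both terms must be present the outcome is the same, because "linguajob" is not among the indexed terms. The positions for "linguajob.pl", "job" and "pl" follow the analysis output quoted in the thread; the positions for "lingua" and "linguajobpl" are assumptions and do not affect the result.

```python
# Toy model of the single indexed document (NOT Solr/Lucene code).
# Terms are the five listed under "Top 5 terms" in the thread.
# Positions of "lingua" and "linguajobpl" are assumed; they are not
# part of the query, so they cannot change the outcome.
INDEXED_POSITIONS = {
    "linguajob.pl": {1},
    "lingua": {1},       # assumed position
    "job": {2},
    "pl": {3},
    "linguajobpl": {3},  # assumed position
}

# company_name:"(linguajob.pl linguajob) pl"
# Each inner list is one query position; terms in the same list are
# treated as alternatives at that position.
QUERY_SLOTS = [["linguajob.pl", "linguajob"], ["pl"]]


def slot_positions(index, terms):
    """Return every position where at least one of the terms occurs."""
    found = set()
    for term in terms:
        found |= index.get(term, set())
    return found


def phrase_matches(index, slots):
    """True if, for some start, slot i is satisfied at position start + i."""
    starts = slot_positions(index, slots[0])
    for offset, terms in enumerate(slots[1:], start=1):
        positions = slot_positions(index, terms)
        starts = {s for s in starts if s + offset in positions}
    return bool(starts)


if __name__ == "__main__":
    # "linguajob.pl" is at position 1, but "pl" is at 3, not 2.
    print(phrase_matches(INDEXED_POSITIONS, QUERY_SLOTS))  # False
```

In this model the first slot is satisfied at position 1, but "pl" would have to sit at position 2, where the index holds "job" instead, so the phrase frequency is zero, consistent with tf(phraseFreq=0.0) in the explain output.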