Re: Help with explain query syntax

2011-03-22 Thread Glòria Martínez

Thank you very much!

Re: Help with explain query syntax

2011-03-08 Thread Yonik Seeley

It's probably the WordDelimiterFilter:

> org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal:
> 1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 0
> generateWordParts: 1 catenateAll: 0 catenateNumbers: 0 }

Get rid of the preserveOriginal="1" in the query analyzer.

-Yonik
http://lucidimagination.com
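
As an illustration of the setting discussed above, here is a minimal sketch of how preserveOriginal changes the query-side token stream. The `word_delimiter` function is a hypothetical toy stand-in, not Solr's WordDelimiterFilter: it only splits on non-alphanumeric characters and ignores case-change splitting and catenation, and it assumes the preserved original shares its position with the first part.

```python
import re

def word_delimiter(token, preserve_original):
    """Toy stand-in for WordDelimiterFilter applied to one token.

    Returns (term, position) pairs. Only splitting on non-alphanumeric
    characters is modelled; case-change splitting and catenation are omitted.
    """
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", token) if p]
    out = []
    if preserve_original and parts != [token]:
        # Assumption: the original token shares position 1 with the first part.
        out.append((token, 1))
    out.extend((part, pos) for pos, part in enumerate(parts, start=1))
    return out

with_original = word_delimiter("linguajob.pl", preserve_original=True)
without_original = word_delimiter("linguajob.pl", preserve_original=False)
print(with_original)     # original and first part both at position 1
print(without_original)  # parts only
```

With preserveOriginal on, position 1 holds two alternatives, which corresponds to the "(linguajob.pl linguajob) pl" shape in the parsed query; with it off, the sketch yields the plain two-term sequence linguajob, pl.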

On Tue, Mar 1, 2011 at 9:01 AM, Glòria Martínez wrote:
> Hello,
>
> I can't understand why this query is not matching anything. Could someone
> help me please?
>
> *Query*
> http://localhost:8894/solr/select?q=linguajob.pl&qf=company_name&wt=xml&qt=dismax&debugQuery=on&explainOther=id%3A1
>
> responseHeader:
>   status: 0
>   QTime: 12
>   params:
>     explainOther: id:1
>     debugQuery: on
>     q: linguajob.pl
>     qf: company_name
>     wt: xml
>     qt: dismax
> debug:
>   rawquerystring: linguajob.pl
>   querystring: linguajob.pl
>   parsedquery:
>     +DisjunctionMaxQuery((company_name:"(linguajob.pl linguajob) pl")~0.01) ()
>   parsedquery_toString:
>     +(company_name:"(linguajob.pl linguajob) pl")~0.01 ()
>   otherQuery: id:1
>   explainOther:
>
> 0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited
> clause(s)
>  0.0 = no match on required clause (company_name:"(linguajob.pl linguajob)
> pl") *<- What does this syntax (field:"(token1 token2) token3") mean?*
>    0.0 = (NON-MATCH) fieldWeight(company_name:"(linguajob.pl linguajob) pl"
> in 0), product of:
>      0.0 = tf(phraseFreq=0.0)
>      1.6137056 = idf(company_name:"(linguajob.pl linguajob) pl")
>      0.4375 = fieldNorm(field=company_name, doc=0)
>   QParser: DisMaxQParser
>   ...
>
> There's only one document indexed:
>
> *Document*
> http://localhost:8894/solr/select?q=1&qf=id&wt=xml&qt=dismax
> responseHeader:
>   status: 0
>   QTime: 2
>   params:
>     qf: id
>     wt: xml
>     qt: dismax
>     q: 1
> doc:
>   LinguaJob.pl
>   1
>   6
>   2011-03-01T11:14:24.553Z
>
> *Solr Admin Schema*
> Field: company_name
> Field Type: text
> Properties: Indexed, Tokenized, Stored
> Schema: Indexed, Tokenized, Stored
> Index: Indexed, Tokenized, Stored
>
> Position Increment Gap: 100
>
> Index Analyzer: org.apache.solr.analysis.TokenizerChain Details
> Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:
> schema.UnicodeNormalizationFilterFactory args:{composed: false
> remove_modifiers: true fold: true version: java6 remove_diacritics: true }
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal:
> 1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 1
> generateWordParts: 1 catenateAll: 0 catenateNumbers: 1 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
>
> Query Analyzer: org.apache.solr.analysis.TokenizerChain Details
> Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:
> schema.UnicodeNormalizationFilterFactory args:{composed: false
> remove_modifiers: true fold: true version: java6 remove_diacritics: true }
> org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt
> expand: true ignoreCase: true }
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt
> ignoreCase: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal:
> 1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 0
> generateWordParts: 1 catenateAll: 0 catenateNumbers: 0 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
>
> Docs: 1
> Distinct: 5
> Top 5 terms
> term frequency
> lingua 1
> linguajob.pl 1
> linguajobpl 1
> pl 1
> job 1
>
> *Solr Analysis*
> Field name: company_name
> Field value (Index): LinguaJob.pl
> Field value (Query): linguajob.pl
>
> *Index Analyzer*
>
> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
> term position 1
> term text LinguaJob.pl
> term type word
> source start,end 0,12
> payload
>
> schema.UnicodeNormalizationFilterFactory {composed=false,
> remove_modifiers=true, fold=true, version=java6, remove_diacritics=true}
> term position 1
> term text LinguaJob.pl
> term type word
> source start,end 0,12
> payload
>
> org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
> ignoreCase=true, enablePositionIncrements=true}
> term position 1
> term text LinguaJob.pl
> term type word
> source start,end 0,12
> payload
>
> org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=1,
> splitOnCaseChange=1, generateNumberParts=1, catenateWords=1,
> generateWordParts=1, catenateAll=0, catenateNumbers=1}
> term position     1 | 2 | 3
> term text         LinguaJob.pl | Job | pl
>                   Lingua | LinguaJobpl
> term type         word | word | word
>                   word | word
> source start,end  0,12 | 6,9 | 10,12
>                   0,6 | 0,12
> payload
>
> org.apache.solr.analysis.LowerCaseFilterFactory {}
> term position     1 | 2 | 3
> term text         linguajob.pl | job | pl
>                   lingua | linguajobpl
> term type         word | word | word
>                   word | word
> source start,end  0,12 | 6,9 | 10,12
>                   0,6 | 0,12
> payload
>
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
> term posi

Re: Help with explain query syntax

2011-03-08 Thread Chris Hostetter


: +DisjunctionMaxQuery((company_name:"(linguajob.pl linguajob) pl")~0.01) ()

you can see the crux of your problem in this query string

it seems you have a query time synonym in place to *expand* linguajob.pl 
into [linguajob.pl] and [linguajob] [pl] but query time synonym expansion 
of multiword queries doesn't work -- what it is ultimately requiring is
that a doc contain "linguajob.pl" and "linguajob" at the same term 
position, followed by "pl"

this is not what you have indexed.
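
The position requirement described above can be sketched as a small check. This is only an illustration, not Lucene's implementation: `multi_phrase_matches` is a hypothetical helper, and the indexed positions are read from the analysis output quoted earlier in the thread (LinguaJob.pl, Job, pl at positions 1, 2, 3 after lowercasing). The positions given to lingua and linguajobpl are assumptions, since the flattened table does not show their columns.

```python
def multi_phrase_matches(index_positions, phrase):
    """Exact (slop 0) multi-phrase check against one field value.

    index_positions: term -> set of positions where the term occurs.
    phrase: list of sets of alternative terms, one set per consecutive position.
    """
    starts = set()
    for term in phrase[0]:
        starts |= index_positions.get(term, set())
    for start in starts:
        if all(
            any(start + offset in index_positions.get(term, set()) for term in alts)
            for offset, alts in enumerate(phrase)
        ):
            return True
    return False

indexed = {
    "linguajob.pl": {1},
    "job": {2},
    "pl": {3},
    "lingua": {1},       # assumed position
    "linguajobpl": {3},  # assumed position
}

# company_name:"(linguajob.pl linguajob) pl"
query = [{"linguajob.pl", "linguajob"}, {"pl"}]
matched = multi_phrase_matches(indexed, query)
print(matched)  # prints False
```

The only indexed alternative for the first slot is linguajob.pl at position 1, and pl sits at position 3 rather than 2, so the phrase fails whatever positions the two assumed tokens actually have.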

This type of specific example is warned against on the wiki...

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory


-Hoss
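
For the explain output quoted earlier in the thread, the numbers can be reproduced with a short sketch. It assumes Lucene's classic DefaultSimilarity formulas (idf = 1 + ln(numDocs / (docFreq + 1)), tf = sqrt(freq)) and that a phrase's idf is the sum over its terms. The document frequencies are taken from the top-terms list in the original message, with linguajob assumed to have frequency 0 because it is not listed.

```python
import math

def classic_idf(doc_freq, num_docs):
    # Assumed classic Lucene formula: 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))

num_docs = 1  # "Docs: 1" in the schema browser output
doc_freqs = {"linguajob.pl": 1, "linguajob": 0, "pl": 1}

# Phrase idf as the sum of the per-term idf values.
phrase_idf = sum(classic_idf(df, num_docs) for df in doc_freqs.values())

tf = math.sqrt(0.0)   # phraseFreq=0.0 in the explain output
field_norm = 0.4375   # fieldNorm reported in the explain output
score = tf * phrase_idf * field_norm

print(phrase_idf, score)
```

Under these assumptions the summed idf agrees with the reported 1.6137056 to within rounding, and the zero phrase frequency forces the fieldWeight product to 0.0.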
