Re: Phrase Exact Match with Margin of Error
Agreed, that's the challenge. The ComplexPhraseQuery parser needs the terms to be analyzed/tokenized; if they aren't, it can't operate on individual tokens with fuzzy or wildcard matches. The solution I can think of is to execute the query against both fields, the KeywordTokenized one and the non-KeywordTokenized one, and then boost the KeywordTokenized field higher.

On Thu, Jun 15, 2017 at 1:20 PM, Max Bridgewater wrote:
> KeywordTokenizer allows that. But complexphrase does not seem to work
> with KeywordTokenizer.
Re: Phrase Exact Match with Margin of Error
I think that's because the KeywordTokenizer by definition produces a single token (not a phrase). Perhaps you could create two fields with a copyField: the one you already have (field1), and one tokenized using StandardTokenizer or WhitespaceTokenizer (field2), which will produce a phrase with multiple tokens. Then construct a query that searches field1 for an exact match and field2 using the ComplexPhraseQueryParser (use the localparams syntax to combine them). Boost field1 (the exact match).

HTH
-Simon

On Thu, Jun 15, 2017 at 1:20 PM, Max Bridgewater wrote:
> KeywordTokenizer allows that. But complexphrase does not seem to work
> with KeywordTokenizer.
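The two-field approach suggested in the replies above could be sketched roughly as follows. This is only an illustration under assumptions: the field names subject_exact (KeywordTokenizer) and subject_text (tokenized copyField target), the boost value, and the helper name build_params are all hypothetical and do not come from the thread. It uses Solr's nested-query syntax (_query_ with local params) and parameter dereferencing (v=$name); whether the scoring behaves as wanted would need checking against a real index.

```python
from urllib.parse import urlencode


def build_params(exact_phrase, fuzzy_phrase,
                 exact_field="subject_exact",   # hypothetical KeywordTokenizer field
                 text_field="subject_text",     # hypothetical tokenized copyField target
                 exact_boost=10):               # arbitrary illustrative boost
    """Build request parameters that query both fields and boost the exact one."""
    # Clause 1: exact match on the single-token field via the field parser.
    # Clause 2: complexphrase on the tokenized field, allowing per-term fuzziness.
    q = (
        f'_query_:"{{!field f={exact_field} v=$exact}}"^{exact_boost} '
        f'OR _query_:"{{!complexphrase inOrder=true v=$fuzzy}}"'
    )
    return {
        "q": q,
        "exact": exact_phrase,
        "fuzzy": f'{text_field}:"{fuzzy_phrase}"',
    }


params = build_params(
    "Bridge the gap between your skills and your goals",
    "Bridge the gat~1 between your skills and your goals",
)
# Print the query string that would be appended to the collection's /select handler.
print("select?" + urlencode(params))
```

Passing the phrases as separate request parameters (exact, fuzzy) avoids having to escape the inner double quotes inside the nested query strings.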
Re: Phrase Exact Match with Margin of Error
Thanks Susheel. The challenge is that if I search for the word "between" alone, I still get plenty of results. In a way, I want the query to match the document title exactly (up to a few characters) and the document title to match the query exactly (up to a few characters). KeywordTokenizer allows that, but complexphrase does not seem to work with KeywordTokenizer.

On Thu, Jun 15, 2017 at 10:23 AM, Susheel Kumar wrote:
> ComplexPhraseQuery parser is what you need to look at
Re: Phrase Exact Match with Margin of Error
The ComplexPhraseQuery parser is what you need to look at:

https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

See below for an example:

http://localhost:8983/solr/techproducts/select?debugQuery=on=on=manu:%22Bridge%20the%20gat~1%20between%20your%20skills%20and%20your%20goals%22=complexphrase

On Thu, Jun 15, 2017 at 5:59 AM, Max Bridgewater wrote:
> q=subjet:"Bridge the gat between your skills and your goals"~0.8
>
> But this doesn't quite do what I am trying to achieve.
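Some parameter names in the archived URL above appear to have been lost. As a rough sketch, a request of that shape could be rebuilt as below, assuming the query was passed as q and the parser was selected with defType=complexphrase (both are standard Solr request parameters, but their use here is a guess at what the original URL contained). The host, collection, and manu field are taken from the URL as archived.

```python
from urllib.parse import urlencode, quote

# Base select handler, copied from the archived URL.
BASE = "http://localhost:8983/solr/techproducts/select"

params = {
    "debugQuery": "on",
    # Fuzzy term (gat~1) inside a quoted phrase, which complexphrase supports.
    "q": 'manu:"Bridge the gat~1 between your skills and your goals"',
    # Assumption: the parser was chosen through defType.
    "defType": "complexphrase",
}

# quote_via=quote encodes spaces as %20 (as in the archived URL) instead of '+'.
url = BASE + "?" + urlencode(params, quote_via=quote)
print(url)
```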
Phrase Exact Match with Margin of Error
Hi,

I am trying to do phrase exact match. For this, I use KeywordTokenizerFactory. This basically does what I want to do. My field type is defined as follows:

In addition to this, I want to tolerate typos of two or three letters. I thought fuzzy search could allow me to accept this margin of error. But this doesn't seem to work.

A typical query I would have is:

q=subjet:"Bridge the gap between your skills and your goals"

Now, in this query, if I replace gap with gat, I was hoping I could do something such as:

q=subjet:"Bridge the gat between your skills and your goals"~0.8

But this doesn't quite do what I am trying to achieve.

Any suggestion?
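For context on the "margin of error": in the standard Lucene/Solr query syntax, a ~ placed after a quoted phrase is read as proximity slop rather than fuzziness, and fuzzy term matching is expressed as an edit distance, which Lucene caps at 2. So a tolerance of three letters would be outside what a single fuzzy term can express. The sketch below is plain Python, not Solr code, and only illustrates how far apart the two example titles are. It uses classic Levenshtein distance; Lucene's default fuzzy matching additionally counts a transposition of adjacent characters as one edit.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


indexed = "Bridge the gap between your skills and your goals"
typed = "Bridge the gat between your skills and your goals"

# The two titles differ by one substituted character.
print(edit_distance(indexed, typed))
```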