Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Susheel Kumar
Agreed, that's the challenge. The ComplexPhraseQuery parser needs the terms
analyzed/tokenized; if they aren't, it can't operate on individual tokens
with fuzzy or wildcard matches. The solution I can think of is to execute
the query against both fields (the KeywordTokenized field and the
non-KeywordTokenized field) and then boost the KeywordTokenized field
higher...

On Thu, Jun 15, 2017 at 1:20 PM, Max Bridgewater 
wrote:

> Thanks Susheel. The challenge is that if I search for the word "between"
> alone, I still get plenty of results. In a way I want the query to  match
> the document title exactly (up to a few characters) and the document title
> match the query exactly (up to a few characters). KeywordTokenizer allows
> that. But complexphrase does not seem to work with KeywordTokenizer.


Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread simon
I think that's because the KeywordTokenizer by definition produces a single
token (not a phrase).

Perhaps you could create two fields with a copyField: the one you already
have (field1), and one tokenized using StandardTokenizer or
WhitespaceTokenizer (field2), which will produce a phrase with multiple
tokens. Then construct a query which searches field1 for an exact match and
field2 using the ComplexPhraseQueryParser (use the localparams syntax), and
combine the two. Boost field1 (the exact match).

HTH

-Simon
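
A minimal sketch of how such a combined query string might be assembled, assuming the standard query parser is the main parser and the complexphrase clause is embedded through the `_query_` nested-query syntax. The field names `field1` and `field2` come from the message above; the boost value of 10 and the helper names are purely illustrative.

```python
def escape_quotes(text):
    """Backslash-escape backslashes and double quotes for use inside "..."."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def combined_query(exact_field, tokenized_field, exact_phrase, fuzzy_phrase, boost=10):
    """Return one query string with a boosted exact clause and a fuzzy clause.

    exact_phrase: the user's text as typed (matched as one keyword token).
    fuzzy_phrase: the same text with per-term fuzzy markers, e.g. "gat~1".
    """
    # Clause 1: whole-title match on the keyword-tokenized field, boosted.
    exact_clause = f'{exact_field}:"{escape_quotes(exact_phrase)}"^{boost}'
    # Clause 2: complexphrase parser on the tokenized field, via local params.
    nested = f'{{!complexphrase inOrder=true}}{tokenized_field}:"{fuzzy_phrase}"'
    fuzzy_clause = f'_query_:"{escape_quotes(nested)}"'
    return f"{exact_clause} OR {fuzzy_clause}"


query = combined_query(
    "field1",
    "field2",
    "Bridge the gat between your skills and your goals",
    "Bridge the gat~1 between your skills and your goals",
)
print(query)
```

The resulting string would be sent as the `q` parameter. The exact clause only contributes when the title matches character for character, so a document that matches both clauses ranks above one that matches the fuzzy clause alone.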

On Thu, Jun 15, 2017 at 1:20 PM, Max Bridgewater 
wrote:

> Thanks Susheel. The challenge is that if I search for the word "between"
> alone, I still get plenty of results. In a way I want the query to  match
> the document title exactly (up to a few characters) and the document title
> match the query exactly (up to a few characters). KeywordTokenizer allows
> that. But complexphrase does not seem to work with KeywordTokenizer.


Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Max Bridgewater
Thanks Susheel. The challenge is that if I search for the word "between"
alone, I still get plenty of results. In a way I want the query to  match
the document title exactly (up to a few characters) and the document title
match the query exactly (up to a few characters). KeywordTokenizer allows
that. But complexphrase does not seem to work with KeywordTokenizer.
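
One way to read the requirement above is as a bound on the edit distance between the whole query string and the whole title. The sketch below illustrates that reading in plain Python, outside Solr; the function names and the default limit of two edits are assumptions made for illustration only.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def matches_title(query, title, max_edits=2):
    """True when the whole query is within max_edits of the whole title."""
    return edit_distance(query.lower(), title.lower()) <= max_edits


title = "Bridge the gap between your skills and your goals"
print(matches_title("Bridge the gat between your skills and your goals", title))
print(matches_title("between", title))
```

Under this reading the one-letter typo is accepted, while the single word "between" is rejected because it is dozens of edits away from the full title.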

On Thu, Jun 15, 2017 at 10:23 AM, Susheel Kumar 
wrote:

> The ComplexPhraseQuery parser is what you need to look at:
> https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser.
> See the example below.
>
> http://localhost:8983/solr/techproducts/select?debugQuery=on&indent=on&q=manu:%22Bridge%20the%20gat~1%20between%20your%20skills%20and%20your%20goals%22&defType=complexphrase


Re: Phrase Exact Match with Margin of Error

2017-06-15 Thread Susheel Kumar
The ComplexPhraseQuery parser is what you need to look at:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser.
See the example below.



http://localhost:8983/solr/techproducts/select?debugQuery=on&indent=on&q=manu:%22Bridge%20the%20gat~1%20between%20your%20skills%20and%20your%20goals%22&defType=complexphrase
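
For readers who would rather build this request in code than by hand, here is a minimal sketch. It assumes the query goes in `q` and the parser is selected with `defType=complexphrase`; the host, collection and `manu` field are taken from the example URL, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def complexphrase_url(base_url, field, phrase):
    """Build a select URL that runs a quoted phrase through complexphrase.

    Terms inside the phrase may carry their own fuzzy or wildcard markers,
    for example "gat~1".
    """
    params = {
        "q": f'{field}:"{phrase}"',
        "defType": "complexphrase",
        "debugQuery": "on",
    }
    # urlencode percent-encodes the quotes and colon and turns spaces into "+".
    return f"{base_url}?{urlencode(params)}"


url = complexphrase_url(
    "http://localhost:8983/solr/techproducts/select",
    "manu",
    "Bridge the gat~1 between your skills and your goals",
)
print(url)
```

Letting `urlencode` do the escaping avoids hand-writing `%22` and `%20`, which is where a URL like the one above tends to get mangled.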

On Thu, Jun 15, 2017 at 5:59 AM, Max Bridgewater 
wrote:

> Hi,
>
> I am trying to do phrase exact match. For this, I use
> KeywordTokenizerFactory. This basically does what I want to do. My field
> type is defined as follows:
>
> [fieldType XML stripped by the list archive; only positionIncrementGap="100" survives]
>
>
> In addition to this, I want to tolerate typos of two or three letters. I
> thought fuzzy search could allow me to accept this margin of error. But
> this doesn't seem to work.
>
> A typical query I would have is:
>
> q=subjet:"Bridge the gap between your skills and your goals"
>
> Now, in this query, if I replace gap with gat, I was hoping I could do
> something such as:
>
> q=subjet:"Bridge the gat between your skills and your goals"~0.8
>
> But this doesn't quite do what I am trying to achieve.
>
> Any suggestion?
>


Phrase Exact Match with Margin of Error

2017-06-15 Thread Max Bridgewater
Hi,

I am trying to do phrase exact match. For this, I use
KeywordTokenizerFactory. This basically does what I want to do. My field
type is defined as follows:

[fieldType XML stripped by the list archive; only positionIncrementGap="100" survives]


In addition to this, I want to tolerate typos of two or three letters. I
thought fuzzy search could allow me to accept this margin of error. But
this doesn't seem to work.

A typical query I would have is:

q=subjet:"Bridge the gap between your skills and your goals"

Now, in this query, if I replace gap with gat, I was hoping I could do
something such as:

q=subjet:"Bridge the gat between your skills and your goals"~0.8

But this doesn't quite do what I am trying to achieve.

Any suggestion?
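
A note on the syntax above: a tilde after a quoted phrase is read by the standard query parser as proximity slop between tokens, not as per-character fuzziness. One approach that is not proposed in the thread, but which may fit a keyword-tokenized field, is a fuzzy term query on the whole string with the spaces escaped so it stays a single term. The sketch below only builds such a query string; the helper name is hypothetical, and whether the query-side analysis normalises the term the same way as indexing depends on the field type, which is not recoverable from this message. Lucene fuzzy matching allows at most two edits, so three-letter typos are out of reach this way.

```python
import re

# Characters with special meaning in the standard query syntax, plus whitespace.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|\s])')


def whole_string_fuzzy(field, text, max_edits=2):
    """Return a single-term fuzzy query, e.g. field:some\\ title~2."""
    if not 0 <= max_edits <= 2:
        raise ValueError("Lucene fuzzy queries support at most 2 edits")
    # Prefix every special character, including spaces, with a backslash.
    escaped = _SPECIAL.sub(r"\\\1", text)
    return f"{field}:{escaped}~{max_edits}"


print(whole_string_fuzzy("subjet", "Bridge the gat between your skills and your goals"))
```

Because the whole title is one token in such a field, a query for "between" alone would be far more than two edits away and should not match.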