shingles work in analyzer but not real data

2010-09-01 Thread Jeff Rose
Hi, We are using SOLR to match query strings with a keyword database, where some of the keywords are actually more than one word. For example a keyword might be "apple pie" and we only want it to match for a query containing that word pair, but not one only containing "apple". Here is the relev

Re: shingles work in analyzer but not real data

2010-09-01 Thread Robert Muir
On Wed, Sep 1, 2010 at 8:21 AM, Jeff Rose wrote: > Hi, > We are using SOLR to match query strings with a keyword database, where > some of the keywords are actually more than one word. For example a > keyword > might be "apple pie" and we only want it to match for a query containing > that word

Re: shingles work in analyzer but not real data

2010-09-01 Thread Markus Jelsma
If your use-case is limited to this, why don't you encapsulate all queries in double quotes? On Wednesday 01 September 2010 14:21:47 Jeff Rose wrote: > Hi, > We are using SOLR to match query strings with a keyword database, where > some of the keywords are actually more than one word. For exa
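To illustrate the suggestion above, here is a minimal sketch of wrapping a user's text in double quotes so the query parser treats it as one phrase. The field name `keyword` and the helper `quoted_query` are hypothetical and not taken from the thread. Only the query string is built; no request is sent.

```python
from urllib.parse import urlencode


def quoted_query(user_text, field="keyword"):
    """Wrap user_text in double quotes as a phrase query on `field`.

    `field` is a hypothetical field name; substitute the one in your schema.
    Backslashes and embedded double quotes are escaped so they cannot end
    the phrase early.
    """
    escaped = user_text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# Build the URL-encoded parameter string for a select request.
q = quoted_query("apple pie")
params = urlencode({"q": q})
print(q)
print(params)
```

Quoting keeps the whole phrase together until it reaches the field's analyzer, which is the point of the suggestion.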

Re: shingles work in analyzer but not real data

2010-09-02 Thread Jeff Rose
On Wed, Sep 1, 2010 at 3:35 PM, Robert Muir wrote: > On Wed, Sep 1, 2010 at 8:21 AM, Jeff Rose wrote: > > > Hi, > > We are using SOLR to match query strings with a keyword database, where > > some of the keywords are actually more than one word. For example a > > keyword > > might be "apple pi

RE: shingles work in analyzer but not real data

2010-09-02 Thread Steven A Rowe
> To: solr-user@lucene.apache.org > Subject: Re: shingles work in analyzer but not real data > > On Wed, Sep 1, 2010 at 3:35 PM, Robert Muir wrote: > > > On Wed, Sep 1, 2010 at 8:21 AM, Jeff Rose wrote: > > > > > Hi, > > > We are using SOLR to match query strings

Re: shingles work in analyzer but not real data

2010-09-02 Thread Jonathan Rochkind
I've run into this before too. Both the dismax and solr-lucene _query parsers_ will tokenize a query on whitespace _before_ they pass the query to any field analyzers. There are reasons for this; lots of things wouldn't work if they didn't. But it makes your approach kind of har
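A toy model of the behaviour described above, not Solr code: the functions `shingles`, `analyze`, and `parse_then_analyze` are illustrative names, and the shingling here is a simplified stand-in for a shingle filter that also emits unigrams. It shows why the analysis page (which analyzes the whole string) produces the "apple pie" shingle while a real query does not.

```python
def shingles(tokens, max_size=2):
    """Return unigrams plus word n-grams up to max_size, joined by a space."""
    out = []
    for i in range(len(tokens)):
        for n in range(1, max_size + 1):
            if i + n <= len(tokens):
                out.append(" ".join(tokens[i:i + n]))
    return out


def analyze(text, max_size=2):
    """Simplified field analyzer: lowercase, split on whitespace, shingle."""
    return shingles(text.lower().split(), max_size)


def parse_then_analyze(query, max_size=2):
    """Simplified query parser: split on whitespace FIRST, then run the
    analyzer on each chunk separately, so it only ever sees one word."""
    terms = []
    for chunk in query.split():
        terms.extend(analyze(chunk, max_size))
    return terms


print(analyze("apple pie"))             # ['apple', 'apple pie', 'pie']
print(parse_then_analyze("apple pie"))  # ['apple', 'pie']
```

Because each chunk reaches the analyzer alone, no two-word shingle can ever be formed on the query side unless the words arrive together, for example inside a quoted phrase.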

Re: shingles work in analyzer but not real data

2010-09-02 Thread Dennis Gearon
On Thu, 9/2/10, Jonathan Rochkind wrote: > From: Jonathan Rochkind > Subject: Re: shingles work in analyzer but not real data > To: "solr-user@lucene.apache.org" > Cc: "Vishal Patel" , "Michiel Willekens" > > Date: Thursday, September 2, 2010,

Re: shingles work in analyzer but not real data

2010-09-03 Thread Jeff Rose
Thanks Steven and Jonathan, we got it working by using a combination of quoting and the PositionFilterFactory, as shown below. The documentation for the position filter doesn't make much sense without understanding more about how positioning of tokens is taken into account, but it appears to

Re: shingles work in analyzer but not real data

2010-09-03 Thread Dennis Gearon
om/film.php --- On Fri, 9/3/10, Jeff Rose wrote: > From: Jeff Rose > Subject: Re: shingles work in analyzer but not real data > To: solr-user@lucene.apache.org > Date: Friday, September 3, 2010, 1:48 AM > Thanks Steven and Jonathan, we got it > working by using a combination of &

Re: shingles work in analyzer but not real data

2010-09-03 Thread Jeff Rose
'shingle' in search engine results/technology? > > > Dennis Gearon > > Signature Warning > > EARTH has a Right To Life, > otherwise we all die. > > Read 'Hot, Flat, and Crowded' > Laugh at http://www.yert.com/film.php >

Re: shingles work in analyzer but not real data

2010-09-03 Thread 朱炎詹
ze = 3. Be careful: index build time and index size can grow dramatically as the max shingle size increases. Scott - Original Message - From: "Jeff Rose" To: Sent: Friday, September 03, 2010 5:35 PM Subject: Re: shingles work in analyzer but no
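As a rough illustration of that warning, the sketch below counts how many tokens a field of N words would emit when unigrams and all shingles up to a given maximum size are kept. The helper `shingle_count` is hypothetical, and emitted-token count is only a proxy: real index size also depends on how many distinct terms result.

```python
def shingle_count(num_tokens, max_size):
    """Tokens emitted for one field value when unigrams and every shingle
    of 2..max_size words are kept: sum over n of (num_tokens - n + 1)."""
    return sum(max(0, num_tokens - n + 1) for n in range(1, max_size + 1))


# A 10-word field value at increasing max shingle sizes.
for size in (1, 2, 3):
    print(size, shingle_count(10, size))
# 1 10
# 2 19
# 3 27
```

The emitted-token count grows roughly linearly with the max shingle size, but the number of distinct terms in the index can grow much faster, since longer shingles repeat far less often than single words.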

RE: shingles work in analyzer but not real data

2010-09-03 Thread Steven A Rowe
September 03, 2010 5:06 AM > To: solr-user@lucene.apache.org > Subject: Re: shingles work in analyzer but not real data > > Anyone got a definitive, authoritative link to the definition of a > 'shingle' in search engine results/technology? > >

Re: shingles work in analyzer but not real data

2010-09-03 Thread Lance Norskog
_indexes > > Steve > >> -Original Message- >> From: Dennis Gearon [mailto:gear...@sbcglobal.net] >> Sent: Friday, September 03, 2010 5:06 AM >> To: solr-user@lucene.apache.org >> Subject: Re: shingles work in analyzer but not real data >> >&g

Re: shingles work in analyzer but not real data

2010-09-03 Thread Dennis Gearon
ar...@sbcglobal.net] > >> Sent: Friday, September 03, 2010 5:06 AM > >> To: solr-user@lucene.apache.org > >> Subject: Re: shingles work in analyzer but not > real data > >> > >> Anyone got a definitive, authoritative link to the > definition of a &g

Re: shingles work in analyzer but not real data

2010-09-07 Thread Chris Hostetter
: Hi Robert, thanks for the response. I've looked into the query parsers a : bit and I did find that using the raw parser on a matching multi-word : keyword works correctly. I need to have shingling though, in order to : support query phrases. It seems odd to have the query parser emitting The