Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2014-01-22 Thread Salman Akram
Apologies for the late response as this mail was lost somewhere in filters.

The issue was that CommonGramsQueryFilterFactory should be used for searching
and CommonGramsFilterFactory for indexing. We were using
CommonGramsFilterFactory for both, which is why single tokens were not being
dropped for common grams in a phrase query.
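
For readers hitting the same problem, the fix described here can be sketched as a field type with separate index-time and query-time analyzers. This is only an illustrative sketch: it reuses the field type name, tokenizer, filters and commonwords.txt from the configuration posted earlier in this thread, and the split into index and query analyzers is an assumption about how the final schema looked, since the corrected config was not posted.

```xml
<fieldType name="text" class="solr.TextField">
  <!-- Index time: keep the single tokens and add the common-gram bigrams -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="commonwords.txt" ignoreCase="true"/>
  </analyzer>
  <!-- Query time: same chain, but the query variant drops single tokens
       that are already covered by a bigram -->
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CommonGramsQueryFilterFactory" words="commonwords.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>
```

With a chain along these lines, the phrase query for only be would be expected to parse to Contents:only_be again, as it did in 1.4.1, rather than to the MultiPhraseQuery shown in the debug output.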

I will go through the link you sent and see if it needs any explanation.
Thanks!


Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2013-12-10 Thread Salman Akram
Thanks!! Using CommonGramsQueryFilter resolved the issue.

This was not there in 1.4.1, and for some reason it was also not mentioned in
the SOLR 4 Release Notes that we studied before upgrading.


-- 
Regards,

Salman Akram


Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2013-12-10 Thread Ahmet Arslan
Hi Salman,

I personally do not perform stopword removal. So are you saying 
CommonGramsFilter is not useful without CommonGramsQueryFilter? If yes, 
do you want to add a comment to confluence explaining this? 

https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-CommonGramsFilter






Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2013-12-09 Thread Ahmet Arslan
Hi Salman,
I am confused because with surround no analysis is applied at query time. I 
suspect that the surround query parser is not kicking in. You should see SrndQuery 
or something like it in the parsedquery section.




Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2013-12-09 Thread Salman Akram
Yup, on debugging I found that it is coming from the Analyzer. We are using
the Standard Analyzer. It seems to be a SOLR 4 issue with Common Grams. I am
not sure if it's a bug or I am missing some config.


-- 
Regards,

Salman Akram


Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2013-12-09 Thread Erik Hatcher
But again, as Ahmet mentioned… it doesn't look like the surround query parser 
is actually being used.   The debug output also mentioned the query parser 
used, but that part wasn't provided below.  One thing to note here, the 
surround query parser is not available in 1.4.1.   It also looks like you're 
surrounding your query with angle brackets, as it says query string is 
{!surround}Contents:only be, which is not correct syntax.  And one of the 
most important things to note here is that the surround query parser does NOT 
use the analysis chain of the field, see 
http://wiki.apache.org/solr/SurroundQueryParser#Limitations.  In short, 
you're going to have to do some work to get common grams factored into a 
surround query (such as maybe calling the analysis request handler to parse 
the query before sending it to the surround query parser).
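
As a rough sketch of the workaround suggested above: the field analysis handler can be exposed in solrconfig.xml so that the client can run the query text through the field's analysis chain first and build the surround query from the resulting tokens. The handler path /analysis/field follows the stock Solr 4 example config and is an assumption about this particular installation.

```xml
<!-- solrconfig.xml: expose field analysis over HTTP.
     The path is taken from the stock example config; adjust as needed. -->
<requestHandler name="/analysis/field"
                startup="lazy"
                class="solr.FieldAnalysisRequestHandler"/>
```

A request such as /analysis/field?analysis.fieldtype=text&analysis.query=only+be should then return the tokens produced at each stage of the query analyzer. How those tokens are assembled into surround syntax is left to the client.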

Erik





Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2013-12-09 Thread Salman Akram
We used that syntax in 1.4.1, when Surround was not part of SOLR and we had to
register it. I didn't know that it is now part of SOLR. Anyway, this is a
red herring, since I have totally removed Surround and the issue remains.

Below is the debug info when I give a simple phrase query containing common
words with the default Query Parser. What I don't understand is why it is
including single tokens as well. I have also included the relevant config
part below.

"rawquerystring": "Contents:\"only be\"",
"querystring": "Contents:\"only be\"",
"parsedquery": "MultiPhraseQuery(Contents:\"(only only_be) be\")",
"parsedquery_toString": "Contents:\"(only only_be) be\"",

"QParser": "LuceneQParser",

=

<fieldtype name="text" class="solr.TextField">
 <analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StandardFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.CommonGramsFilterFactory" words="commonwords.txt" ignoreCase="true"/>
 </analyzer>
</fieldtype>



-- 
Regards,

Salman Akram


Re: SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2013-12-09 Thread Ahmet Arslan
Hi Salman, 

I have never used the common grams filter, but I remember there are two classes in this 
family: CommonGramsFilter and CommonGramsQueryFilter. It seems that 
CommonGramsQueryFilter is what you are after. 

http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/commongrams/CommonGramsQueryFilter.html


http://khaidoan.wikidot.com/solr-common-gram-filter







SOLR 4 - Query Issue in Common Grams with Surround Query Parser

2013-12-08 Thread Salman Akram
All,

I posted this sub-issue along with another issue a few days back, but maybe it
was not obvious, so I am posting it in a separate thread.

We recently migrated to SOLR 4.6. We use Common Grams, but queries with
words in the CG list have slowed down. On debugging we found that for CG
words the parser is also adding the individual tokens of those words to the
query, which ends up slowing it down. Below is an example:

Query = only be

Here is what debug shows. I have highlighted in red the part that differs
between the two versions, i.e. SOLR 4.6 is making it a MultiPhraseQuery
and adding individual tokens too. Can someone help?

SOLR 4.6 (takes 20 secs)
<str name="rawquerystring">{!surround}Contents:only be</str>
<str name="querystring">{!surround}Contents:only be</str>
<str name="parsedquery">MultiPhraseQuery(Contents:(only only_be) be)</str>
<str name="parsedquery_toString">Contents:(only only_be) be</str>

SOLR 1.4.1 (takes 1 sec)
<str name="rawquerystring">{!surround}Contents:only be</str>
<str name="querystring">{!surround}Contents:only be</str>
<str name="parsedquery">Contents:only_be</str>
<str name="parsedquery_toString">Contents:only_be</str>

-- 
Regards,

Salman Akram