Re: Synonyms impacting the performance
two general comments on this thread as a whole...

1) it's hard to compare the timing of a query with no synonyms and a query with a lot of synonyms, since the number of terms increases and (most likely) the number of documents matched increases as well. the more clauses in the query, the more work that is done when executing it; the more docs that match each clause, etc... switching to index time synonyms is probably going to speed this up as much as it possibly can, because the number of clauses isn't going to expand (but you still have the added cost of "visiting" each matching doc).

2) it's not clear from this thread if the timing info you got involved the queryResultCache at all ... if you are using that cache in your performance tests, you may be seeing the effects of LUCENE-1415 (a bad hashCode implementation that resulted in cache misses) ... the more synonyms in your query, the more likely you are to get a cache miss even if the "same" query has already been executed.

https://issues.apache.org/jira/browse/SOLR-805
https://issues.apache.org/jira/browse/LUCENE-1415

-Hoss
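The cache point in (2) can be illustrated in general terms. The sketch below is not Lucene or Solr code, and it does not reproduce the actual LUCENE-1415 bug; `BrokenQuery`, `FixedQuery` and `count_misses` are hypothetical names. It only shows the generic mechanism: when a cache key's hash does not agree with its logical equality, a query that is "the same" still misses the cache.

```python
class BrokenQuery:
    """Hypothetical query object that keeps Python's default identity-based
    equality and hash, so two logically equal queries look different."""

    def __init__(self, terms):
        self.terms = tuple(terms)


class FixedQuery(BrokenQuery):
    """Same data, but equality and hash are both derived from the terms."""

    def __eq__(self, other):
        return isinstance(other, FixedQuery) and self.terms == other.terms

    def __hash__(self):
        return hash(self.terms)


def count_misses(query_cls, terms, repeats=3):
    """Issue the same logical query `repeats` times against a dict-based
    result cache and count how often the lookup misses."""
    cache = {}
    misses = 0
    for _ in range(repeats):
        query = query_cls(terms)  # a fresh object per request
        if query not in cache:
            misses += 1
            cache[query] = "results"
    return misses


if __name__ == "__main__":
    print(count_misses(BrokenQuery, ["amaz", "fabul"]))
    print(count_misses(FixedQuery, ["amaz", "fabul"]))
```

If a result cache is part of the timing test, it is worth checking whether repeated runs of the same query are actually served from it before comparing numbers.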
RE: Synonyms impacting the performance
Could you elaborate further? 20 synonyms would translate to 20 booleanQueries. Are you saying each booleanQuery requires a disk access?

-----Original Message-----
From: Walter Underwood [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 12, 2008 7:46
To: solr-user@lucene.apache.org
Subject: Re: Synonyms impacting the performance

If there are twenty synonyms, then a one term query becomes a twenty term query, and that means 20X more disk accesses.

wunder
RE: Synonyms impacting the performance
Yes, there is a query component that checks whether a query returns any results and, if it does not, modifies the Boolean query. So this queryComponent does call process().

Thanks,
Kalyan Manepalli

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 12, 2008 9:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms impacting the performance

On Nov 12, 2008, at 9:41 AM, Manepalli, Kalyan wrote:
> I did the index time synonyms and results do look much better
> than the query time indexing.
> But is there a reason for the searches to be that slow. I understand
> that we have a pretty long list of synonyms (one word contains atleast
> 20 words as synonyms). Does this have such an adverse impact

Apparently so :/

Are there other components in your request handler that may also be (re)executing a query? Does the debugQuery=true component timings point to any other bottlenecks?

Erik
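Why a component like this matters for timing can be sketched as follows. This is not Solr's SearchComponent API; `DOCS`, `search`, `relax` and `search_with_fallback` are hypothetical stand-ins over a toy data set. The only point is that a zero-result fallback runs the (already expensive) query a second time, so the cost of the expansion is paid twice.

```python
# Toy documents: id -> set of (stemmed) words. Purely illustrative data.
DOCS = {1: {"amaz", "hotel"}, 2: {"fabul", "pool"}}


def search(query):
    """Evaluate a (mode, terms) query; mode is "AND" or "OR"."""
    mode, terms = query
    match = all if mode == "AND" else any
    return [doc_id for doc_id, words in sorted(DOCS.items())
            if match(term in words for term in terms)]


def relax(query):
    """Turn a strict query into a looser one by requiring any term."""
    _, terms = query
    return ("OR", terms)


def search_with_fallback(query):
    """Run the query; if nothing matches, relax it and run it again.
    Returns the results and the number of query executions."""
    executions = 1
    results = search(query)
    if not results:
        executions += 1
        results = search(relax(query))
    return results, executions


if __name__ == "__main__":
    print(search_with_fallback(("AND", ["amaz", "hotel"])))
    print(search_with_fallback(("AND", ["amaz", "pool"])))
```

If the per-component timings attribute most of the time to this component, the second execution is a likely place to look.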
Re: Synonyms impacting the performance
If there are twenty synonyms, then a one term query becomes a twenty term query, and that means 20X more disk accesses.

wunder

On 11/12/08 7:08 AM, "Erik Hatcher" <[EMAIL PROTECTED]> wrote:

> On Nov 12, 2008, at 9:41 AM, Manepalli, Kalyan wrote:
>> I did the index time synonyms and results do look much better
>> than the query time indexing.
>> But is there a reason for the searches to be that slow. I understand
>> that we have a pretty long list of synonyms (one word contains atleast
>> 20 words as synonyms). Does this have such an adverse impact
>
> Apparently so :/
>
> Are there other components in your request handler that may also be
> (re)executing a query? Does the debugQuery=true component timings
> point to any other bottlenecks?
>
> Erik
Re: Synonyms impacting the performance
On Nov 12, 2008, at 9:41 AM, Manepalli, Kalyan wrote:
> I did the index time synonyms and results do look much better
> than the query time indexing.
> But is there a reason for the searches to be that slow. I understand
> that we have a pretty long list of synonyms (one word contains atleast
> 20 words as synonyms). Does this have such an adverse impact

Apparently so :/

Are there other components in your request handler that may also be (re)executing a query? Does the debugQuery=true component timings point to any other bottlenecks?

Erik
RE: Synonyms impacting the performance
Hi Erik,

I switched to index-time synonyms, and the results do look much better than with query-time synonyms. But is there a reason for the searches to be that slow? I understand that we have a pretty long list of synonyms (one word has at least 20 synonyms). Does this have such an adverse impact?

Thanks,
Kalyan Manepalli

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 12, 2008 8:32 AM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms impacting the performance

On Nov 12, 2008, at 9:12 AM, Kashyap, Raghu wrote:
> {quote}It's hard to tell where exactly the bottleneck is without
> looking at the server and a few other things. {quote}
>
> Can you suggest some areas where we can start looking into this issue?

Using &debugQuery=true will output the timings of all the components - narrowing it down to the component would be the first step.

My hunch is that you've got an enormous dismax query going on, and perhaps it is best to do index-time synonyms instead of query-time.

Erik
Re: Synonyms impacting the performance
On Nov 12, 2008, at 9:12 AM, Kashyap, Raghu wrote:
> {quote}It's hard to tell where exactly the bottleneck is without
> looking at the server and a few other things. {quote}
>
> Can you suggest some areas where we can start looking into this issue?

Using &debugQuery=true will output the timings of all the components - narrowing it down to the component would be the first step.

My hunch is that you've got an enormous dismax query going on, and perhaps it is best to do index-time synonyms instead of query-time.

Erik
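As a rough sketch of that first step: the helper below appends debugQuery=true to a request and ranks components by time within one phase. The base URL is a hypothetical local install, and `SAMPLE_TIMING` is invented data in an assumed layout (a timing block with "prepare" and "process" sections keyed by component name); check it against what your own Solr actually returns before relying on it.

```python
from urllib.parse import urlencode


def debug_url(base_url, params):
    """Return the request URL with debugQuery=true added to the parameters."""
    query = dict(params, debugQuery="true")
    return base_url + "?" + urlencode(query)


def slowest_components(timing, phase="process"):
    """Return (component, milliseconds) pairs for one phase, slowest first.
    Non-dict entries (such as the phase total) are skipped."""
    entries = timing.get(phase, {})
    pairs = [(name, info["time"]) for name, info in entries.items()
             if isinstance(info, dict)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Invented numbers in an ASSUMED layout, for illustration only.
SAMPLE_TIMING = {
    "time": 20150.0,
    "prepare": {"time": 5.0, "QueryComponent": {"time": 5.0}},
    "process": {
        "time": 20145.0,
        "QueryComponent": {"time": 20100.0},
        "FacetComponent": {"time": 40.0},
        "DebugComponent": {"time": 5.0},
    },
}

if __name__ == "__main__":
    # Hypothetical host, port and path.
    print(debug_url("http://localhost:8983/solr/select",
                    {"q": "amazing", "wt": "json"}))
    for name, ms in slowest_components(SAMPLE_TIMING):
        print(name, ms)
```

Whichever component dominates the "process" phase is the one to investigate first.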
RE: Synonyms impacting the performance
Hi Otis,

{quote}It's hard to tell where exactly the bottleneck is without looking at the server and a few other things. {quote}

Can you suggest some areas where we can start looking into this issue?

-Raghu

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 11, 2008 10:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms impacting the performance

Yeah. Though, 20 seconds still sounds like crazy, not something that I'd expect from that not terribly complex and demanding query. It's hard to tell where exactly the bottleneck is without looking at the server and a few other things.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: Synonyms impacting the performance
also check the timing in debugQuery=true... I suspect most of the time should be spent in: process:

On Nov 11, 2008, at 12:33 PM, Manepalli, Kalyan wrote:
> Hi Otis,
> I tested by taking out the newly added synonyms data and the query
> time was back to normal ~125ms.
> I will verify the debugQuery and update you with the results
>
> Thanks,
> Kalyan Manepalli
Re: Synonyms impacting the performance
Yeah. Though, 20 seconds still sounds crazy, not something I'd expect from a query that is not terribly complex or demanding. It's hard to tell where exactly the bottleneck is without looking at the server and a few other things.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

From: Ryan McKinley <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, November 11, 2008 9:33:06 PM
Subject: Re: Synonyms impacting the performance

if performance is a problem, you can try adding the synonyms at index time... this should give you similar results without the runtime results.

The obvious disadvantage is that you need to have the synonyms at index time...
Re: Synonyms impacting the performance
if performance is a problem, you can try adding the synonyms at index time... this should give you similar results without the runtime cost.

The obvious disadvantage is that you need to have the synonyms at index time...

On Nov 11, 2008, at 2:37 PM, Manepalli, Kalyan wrote:
> Hi Otis,
> Since I have expand="true" in SynonymFilterFactory, the
> DebugQuery shows the query is expanded with all the synonyms.
> Eg;
> Without synonym the query is:
> Review:amaz | name:amaz | description:amaz
>
> With Synonym the query is:
> (Review:amaz Review:fabul Review:improbable Review:incredible )|
> name:amaz | (description:amaz description:fabul
> description:improbable description:incredible )
>
> There are around 20 synonyms for each word. I am not sure if 20
> keywords are causing the latency.
>
> Thanks,
> Kalyan Manepalli
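The trade-off between the two approaches can be sketched with a toy inverted index. This is not how SynonymFilterFactory is implemented; `GROUP`, `expand`, `build_index` and `search` are hypothetical names, and the synonym group just reuses the stems quoted in this thread. It shows that both approaches find the same documents, but index-time expansion leaves the query with a single term to look up.

```python
from collections import defaultdict

# One symmetric synonym group, built from the stems quoted in the thread.
GROUP = ["amaz", "fabul", "improbable", "incredible"]
SYNONYMS = {word: GROUP for word in GROUP}


def expand(term):
    """Return the term's whole synonym group, or just the term itself."""
    return SYNONYMS.get(term, [term])


def build_index(docs, index_time_synonyms):
    """Map each term to the set of documents containing it, optionally
    indexing every token under all of its synonyms as well."""
    index = defaultdict(set)
    for doc_id, tokens in docs.items():
        for token in tokens:
            terms = expand(token) if index_time_synonyms else [token]
            for term in terms:
                index[term].add(doc_id)
    return index


def search(index, term, query_time_synonyms):
    """Return the matching doc ids and the number of terms looked up."""
    terms = expand(term) if query_time_synonyms else [term]
    hits = set()
    for t in terms:
        hits |= index.get(t, set())
    return sorted(hits), len(terms)


if __name__ == "__main__":
    docs = {1: ["amaz", "view"], 2: ["fabul", "pool"], 3: ["plain", "room"]}
    # Query-time expansion over a plain index: several lookups per query.
    print(search(build_index(docs, False), "amaz", True))
    # Index-time expansion: a bigger index, but one lookup per query.
    print(search(build_index(docs, True), "amaz", False))
```

The cost moves from every query to the index (more postings per document), and changing the synonym list then requires reindexing, which is the disadvantage mentioned above.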
RE: Synonyms impacting the performance
Hi Otis,

Since I have expand="true" in SynonymFilterFactory, the debugQuery output shows that the query is expanded with all the synonyms. E.g.:

Without synonyms the query is:
    Review:amaz | name:amaz | description:amaz

With synonyms the query is:
    (Review:amaz Review:fabul Review:improbable Review:incredible )| name:amaz | (description:amaz description:fabul description:improbable description:incredible )

There are around 20 synonyms for each word. I am not sure whether 20 keywords are causing the latency.

Thanks,
Kalyan Manepalli

-----Original Message-----
From: Manepalli, Kalyan [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 11, 2008 11:34 AM
To: solr-user@lucene.apache.org
Subject: RE: Synonyms impacting the performance

Hi Otis,
I tested by taking out the newly added synonyms data and the query time was back to normal ~125ms.
I will verify the debugQuery and update you with the results

Thanks,
Kalyan Manepalli
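To make the growth in clauses concrete, here is a small sketch that builds a query string of the same shape as the debug output above and counts its term clauses. `expand_query` is a hypothetical helper, not a Solr function, and the assumption that only Review and description are expanded is read off the debug output (name:amaz appears unexpanded there).

```python
def expand_query(term, fields, synonyms, expanded_fields):
    """Build a per-field disjunction for `term`, adding synonyms only for
    the fields listed in `expanded_fields`. Returns the query string and
    the total number of term clauses."""
    clauses = []
    clause_count = 0
    for field in fields:
        terms = [term]
        if field in expanded_fields:
            terms = terms + synonyms.get(term, [])
        clause_count += len(terms)
        text = " ".join(f"{field}:{t}" for t in terms)
        clauses.append(f"({text})" if len(terms) > 1 else text)
    return " | ".join(clauses), clause_count


if __name__ == "__main__":
    fields = ["Review", "name", "description"]
    expanded = {"Review", "description"}

    few = {"amaz": ["fabul", "improbable", "incredible"]}
    print(expand_query("amaz", fields, few, expanded))

    # Placeholder names standing in for a 20-synonym entry.
    many = {"amaz": [f"syn{i}" for i in range(20)]}
    print(expand_query("amaz", fields, many, expanded)[1])
```

With 20 synonyms per word, each expanded field contributes 21 clauses per query word, so a multi-word query across several fields grows quickly.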
RE: Synonyms impacting the performance
Hi Otis,

I tested by taking out the newly added synonyms data, and the query time was back to normal (~125 ms). I will check the debugQuery output and update you with the results.

Thanks,
Kalyan Manepalli

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 11, 2008 11:26 AM
To: solr-user@lucene.apache.org
Subject: Re: Synonyms impacting the performance

Hi,

That doesn't sound normal, no. Do you know what your query looks like after synonym expansion? (you can use debugQuery=true or peek at the logs) Is that really the only thing that changed? In other words, if you comment out the SynonymFilterFactory in schema.xml and restart Solr do things really go back to 125ms?
Are you seeing slowness for one particular query or for all of them?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: Synonyms impacting the performance
Hi,

That doesn't sound normal, no. Do you know what your query looks like after synonym expansion? (you can use debugQuery=true or peek at the logs) Is that really the only thing that changed? In other words, if you comment out the SynonymFilterFactory in schema.xml and restart Solr do things really go back to 125ms?
Are you seeing slowness for one particular query or for all of them?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

From: "Manepalli, Kalyan" <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, November 11, 2008 11:58:37 AM
Subject: Synonyms impacting the performance

Hi all,

I recently implemented query-time synonyms in my application and I am seeing drastic performance degradation.

The synonyms file contains around 1000 words.

The average query time without synonyms is around 125 ms, and with synonyms it jumps to 20 secs.

Am I missing something here? This doesn't look like normal behavior.

Any suggestions on this will be helpful.

Thanks,
Kalyan Manepalli