Re: Synonyms impacting the performance

2008-11-12 Thread Chris Hostetter

two general comments on this thread as a whole...

1) it's hard to compare the timing of a query with no synonyms and a query 
with a lot of synonyms, since the number of terms increases and (most 
likely) the number of documents matched increases as well.

the more clauses in the query, and the more docs that match each clause, 
the more work is done when executing it.

switching to index time synonyms is probably going to speed this up as 
much as it possibly can, because the number of clauses isn't going to expand 
(but you still have the added cost of "visiting" each matching doc).
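
To make the trade-off concrete, here is a minimal sketch using a toy in-memory inverted index. It is not Lucene or Solr code; the documents, the synonym group, and the function names are all made up for illustration. It only shows that both approaches match the same documents, while the query-time variant needs one term lookup per synonym.

```python
from collections import defaultdict

# Toy data, invented for this illustration only.
DOCS = {
    1: "amazing hotel",
    2: "fabulous pool",
    3: "incredible view",
    4: "noisy room",
}
SYNONYMS = {"amazing": ["amazing", "fabulous", "incredible"]}


def build_index(docs, expand_at_index_time):
    """Map each term to the set of doc ids that contain it."""
    # Reverse lookup: any member of a synonym group -> the whole group.
    group_of = {member: group for group in SYNONYMS.values() for member in group}
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            if expand_at_index_time:
                terms = group_of.get(token, [token])
            else:
                terms = [token]
            for term in terms:
                index[term].add(doc_id)
    return index


def search(index, query_terms):
    """OR together the postings of every query term.

    Returns (matching doc ids, number of term lookups performed).
    """
    matches = set()
    for term in query_terms:
        matches |= index.get(term, set())
    return matches, len(query_terms)


# Query-time synonyms: one user term becomes three clauses.
query_time = search(build_index(DOCS, False), SYNONYMS["amazing"])
# Index-time synonyms: the query keeps its single clause.
index_time = search(build_index(DOCS, True), ["amazing"])
print(query_time)
print(index_time)
```

In both cases the same three documents still have to be visited and scored, which is the remaining cost mentioned above.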

2) it's not clear from this thread if the timing info you got involved the 
queryResultCache at all ... if you are using that cache in your 
performance tests, you may be seeing the effects of LUCENE-1415 (a bad 
hashCode implementation that resulted in cache misses) ... the more 
synonyms in your query, the more likely you are to get a cache miss even 
if the "same" query has already been executed.
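
As a rough illustration of the general failure mode (this is not the Lucene code from LUCENE-1415, and the class names are hypothetical), a cache key whose hash is not derived from its logical contents will miss even when an equal query was cached earlier:

```python
class BadKeyQuery:
    """Hypothetical query whose hash depends on object identity."""

    def __init__(self, terms):
        self.terms = list(terms)

    def __eq__(self, other):
        return isinstance(other, BadKeyQuery) and self.terms == other.terms

    def __hash__(self):
        # The bug being illustrated: two equal queries hash differently
        # because the hash comes from the identity of the internal list.
        return id(self.terms)


class GoodKeyQuery:
    """Hypothetical query whose hash is derived from its terms."""

    def __init__(self, terms):
        self.terms = list(terms)

    def __eq__(self, other):
        return isinstance(other, GoodKeyQuery) and self.terms == other.terms

    def __hash__(self):
        return hash(tuple(self.terms))


def cache_hits(query_cls, terms, repeats=3):
    """Run the 'same' query several times against a dict used as a cache."""
    cache = {}
    hits = 0
    for _ in range(repeats):
        query = query_cls(terms)  # a fresh but logically identical query
        if query in cache:
            hits += 1
        else:
            cache[query] = "cached result"
    return hits


terms = ["amaz", "fabul", "improbable", "incredible"]
print(cache_hits(BadKeyQuery, terms))   # every lookup misses
print(cache_hits(GoodKeyQuery, terms))  # every lookup after the first hits
```

If your tests rely on cache hits, a problem of this kind would make the synonym-heavy queries look slower than they would otherwise be.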


https://issues.apache.org/jira/browse/SOLR-805
https://issues.apache.org/jira/browse/LUCENE-1415






-Hoss



RE: Synonyms impacting the performance

2008-11-12 Thread Nguyen, Joe
Could you elaborate further?  20 synonyms would translate to 20
booleanQueries.  Are you saying each booleanQuery requires a disk
access? 




RE: Synonyms impacting the performance

2008-11-12 Thread Manepalli, Kalyan
Yes, there is a query component which checks whether the query returned
any results and, if the results are not present, modifies the
Boolean query.

So this query component does call process().

Thanks,
Kalyan Manepalli




Re: Synonyms impacting the performance

2008-11-12 Thread Walter Underwood
If there are twenty synonyms, then a one term query becomes a
twenty term query, and that means 20X more disk accesses.

wunder




Re: Synonyms impacting the performance

2008-11-12 Thread Erik Hatcher


On Nov 12, 2008, at 9:41 AM, Manepalli, Kalyan wrote:
> I did the index time synonyms and results do look much better
> than the query time synonyms.
> But is there a reason for the searches to be that slow? I understand
> that we have a pretty long list of synonyms (one word contains at least
> 20 words as synonyms). Does this have such an adverse impact?

Apparently so :/

Are there other components in your request handler that may also be
(re)executing a query?  Do the debugQuery=true component timings
point to any other bottlenecks?


Erik



RE: Synonyms impacting the performance

2008-11-12 Thread Manepalli, Kalyan
Hi Erik,
I did the index time synonyms and results do look much better
than the query time synonyms.
But is there a reason for the searches to be that slow? I understand
that we have a pretty long list of synonyms (one word contains at least
20 words as synonyms). Does this have such an adverse impact?

Thanks,
Kalyan Manepalli




Re: Synonyms impacting the performance

2008-11-12 Thread Erik Hatcher


On Nov 12, 2008, at 9:12 AM, Kashyap, Raghu wrote:
> {quote}It's hard to tell where exactly the bottleneck is without looking
> at the server and a few other things. {quote}
>
> Can you suggest some areas where we can start looking into this issue?


Using &debugQuery=true will output the timings of all the components -  
narrowing it down to the component would be the first step.  My hunch  
is that you've got an enormous dismax query going on, and perhaps it  
is best to do index-time synonyms instead of query-time.
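
For what it's worth, here is a small sketch of how one might pick the slowest component out of that timing output once it has been loaded into a Python dict. The nested shape (a "prepare" and a "process" section, each with per-component "time" entries) is an assumption about the debug output, and the component names and numbers in the sample are invented, not measurements from this thread.

```python
# Hypothetical sample shaped like a debugQuery timing section.
# Component names and millisecond values are made up for illustration.
SAMPLE_TIMING = {
    "time": 20050.0,
    "prepare": {
        "time": 5.0,
        "query": {"time": 5.0},
        "debug": {"time": 0.0},
    },
    "process": {
        "time": 20045.0,
        "query": {"time": 9800.0},
        "customCheck": {"time": 10200.0},
        "debug": {"time": 45.0},
    },
}


def component_timings(timing):
    """Yield (phase, component, milliseconds) for every component."""
    for phase in ("prepare", "process"):
        for name, entry in timing.get(phase, {}).items():
            if name == "time":  # per-phase total, not a component
                continue
            yield phase, name, entry["time"]


def slowest_component(timing):
    """Return the (phase, component, milliseconds) row with the largest time."""
    return max(component_timings(timing), key=lambda row: row[2])


print(slowest_component(SAMPLE_TIMING))
```

If a custom component shows up near the top of the process phase, it is probably executing a query of its own.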


Erik



RE: Synonyms impacting the performance

2008-11-12 Thread Kashyap, Raghu
Hi Otis,

{quote}It's hard to tell where exactly the bottleneck is without looking
at the server and a few other things. {quote}

Can you suggest some areas where we can start looking into this issue?

-Raghu
  


Re: Synonyms impacting the performance

2008-11-11 Thread Ryan McKinley

also check the timing in debugQuery=true...

I suspect most of the time should be spent in:
process:







Re: Synonyms impacting the performance

2008-11-11 Thread Otis Gospodnetic
Yeah.  Though, 20 seconds still sounds crazy, not something that I'd 
expect from a query that is not terribly complex or demanding.  It's hard to tell 
where exactly the bottleneck is without looking at the server and a few other 
things.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch






Re: Synonyms impacting the performance

2008-11-11 Thread Ryan McKinley
if performance is a problem, you can try adding the synonyms at index  
time... this should give you similar results without the runtime  
cost.


The obvious disadvantage is that you need to have the synonyms at  
index time...







RE: Synonyms impacting the performance

2008-11-11 Thread Manepalli, Kalyan
Hi Otis,
Since I have expand="true" in SynonymFilterFactory, the
DebugQuery shows the query is expanded with all the synonyms.
E.g.:
Without synonym the query is:
Review:amaz | name:amaz | description:amaz

With Synonym the query is:
(Review:amaz Review:fabul  Review:improbable  Review:incredible )|
name:amaz | (description:amaz description:fabul  description:improbable
description:incredible )

There are around 20 synonyms for each word. I am not sure if 20 keywords
are causing the latency.
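
A back-of-the-envelope sketch of how the clause count grows, modelled on the debug output above. It assumes, as in that output, that synonyms are applied to the Review and description fields but not to name; the helper name and the generated placeholder synonyms are hypothetical.

```python
def expand_query(term, fields, synonyms, synonym_fields):
    """Build a per-field expanded query string and count its term clauses.

    Fields listed in synonym_fields get the term plus all of its synonyms;
    every other field gets the bare term only.
    """
    groups = []
    clause_count = 0
    for field in fields:
        if field in synonym_fields:
            terms = [term] + synonyms.get(term, [])
        else:
            terms = [term]
        clause_count += len(terms)
        clauses = " ".join(f"{field}:{t}" for t in terms)
        groups.append(f"({clauses})" if len(terms) > 1 else clauses)
    return " | ".join(groups), clause_count


FIELDS = ["Review", "name", "description"]
SYNONYM_FIELDS = {"Review", "description"}

# The three synonyms visible in the debug output above.
small = {"amaz": ["fabul", "improbable", "incredible"]}
print(expand_query("amaz", FIELDS, small, SYNONYM_FIELDS))

# Around 20 synonyms per word, using generated placeholder terms.
large = {"amaz": [f"syn{i}" for i in range(20)]}
print(expand_query("amaz", FIELDS, large, SYNONYM_FIELDS)[1])
```

With 20 synonyms, a one-word query over these three fields goes from 3 term clauses to 43, and a multi-word query multiplies that again per word.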

Thanks,
Kalyan Manepalli



RE: Synonyms impacting the performance

2008-11-11 Thread Manepalli, Kalyan
Hi Otis,
I tested by taking out the newly added synonyms data and the query time
was back to normal ~125ms.
I will check the debugQuery output and update you with the results.

Thanks,
Kalyan Manepalli


Re: Synonyms impacting the performance

2008-11-11 Thread Otis Gospodnetic
Hi,

That doesn't sound normal, no.  Do you know what your query looks like after 
synonym expansion? (you can use debugQuery=true or peek at the logs)  Is that 
really the only thing that changed?  In other words, if you comment out the 
SynonymFilterFactory in schema.xml and restart Solr, do things really go back to 
125ms?
Are you seeing slowness for one particular query or for all of them?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch





From: "Manepalli, Kalyan" <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, November 11, 2008 11:58:37 AM
Subject: Synonyms impacting the performance

Hi all,

I recently implemented query time synonyms in my application
and I am seeing drastic performance degradation.

The synonyms file contains around 1000 words.

The average query time without synonyms is around 125 ms, and with
synonyms it jumps to 20 secs.

Am I missing something here? This doesn't look like normal
behavior.

Any suggestions on this will be helpful.



Thanks,

Kalyan Manepalli