Re: Reverse querying

2009-06-24 Thread oleg_gnatovskiy



AlexElba wrote:
> 
> Hello,
> 
> I have a problem that I am trying to solve using Solr.
> 
> I have a search text (term) and an index full of words that are mapped
> to ids. I want to find the ids of the indexed words that occur in the text.
> 
> Is there any query that I can run to do this?
> 
> Example:
> 
> Term
> "3) A recommendation to use VAR=value in the configure command line will
>  not work with some 'configure' scripts that comply to GNU standards
>  but are not generated by autoconf. "
> 
>  Index docs
> 
> id:1 name:recommendation 
> ...
> id:3 name:GNU
> id:4 name:food
> 
> After running the "query" I want to get 1 and 3 as results.
> 
> Thanks
> 
> 

I am very curious about this issue because I am working on a similar project.
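
One possible approach, offered only as a sketch and not as something the
thread settled on: split the text into words and ask for any document whose
name equals one of them. The function names, the default field `name` and the
in-memory `index` dict are illustrative assumptions; `match_locally` only
stands in for the real Solr lookup.

```python
import re


def tokenize(text):
    """Split free text into unique word tokens, keeping first-seen order."""
    return list(dict.fromkeys(re.findall(r"[A-Za-z0-9]+", text)))


def build_reverse_query(text, field="name"):
    """Build one OR query matching any document whose field is a word of the text."""
    # Each token is quoted so that words such as OR/AND/NOT are not read as operators.
    clauses = " OR ".join('"%s"' % token for token in tokenize(text))
    return "%s:(%s)" % (field, clauses)


def match_locally(text, index):
    """Stand-in for the Solr lookup: ids whose name occurs in the text (case-insensitive)."""
    words = {token.lower() for token in tokenize(text)}
    return sorted(doc_id for doc_id, name in index.items() if name.lower() in words)


term = ("3) A recommendation to use VAR=value in the configure command line will "
        "not work with some 'configure' scripts that comply to GNU standards "
        "but are not generated by autoconf.")
index = {1: "recommendation", 3: "GNU", 4: "food"}  # the example docs from the question

print(match_locally(term, index))  # [1, 3]
```

Against a real index, the string returned by `build_reverse_query` would be
sent as the `q` parameter. A very long text may exceed Solr's
`maxBooleanClauses` limit, so the word list might have to be sent in chunks.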

-- 
View this message in context: 
http://www.nabble.com/Reverse-querying-tp24194777p24195153.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Random queries extremely slow

2009-04-14 Thread oleg_gnatovskiy

The cause was actually our use of the field collapse patch. Once we disabled
it, the random slow queries went away.

We also added *:* as a warm-up query to speed up queries right after
indexing.
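
The message does not say how the warm-up query was registered; elsewhere in
these threads Todd Feak mentions the firstSearcher and newSearcher entries in
solrconfig as the place for such queries. As a rough sketch of a different
technique, the same query can be sent over HTTP from outside Solr after a new
index is pulled. The base URL below is a hypothetical placeholder, and the
helper names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def warmup_url(base_url, query="*:*", rows=10):
    """Build the select URL for a warm-up query (base_url is an assumed Solr root)."""
    return "%s/select?%s" % (base_url.rstrip("/"), urlencode({"q": query, "rows": rows}))


def warm(base_url, timeout=60):
    """Send the warm-up query and return the HTTP status code."""
    with urlopen(warmup_url(base_url), timeout=timeout) as response:
        return response.status


# Only builds the URL; calling warm() needs a running Solr instance.
print(warmup_url("http://localhost:8983/solr"))
```

A match-all query touches much of the index once, which pulls index files
back into the OS cache before the server returns to the load balancer.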



sunnyfr wrote:
> 
> Hi Oleg
> 
> Did you find a way to get past this issue?
> thanks a lot,
> 
> 
> oleg_gnatovskiy wrote:
>> 
>> Can you expand on this? Mirroring delay on what?
>> 
>> 
>> 
>> zayhen wrote:
>>> 
>>> Use multiple boxes, with a mirroring delay from one to another, like a
>>> pipeline.
>>> 
>>> 2009/1/22 oleg_gnatovskiy 
>>> 
>>>>
>>>> Well this probably isn't the cause of our random slow queries, but
>>>> might be the cause of the slow queries after pulling a new index. Is
>>>> there anything we could do to reduce the performance hit we take from
>>>> this happening?
>>>>
>>>>
>>>>
>>>> Otis Gospodnetic wrote:
>>>> >
>>>> > Here is one example: pushing a large newly optimized index onto the
>>>> > server.
>>>> >
>>>> > Otis
>>>> > --
>>>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>> >
>>>> >
>>>> >
>>>> > - Original Message 
>>>> >> From: oleg_gnatovskiy 
>>>> >> To: solr-user@lucene.apache.org
>>>> >> Sent: Thursday, January 22, 2009 2:22:51 PM
>>>> >> Subject: Re: Random queries extremely slow
>>>> >>
>>>> >>
>>>> >> What are some things that could happen to force files out of the
>>>> >> cache on a Linux machine? I don't know what kinds of events to look
>>>> >> for...
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> yonik wrote:
>>>> >> >
>>>> >> > On Thu, Jan 22, 2009 at 1:46 PM, oleg_gnatovskiy wrote:
>>>> >> >> Hello. Our production servers are operating relatively smoothly
>>>> >> >> most of the time running Solr with 19 million listings. However
>>>> >> >> every once in a while the same query that used to take 100
>>>> >> >> milliseconds takes 6000.
>>>> >> >
>>>> >> > Anything else happening on the system that may have forced some of
>>>> >> > the index files out of operating system disk cache at these times?
>>>> >> >
>>>> >> > -Yonik
>>>> >> >
>>>> >> >
>>>> >>
>>>> >> --
>>>> >> View this message in context:
>>>> >>
>>>> http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21611240.html
>>>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>>>> >
>>>> >
>>>> >
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21611454.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>
>>>>
>>> 
>>> 
>>> -- 
>>> Alexander Ramos Jardim
>>> 
>>> 
>>> -
>>> RPG da Ilha 
>>> 
>> 
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Random-queries-extremely-slow-tp21610568p23043152.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2009-01-26 Thread oleg_gnatovskiy

Just to clarify - we do not optimize on the slaves at all. We only optimize
on the master.

hossman wrote:
> 
> 
> : We do optimize the index before updates but we get these performance
> : issues even when we pull an empty snapshot. Thus even when our update is
> : tiny, the performance issues still happen.
> 
> FWIW: this behavior doesn't make a lot of sense -- optimizing just
> before you are about to make updates/additions to your data is a complete
> waste.  The main value in optimizing your index is that you have one
> segment; as soon as you add a document, that changes.
> 
> The other thing to keep in mind is that an optimized index is a completely
> new segment, written as a new file with a new name, so there is going to be
> added overhead on the slave machines as the OS purges the old index files
> and replaces them with the new optimized index files -- more overhead than
> if you had just done your additions w/o optimizing first.
> 
> 
> 
> -Hoss
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21678267.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Random queries extremely slow

2009-01-26 Thread oleg_gnatovskiy

Can you expand on this? Mirroring delay on what?



zayhen wrote:
> 
> Use multiple boxes, with a mirroring delay from one to another, like a
> pipeline.
> 
> 2009/1/22 oleg_gnatovskiy 
> 
>>
>> Well this probably isn't the cause of our random slow queries, but might
>> be the cause of the slow queries after pulling a new index. Is there
>> anything we could do to reduce the performance hit we take from this
>> happening?
>>
>>
>>
>> Otis Gospodnetic wrote:
>> >
>> > Here is one example: pushing a large newly optimized index onto the
>> > server.
>> >
>> > Otis
>> > --
>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >
>> >
>> >
>> > - Original Message 
>> >> From: oleg_gnatovskiy 
>> >> To: solr-user@lucene.apache.org
>> >> Sent: Thursday, January 22, 2009 2:22:51 PM
>> >> Subject: Re: Random queries extremely slow
>> >>
>> >>
>> >> What are some things that could happen to force files out of the cache
>> >> on a Linux machine? I don't know what kinds of events to look for...
>> >>
>> >>
>> >>
>> >>
>> >> yonik wrote:
>> >> >
>> >> > On Thu, Jan 22, 2009 at 1:46 PM, oleg_gnatovskiy wrote:
>> >> >> Hello. Our production servers are operating relatively smoothly
>> >> >> most of the time running Solr with 19 million listings. However
>> >> >> every once in a while the same query that used to take 100
>> >> >> milliseconds takes 6000.
>> >> >
>> >> > Anything else happening on the system that may have forced some of
>> >> > the index files out of operating system disk cache at these times?
>> >> >
>> >> > -Yonik
>> >> >
>> >> >
>> >>
>> >> --
>> >> View this message in context:
>> >>
>> http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21611240.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> >
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21611454.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Alexander Ramos Jardim
> 
> 
> -
> RPG da Ilha 
> 

-- 
View this message in context: 
http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21670023.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2009-01-22 Thread oleg_gnatovskiy

We've tried it. There doesn't seem to be any connection between GC and the
bad performance spikes.


Otis Gospodnetic wrote:
> 
> OK.  Then it's likely not this.  You saw the other response about looking
> at GC to see if maybe that hits you once in a while and slows whatever
> queries are in flight?  Try jconsole.
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: oleg_gnatovskiy 
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, January 22, 2009 2:43:31 PM
>> Subject: Re: Query Performance while updating the index
>> 
>> 
>> We do optimize the index before updates but we get these performance
>> issues even when we pull an empty snapshot. Thus even when our update is
>> tiny, the performance issues still happen.
>> 
>> 
>> 
>> Otis Gospodnetic wrote:
>> > 
>> > This is an old and long thread, and I no longer recall what the
>> > specific suggestions were.
>> > My guess is this has to do with the OS cache of your index files.  When
>> > you make the large index update, that OS cache is useless (old files
>> > are gone, new ones are in) and the OS cache has to get re-warmed, and
>> > this takes time.
>> > 
>> > Are you optimizing your index before the update?  Do you *really* need
>> > to do that?
>> > How large is your update, what makes it big, and could you make it
>> > smaller?
>> > 
>> > Otis
>> > --
>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> > 
>> > 
>> > 
>> > - Original Message 
>> >> From: oleg_gnatovskiy 
>> >> To: solr-user@lucene.apache.org
>> >> Sent: Tuesday, January 20, 2009 6:19:46 PM
>> >> Subject: Re: Query Performance while updating the index
>> >> 
>> >> 
>> >> Hello again. It seems that we are still having these problems. Queries
>> >> take as long as 20 minutes to get back to their average response time
>> >> after a large index update, so it doesn't seem like the problem is the
>> >> 12 second autowarm time. Are there any more suggestions for things we
>> >> can try? Taking our servers out of the loop for as long as 20 minutes
>> >> is a bit of a hassle, and a risk.
>> >> -- 
>> >> View this message in context: 
>> >> 
>> http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21573927.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> > 
>> > 
>> > 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21611642.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21611976.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2009-01-22 Thread oleg_gnatovskiy

We do optimize the index before updates but we get these performance issues
even when we pull an empty snapshot. Thus even when our update is tiny, the
performance issues still happen.



Otis Gospodnetic wrote:
> 
> This is an old and long thread, and I no longer recall what the specific
> suggestions were.
> My guess is this has to do with the OS cache of your index files.  When
> you make the large index update, that OS cache is useless (old files are
> gone, new ones are in) and the OS cache has to get re-warmed and this takes
> time.
> 
> Are you optimizing your index before the update?  Do you *really* need to
> do that?
> How large is your update, what makes it big, and could you make it
> smaller?
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: oleg_gnatovskiy 
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, January 20, 2009 6:19:46 PM
>> Subject: Re: Query Performance while updating the index
>> 
>> 
>> Hello again. It seems that we are still having these problems. Queries
>> take as long as 20 minutes to get back to their average response time
>> after a large index update, so it doesn't seem like the problem is the 12
>> second autowarm time. Are there any more suggestions for things we can
>> try? Taking our servers out of the loop for as long as 20 minutes is a bit
>> of a hassle, and a risk.
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21573927.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21611642.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Random queries extremely slow

2009-01-22 Thread oleg_gnatovskiy

Well this probably isn't the cause of our random slow queries, but might be
the cause of the slow queries after pulling a new index. Is there anything
we could do to reduce the performance hit we take from this happening?



Otis Gospodnetic wrote:
> 
> Here is one example: pushing a large newly optimized index onto the
> server.
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: oleg_gnatovskiy 
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, January 22, 2009 2:22:51 PM
>> Subject: Re: Random queries extremely slow
>> 
>> 
>> What are some things that could happen to force files out of the cache on
>> a Linux machine? I don't know what kinds of events to look for...
>> 
>> 
>> 
>> 
>> yonik wrote:
>> > 
>> > On Thu, Jan 22, 2009 at 1:46 PM, oleg_gnatovskiy wrote:
>> >> Hello. Our production servers are operating relatively smoothly most
>> >> of the time running Solr with 19 million listings. However every once
>> >> in a while the same query that used to take 100 milliseconds takes
>> >> 6000.
>> > 
>> > Anything else happening on the system that may have forced some of the
>> > index files out of operating system disk cache at these times?
>> > 
>> > -Yonik
>> > 
>> > 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21611240.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21611454.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Random queries extremely slow

2009-01-22 Thread oleg_gnatovskiy

What are some things that could happen to force files out of the cache on a
Linux machine? I don't know what kinds of events to look for...
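
One way to look for such events, sketched here under assumptions rather than
taken from the thread: sample the `Cached` figure in /proc/meminfo at
intervals and flag sudden drops, then check what else ran at those times
(for example a large file copy or a new index snapshot arriving). The
function names and the 20% threshold are illustrative choices. The total
does not show which files are cached, so this only narrows down the timing.

```python
def parse_meminfo(text):
    """Return the numeric fields of /proc/meminfo as a dict (memory values are in kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def cached_kb(path="/proc/meminfo"):
    """Read the current size of the page cache in kB (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())["Cached"]


def cache_dropped(previous_kb, current_kb, threshold=0.2):
    """True when the page cache shrank by more than `threshold` between two samples."""
    return previous_kb > 0 and (previous_kb - current_kb) / previous_kb > threshold


sample = "MemTotal:       16384 kB\nCached:          8000 kB\n"
print(parse_meminfo(sample)["Cached"])
```

Calling `cached_kb()` once a minute from a small loop or cron job and logging
the value gives a timeline to compare against the slow-query timestamps.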




yonik wrote:
> 
> On Thu, Jan 22, 2009 at 1:46 PM, oleg_gnatovskiy wrote:
>> Hello. Our production servers are operating relatively smoothly most of
>> the time running Solr with 19 million listings. However every once in a
>> while the same query that used to take 100 milliseconds takes 6000.
> 
> Anything else happening on the system that may have forced some of the
> index files out of operating system disk cache at these times?
> 
> -Yonik
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21611240.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Random queries extremely slow

2009-01-22 Thread oleg_gnatovskiy

Actually my issue might merit a separate discussion, as I did tuning by
adjusting the heap to different settings to see how it changed performance.
It really had no effect: with JDK 1.6, garbage collection is parallel and
should no longer interfere with requests during collection, which holds true
based on the tests we ran.



oleg_gnatovskiy wrote:
> 
> My apologies, this is likely the same issue as "Intermittent high response
> times" by hbi dev.
> 
> 
> 
> oleg_gnatovskiy wrote:
>> 
>> Hello. Our production servers are operating relatively smoothly most of
>> the time running Solr with 19 million listings. However every once in a
>> while the same query that used to take 100 milliseconds takes 6000. This
>> causes our health check to fail, and the server is taken out of service.
>> Once the server is put back in service, queries are back to their regular
>> response times. Is there anything we could do to stop this random
>> slowness from occurring?
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21610972.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Random queries extremely slow

2009-01-22 Thread oleg_gnatovskiy

My apologies, this is likely the same issue as "Intermittent high response
times" by hbi dev.



oleg_gnatovskiy wrote:
> 
> Hello. Our production servers are operating relatively smoothly most of
> the time running Solr with 19 million listings. However every once in a
> while the same query that used to take 100 milliseconds takes 6000. This
> causes our health check to fail, and the server is taken out of service.
> Once the server is put back in service, queries are back to their regular
> response times. Is there anything we could do to stop this random slowness
> from occurring?
> 

-- 
View this message in context: 
http://www.nabble.com/Random-queries-extremely-slow-tp21610568p21610660.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2009-01-21 Thread oleg_gnatovskiy

What exactly does Solr do when it receives a new index? How does it keep
serving queries while performing the update? It seems that the part that
causes the slowdown is this transition.




Otis Gospodnetic wrote:
> 
> This is an old and long thread, and I no longer recall what the specific
> suggestions were.
> My guess is this has to do with the OS cache of your index files.  When
> you make the large index update, that OS cache is useless (old files are
> gone, new ones are in) and the OS cache has to get re-warmed and this takes
> time.
> 
> Are you optimizing your index before the update?  Do you *really* need to
> do that?
> How large is your update, what makes it big, and could you make it
> smaller?
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: oleg_gnatovskiy 
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, January 20, 2009 6:19:46 PM
>> Subject: Re: Query Performance while updating the index
>> 
>> 
>> Hello again. It seems that we are still having these problems. Queries
>> take as long as 20 minutes to get back to their average response time
>> after a large index update, so it doesn't seem like the problem is the 12
>> second autowarm time. Are there any more suggestions for things we can
>> try? Taking our servers out of the loop for as long as 20 minutes is a bit
>> of a hassle, and a risk.
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21573927.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21588779.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using Threading while Indexing.

2009-01-20 Thread oleg_gnatovskiy

I can verify that multithreaded loading using HTTP does work. That's probably
the way to go.
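
A minimal sketch of what multithreaded loading over HTTP can look like,
assuming Solr's XML update handler is reachable at a hypothetical
`http://localhost:8983/solr/update`. The helper names, the batch layout and
the worker count are illustrative and not taken from the thread. Each thread
posts its own batch and Solr serializes the writes, so the clients never
compete for the index write lock.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(docs):
    """Render documents (dicts of field -> value) as one Solr XML <add> message."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append("<field name=%s>%s</field>" % (quoteattr(name), escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def post_to_solr(xml, update_url="http://localhost:8983/solr/update"):
    """POST one update message; update_url is an assumed location."""
    request = Request(update_url, data=xml.encode("utf-8"),
                      headers={"Content-Type": "text/xml; charset=utf-8"})
    with urlopen(request) as response:
        return response.status


def index_in_threads(batches, send=post_to_solr, workers=3):
    """Send each batch from its own worker thread and collect the results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda batch: send(build_add_xml(batch)), batches))


# Only renders a message; index_in_threads() with the default sender needs a running Solr.
print(build_add_xml([{"id": 1, "title": "a & b"}]))
```

The `send` parameter exists so the threading can be tried without a server.
After the last batch a `<commit/>` message would normally be posted once to
make the new documents searchable.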



zayhen wrote:
> 
> Your 3 instances are trying to acquire the physical lock to the index.
> If you want to use multi-threaded indexing, I would suggest the HTTP
> interface, as Solr will control the request queue for you and index as many
> docs as it can receive from your open threads (resource-wise, obviously).
> 
> 2009/1/19 Sagar Khetkade 
> 
>>
>> Hi,
>>
>> I was trying to index three sets of documents with 2000 articles using
>> three threads of an embedded Solr server. But while indexing, I get the
>> exception "org.apache.lucene.store.LockObtainFailedException: Lock obtain
>> timed out: SingleInstanceLock: write.lock".  I know that this issue
>> exists with Lucene; is it the same with Solr?
>>
>> Thanks and Regards,
>> Sagar Khetkade.
>>
> 
> 
> 
> -- 
> Alexander Ramos Jardim
> 
> 
> -
> RPG da Ilha 
> 

-- 
View this message in context: 
http://www.nabble.com/Using-Threading-while-Indexing.-tp21537667p21574047.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2009-01-20 Thread oleg_gnatovskiy

Hello again. It seems that we are still having these problems. Queries take
as long as 20 minutes to get back to their average response time after a
large index update, so it doesn't seem like the problem is the 12 second
autowarm time. Are there any more suggestions for things we can try? Taking
our servers out of the loop for as long as 20 minutes is a bit of a hassle,
and a risk.
-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p21573927.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Query Performance while updating the index

2008-12-12 Thread oleg_gnatovskiy

Should this autowarm value be set based on the number of lookups? From the
info I provided, that is about 60k:  filterCache{lookups=58522

Will 25k be enough?

Also, does that mean that we have to increase the size and initial size to
at least the value we set for autowarm?
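
To make the comparison Todd describes concrete, here is a small sketch that
parses two of the filterCache lines from the autowarming log quoted in this
thread and reports how much of the old searcher's cache was carried over.
The function names are made up for illustration, and "coverage" is only a
rough indicator, not an official Solr metric.

```python
import re


def parse_cache_stats(line):
    """Parse 'name{key=value,...}' as printed in the SolrIndexSearcher warm log."""
    match = re.match(r"\s*(\w+)\{(.*)\}\s*$", line)
    if match is None:
        raise ValueError("not a cache stats line: %r" % line)
    name, body = match.groups()
    stats = {}
    for pair in body.split(","):
        key, _, value = pair.partition("=")
        stats[key] = float(value) if "." in value else int(value)
    return name, stats


def autowarm_coverage(before, after):
    """Fraction of the old searcher's cache entries present in the new searcher."""
    return after["size"] / before["size"] if before["size"] else 1.0


# Stats of the old searcher's filterCache, then of the newly warmed one (from the log).
BEFORE = ("filterCache{lookups=351993,hits=347055,hitratio=0.98,inserts=8332,"
          "evictions=0,size=8245,warmupTime=215,cumulative_lookups=2837676,"
          "cumulative_hits=2766551,cumulative_hitratio=0.97,"
          "cumulative_inserts=72050,cumulative_evictions=0}")
AFTER = ("filterCache{lookups=0,hits=0,hitratio=0.00,inserts=1000,evictions=0,"
         "size=1000,warmupTime=317,cumulative_lookups=2837676,"
         "cumulative_hits=2766551,cumulative_hitratio=0.97,"
         "cumulative_inserts=72050,cumulative_evictions=0}")

name, old = parse_cache_stats(BEFORE)
_, new = parse_cache_stats(AFTER)
print("%s: %.0f%% of entries autowarmed" % (name, 100 * autowarm_coverage(old, new)))
# prints: filterCache: 12% of entries autowarmed
```

By this measure the entry count of the old cache (8245 here), rather than the
lookup count, is the more natural reference point for choosing autowarmCount.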


Feak, Todd wrote:
> 
> Sorry, my bad. Didn't read the entire thread.
> 
> Look at your filter cache first. You are autowarming 1000, and there is
> exactly 1000 in there. Yet it looks like there may be tens of thousands
> of filter queries in your system. I would try autowarming more. Try
> 10,000 or 20,000 and see if it helps.
> 
> Second look at your document cache. Document caches don't use autowarm.
> But you can add queries to your firstSearcher and newSearcher entries in
> your solrconfig to pre-populate the document cache during warming.
> 
> -Todd Feak
> 
> 
> -Original Message-
> From: oleg_gnatovskiy [mailto:oleg_gnatovs...@citysearch.com] 
> Sent: Friday, December 12, 2008 11:19 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Query Performance while updating the index
> 
> 
> The auto warm time is not an issue. We take the server off the load
> balancer while it is autowarming. It seems that the slowness occurs after
> autowarm is done.
> 
> 
> 
> Feak, Todd wrote:
>> 
>> It's spending 4-5 seconds warming up your query cache. If 4-5 seconds is
>> too much, you could reduce the number of queries to auto-warm with on
>> that cache.
>> 
>> Notice that the 4-5 seconds is spent only putting about 420 queries into
>> the query cache. Your autowarm of 5 for the query cache seems a bit
>> high. If you need to reduce that autowarm time below 5 seconds, you may
>> have to set that value in the hundreds, as opposed to tens of thousands.
>> 
>> -Todd Feak
>> 
>> -Original Message-
>> From: oleg_gnatovskiy [mailto:oleg_gnatovs...@citysearch.com] 
>> Sent: Friday, December 12, 2008 10:08 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Query Performance while updating the index
>> 
>> 
>> Here's what we have on one of the data slaves for the autowarming.
>> 
>>  
>> 
>> --
>> 
>> Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm
>> 
>> INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main
>> 
>> filterCache{lookups=351993,hits=347055,hitratio=0.98,inserts=8332,evictions=0,size=8245,warmupTime=215,cumulative_lookups=2837676,cumulative_hits=2766551,cumulative_hitratio=0.97,cumulative_inserts=72050,cumulative_evictions=0}
>> 
>> Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm
>> 
>> INFO: autowarming result for searc...@3f32ca2b main
>> 
>> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=1000,evictions=0,size=1000,warmupTime=317,cumulative_lookups=2837676,cumulative_hits=2766551,cumulative_hitratio=0.97,cumulative_inserts=72050,cumulative_evictions=0}
>> 
>> Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm
>> 
>> INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main
>> 
>> queryResultCache{lookups=5309,hits=5223,hitratio=0.98,inserts=422,evictions=0,size=421,warmupTime=4628,cumulative_lookups=77802,cumulative_hits=77216,cumulative_hitratio=0.99,cumulative_inserts=424,cumulative_evictions=0}
>> 
>> --
>> 
>> Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm
>> 
>> INFO: autowarming result for searc...@3f32ca2b main
>> 
>> queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=421,evictions=0,size=421,warmupTime=5536,cumulative_lookups=77804,cumulative_hits=77218,cumulative_hitratio=0.99,cumulative_inserts=424,cumulative_evictions=0}
>> 
>> Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm
>> 
>> INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main
>> 
>> documentCache{lookups=87216,hits=86686,hitratio=0.99,inserts=570,evictions=0,size=570,warmupTime=0,cumulative_lookups=1270773,cumulative_hits=1268318,cumulative_hitratio=0.99,cumulative_inserts=2455,cumulative_evictions=0}
>> 
>> Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm
>> 
>> INFO: autowarming result for searc...@3f32ca2b main
>> 
>>
>>
> documentCache{looku

RE: Query Performance while updating the index

2008-12-12 Thread oleg_gnatovskiy

I just verified this. The slowness occurs after auto warm is done.

Oleg

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20982068.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Query Performance while updating the index

2008-12-12 Thread oleg_gnatovskiy

The auto warm time is not an issue. We take the server off the load balancer
while it is autowarming. It seems that the slowness occurs after autowarm is
done.



Feak, Todd wrote:
> 
> It's spending 4-5 seconds warming up your query cache. If 4-5 seconds is
> too much, you could reduce the number of queries to auto-warm with on
> that cache.
> 
> Notice that the 4-5 seconds is spent only putting about 420 queries into
> the query cache. Your autowarm of 5 for the query cache seems a bit
> high. If you need to reduce that autowarm time below 5 seconds, you may
> have to set that value in the hundreds, as opposed to tens of thousands.
> 
> -Todd Feak
> 
> -Original Message-
> From: oleg_gnatovskiy [mailto:oleg_gnatovs...@citysearch.com] 
> Sent: Friday, December 12, 2008 10:08 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Query Performance while updating the index
> 
> 
> Here's what we have on one of the data slaves for the autowarming.
> 
>  
> 
> --
> 
> Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm
> 
> INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main
> 
> filterCache{lookups=351993,hits=347055,hitratio=0.98,inserts=8332,evictions=0,size=8245,warmupTime=215,cumulative_lookups=2837676,cumulative_hits=2766551,cumulative_hitratio=0.97,cumulative_inserts=72050,cumulative_evictions=0}
> 
> Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm
> 
> INFO: autowarming result for searc...@3f32ca2b main
> 
> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=1000,evictions=0,size=1000,warmupTime=317,cumulative_lookups=2837676,cumulative_hits=2766551,cumulative_hitratio=0.97,cumulative_inserts=72050,cumulative_evictions=0}
> 
> Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm
> 
> INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main
> 
> queryResultCache{lookups=5309,hits=5223,hitratio=0.98,inserts=422,evictions=0,size=421,warmupTime=4628,cumulative_lookups=77802,cumulative_hits=77216,cumulative_hitratio=0.99,cumulative_inserts=424,cumulative_evictions=0}
> 
> --
> 
> Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm
> 
> INFO: autowarming result for searc...@3f32ca2b main
> 
> queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=421,evictions=0,size=421,warmupTime=5536,cumulative_lookups=77804,cumulative_hits=77218,cumulative_hitratio=0.99,cumulative_inserts=424,cumulative_evictions=0}
> 
> Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm
> 
> INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main
> 
> documentCache{lookups=87216,hits=86686,hitratio=0.99,inserts=570,evictions=0,size=570,warmupTime=0,cumulative_lookups=1270773,cumulative_hits=1268318,cumulative_hitratio=0.99,cumulative_inserts=2455,cumulative_evictions=0}
> 
> Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm
> 
> INFO: autowarming result for searc...@3f32ca2b main
> 
> documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=1270773,cumulative_hits=1268318,cumulative_hitratio=0.99,cumulative_inserts=2455,cumulative_evictions=0}
> 
> --
> 
>  
> 
> These are our current values after I've adjusted them a few times trying
> to get better performance.
> 
> <filterCache
>   class="solr.LRUCache"
>   size="3"
>   initialSize="15000"
>   autowarmCount="1000"/>
> 
> <queryResultCache
>   class="solr.LRUCache"
>   size="6"
>   initialSize="3"
>   autowarmCount="5"/>
> 
> <documentCache
>   class="solr.LRUCache"
>   size="20"
>   initialSize="125000"
>   autowarmCount="0"/>
> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20980669.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20981862.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2008-12-12 Thread oleg_gnatovskiy

Here’s what we have on one of the data slaves for the autowarming.

 

--

Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm

INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main

   
filterCache{lookups=351993,hits=347055,hitratio=0.98,inserts=8332,evictions=0,size=8245,warmupTime=215,cumulative_lookups=2837676,cumulative_hits=2766551,cumulative_hitratio=0.97,cumulative_inserts=72050,cumulative_evictions=0}

Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm

INFO: autowarming result for searc...@3f32ca2b main

   
filterCache{lookups=0,hits=0,hitratio=0.00,inserts=1000,evictions=0,size=1000,warmupTime=317,cumulative_lookups=2837676,cumulative_hits=2766551,cumulative_hitratio=0.97,cumulative_inserts=72050,cumulative_evictions=0}

Dec 12, 2008 8:46:02 AM org.apache.solr.search.SolrIndexSearcher warm

INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main

   
queryResultCache{lookups=5309,hits=5223,hitratio=0.98,inserts=422,evictions=0,size=421,warmupTime=4628,cumulative_lookups=77802,cumulative_hits=77216,cumulative_hitratio=0.99,cumulative_inserts=424,cumulative_evictions=0}

--

Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm

INFO: autowarming result for searc...@3f32ca2b main

   
queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=421,evictions=0,size=421,warmupTime=5536,cumulative_lookups=77804,cumulative_hits=77218,cumulative_hitratio=0.99,cumulative_inserts=424,cumulative_evictions=0}

Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm

INFO: autowarming searc...@3f32ca2b main from searc...@443ad545 main

   
documentCache{lookups=87216,hits=86686,hitratio=0.99,inserts=570,evictions=0,size=570,warmupTime=0,cumulative_lookups=1270773,cumulative_hits=1268318,cumulative_hitratio=0.99,cumulative_inserts=2455,cumulative_evictions=0}

Dec 12, 2008 8:46:07 AM org.apache.solr.search.SolrIndexSearcher warm

INFO: autowarming result for searc...@3f32ca2b main

   
documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=1270773,cumulative_hits=1268318,cumulative_hitratio=0.99,cumulative_inserts=2455,cumulative_evictions=0}

--
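
A minimal sketch, not part of the original message, of how the cache statistics lines above could be parsed so that warm-up times can be compared per cache. It assumes the `name{key=value,...}` layout shown in the log; `parse_cache_stats` is a hypothetical helper name, not a Solr API.

```python
import re


def parse_cache_stats(line):
    """Split a 'cacheName{k=v,k=v,...}' log line into (name, dict of numbers)."""
    match = re.match(r"\s*(\w+)\{(.*)\}\s*$", line)
    if match is None:
        raise ValueError("not a cache statistics line: %r" % line)
    name, body = match.groups()
    stats = {}
    for pair in body.split(","):
        key, _, value = pair.partition("=")
        # Ratios are logged as decimals; every other value in the log is an integer.
        stats[key] = float(value) if "." in value else int(value)
    return name, stats


# Line copied from the log above (autowarming result for queryResultCache).
line = ("queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=421,"
        "evictions=0,size=421,warmupTime=5536,cumulative_lookups=77804,"
        "cumulative_hits=77218,cumulative_hitratio=0.99,"
        "cumulative_inserts=424,cumulative_evictions=0}")
name, stats = parse_cache_stats(line)
print(name, stats["warmupTime"])
```

Run over the log above, this would show queryResultCache reporting a warmupTime of 5536 against 317 for filterCache and 0 for documentCache.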

 

These are our current values after I’ve messed with them a few times trying to
get better performance.

 








-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20980669.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2008-12-12 Thread oleg_gnatovskiy

Hey Otis,

Do you think our problem is slow warm time, or too few items that are being
copied?

Oleg

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20980523.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: Query Performance while updating the index

2008-12-11 Thread oleg_gnatovskiy

We are still having this problem. I am wondering if it can be fixed with
autowarm settings. Is there a reliable formula for determining the autowarm
settings?
-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20968516.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2008-11-12 Thread oleg_gnatovskiy

Well we never had 1.2 deployed, so I don't know if it's a new issue or not...


Yonik Seeley wrote:
> 
> Warming only uses one CPU, so it shouldn't have that much of an impact
> on a multi-CPU box.
> 
> Did this issue begin with Solr 1.3?  Perhaps it has something to do
> with our use of reopen() (to share parts of the index that are not in
> use).  This can lead to greater lock contention while reading from the
> index.
> 
> If so, we need
> 1) an option to disable using IndexReader.reopen()  (I think Mark
> already has a patch for this)
> 2) NIO support to reduce/eliminate that contention on non Windows
> platforms (a work in progress - the last patch doesn't actually do it)
> 
> -Yonik
> 
> 
> On Wed, Nov 12, 2008 at 2:16 PM, Lance Norskog <[EMAIL PROTECTED]> wrote:
>> Yes, this is the cache autowarming.
>>
>> We turned this off and staged separate queries that pre-warm our standard
>> queries. We are looking at pulling the query server out of the load
>> balancer
>> during this process; it is the most effective way to give fixed response
>> time.
>>
>> Lance
>>
>> -Original Message-
>> From: oleg_gnatovskiy [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, November 12, 2008 11:07 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Query Performance while updating the index
>>
>>
>> The rsync seems to have nothing to do with slowness, because while the
>> rsync
>> is going on, there isn't any reload occurring, once the files are on the
>> system, it tries a curl request to reload the searcher, which at that
>> point
>> causes the delays. The file transfer probably has nothing to do with
>> this.
>> Does this mean that it happens during warming?
>>
>>
>>
>> Yonik Seeley wrote:
>>>
>>> On Tue, Nov 11, 2008 at 9:31 PM, oleg_gnatovskiy
>>> <[EMAIL PROTECTED]> wrote:
>>>> Hello. We have an index with 15 million documents working on a
>>>> distributed environment, with an index distribution setup. While an
>>>> index on a slave server is being updated, query response times become
>>>> extremely slow (upwards of 5 seconds). Is there any way to decrease
>>>> the hit query response times take while an index is being pushed?
>>>
>>> Can you tell why it's getting slow?  Is this during warming, or does
>>> it begin during the actual transfer of the new index?
>>>
>>> One possibility is that the new index being copied forces out parts of
>>> the old index from the OS cache.  More memory would help in that
>>> scenario.
>>>
>>> -Yonik
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20467099.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20469525.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2008-11-12 Thread oleg_gnatovskiy



Yonik Seeley wrote:
> 
> On Wed, Nov 12, 2008 at 2:06 PM, oleg_gnatovskiy
> <[EMAIL PROTECTED]> wrote:
>> The rsync seems to have nothing to do with slowness, because while the
>> rsync
>> is going on, there isn't any reload occurring, once the files are on the
>> system, it tries a curl request to reload the searcher, which at that
>> point
>> causes the delays. The file transfer probably has nothing to do with
>> this.
>> Does this mean that it happens during warming?
> 
> Yes, it would seem so.
> It could either be that 1) warming the new reader slows down the
> current reader used to service queries
> or 2) the first queries to come into the new reader are slow (which
> can be solved with some static warming queries to load sort fields,
> facet caches, etc).
> 
> How many CPUs does the box have that you are running on?  What OS?
> 
> -Yonik
> 
> 
> 
>> Yonik Seeley wrote:
>>>
>>> On Tue, Nov 11, 2008 at 9:31 PM, oleg_gnatovskiy
>>> <[EMAIL PROTECTED]> wrote:
>>>> Hello. We have an index with 15 million documents working on a
>>>> distributed
>>>> environment, with an index distribution setup. While an index on a
>>>> slave
>>>> server is being updated, query response times become extremely slow
>>>> (upwards
>>>> of 5 seconds). Is there any way to decrease the hit query response
>>>> times
>>>> take while an index is being pushed?
>>>
>>> Can you tell why it's getting slow?  Is this during warming, or does
>>> it begin during the actual transfer of the new index?
>>>
>>> One possibility is that the new index being copied forces out parts of
>>> the old index from the OS cache.  More memory would help in that
>>> scenario.
>>>
>>> -Yonik
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20467099.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

8 CPUs and Linux OS

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20467281.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Query Performance while updating the index

2008-11-12 Thread oleg_gnatovskiy

The rsync seems to have nothing to do with the slowness: while the rsync is
going on, no reload occurs. Once the files are on the system, a curl request
is sent to reload the searcher, and that is the point at which the delays
start. The file transfer probably has nothing to do with this.
Does this mean that it happens during warming?



Yonik Seeley wrote:
> 
> On Tue, Nov 11, 2008 at 9:31 PM, oleg_gnatovskiy
> <[EMAIL PROTECTED]> wrote:
>> Hello. We have an index with 15 million documents working on a
>> distributed
>> environment, with an index distribution setup. While an index on a slave
>> server is being updated, query response times become extremely slow
>> (upwards
>> of 5 seconds). Is there any way to decrease the hit query response times
>> take while an index is being pushed?
> 
> Can you tell why it's getting slow?  Is this during warming, or does
> it begin during the actual transfer of the new index?
> 
> One possibility is that the new index being copied forces out parts of
> the old index from the OS cache.  More memory would help in that
> scenario.
> 
> -Yonik
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20467099.html
Sent from the Solr - User mailing list archive at Nabble.com.



Query Performance while updating the index

2008-11-11 Thread oleg_gnatovskiy

Hello. We have an index with 15 million documents running in a distributed
environment, with an index distribution setup. While an index on a slave
server is being updated, query response times become extremely slow (upwards
of 5 seconds). Is there any way to reduce the hit that query response times
take while an index is being pushed?
Thanks!
Oleg
-- 
View this message in context: 
http://www.nabble.com/Query-Performance-while-updating-teh-index-tp20452835p20452835.html
Sent from the Solr - User mailing list archive at Nabble.com.



Different XML format for multi-valued fields?

2008-10-16 Thread oleg_gnatovskiy

Hello. I have an index built in Solr with several multi-value fields. When
the multi-value field has only one value for a document, the XML returned
looks like this: 

5693

However, when there are multiple values for the field, the XML looks like
this: 
<arr name="someIds">
11199
1722
</arr>

Is there a reason for this difference? Also, how does faceting work with
multi-valued fields? It seems that I sometimes get facet results from
multi-valued fields, and sometimes I don't.

Thanks.
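
A hedged sketch, not from the original thread, of how a client could treat both response shapes the same way by always reading the field as a list. The element tags were lost from the examples above, so the `<int>` element type and the `<doc>` wrapper used here are assumptions, and `field_values` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def field_values(doc, field_name):
    """Return the values of field_name as a list, whatever the response shape."""
    for child in doc:
        if child.get("name") != field_name:
            continue
        if child.tag == "arr":
            # Multi-valued shape: one child element per value.
            return [item.text for item in child]
        # Single-valued shape: the element itself carries the value.
        return [child.text]
    return []


# Assumed response fragments; the <int> element type is a guess.
single = ET.fromstring('<doc><int name="someIds">5693</int></doc>')
multi = ET.fromstring(
    '<doc><arr name="someIds"><int>11199</int><int>1722</int></arr></doc>')
print(field_values(single, "someIds"))
print(field_values(multi, "someIds"))
```

This only hides the difference on the client side; it does not explain why the server returns two shapes.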
-- 
View this message in context: 
http://www.nabble.com/Different-XML-format-for-multi-valued-fields--tp20015951p20015951.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: File based index doesn't work in spellcheck component

2008-09-19 Thread oleg_gnatovskiy



oleg_gnatovskiy wrote:
> 
> Hello,
> 
> I tried to get the spellcheck component to write its index to disk. My
> config is as follows:
> 
>  
>name="classname">org.apache.solr.spelling.FileBasedSpellChecker
>   external
>   spellings.txt
>   UTF-8
>   true
>   
>   ./spellIndex
>="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance
>
> 
> However, the spell checker still seems to use RAMDirectory. No spellIndex
> directory is created, and the index is lost every time the server is
> restarted.
> 


That works, thanks.
-- 
View this message in context: 
http://www.nabble.com/File-based-index-doesn%27t-work-in-spellcheck-component-tp19576916p19577276.html
Sent from the Solr - User mailing list archive at Nabble.com.



File based index doesn't work in spellcheck component

2008-09-19 Thread oleg_gnatovskiy

Hello,

I tried to get the spellcheck component to write its index to disk. My
config is as follows:

 
  org.apache.solr.spelling.FileBasedSpellChecker
  external
  spellings.txt
  UTF-8
  true
  
  ./spellIndex
  org.apache.lucene.search.spell.JaroWinklerDistance
   

However, the spell checker still seems to use RAMDirectory. No spellIndex
directory is created, and the index is lost every time the server is
restarted.
-- 
View this message in context: 
http://www.nabble.com/File-based-index-doesn%27t-work-in-spellcheck-component-tp19576916p19576916.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: firstSearcher and newSearcher events

2008-09-19 Thread oleg_gnatovskiy

I created one. https://issues.apache.org/jira/browse/SOLR-780

By the way you pointed out that true
would solve the problem, but that doesn't make it rebuild on startup right?
This works at rebuilding the index with every update, which is different.


Shalin Shekhar Mangar wrote:
> 
> On Fri, Sep 19, 2008 at 10:07 PM, oleg_gnatovskiy <
> [EMAIL PROTECTED]> wrote:
> 
>>
>> Is there any way to do it for an external (file-based) dictionary?
>>
> 
> SpellCheckComponent always reload on the dictionary in the firstSearcher
> event. This works if you are using file system based index. However with
> RAMDirectory, reload does nothing. You need to explicitly call
> spellcheck.build on firstSearcher. The configuration snippet for
> firstSearcher you posted seems fine. I'll run a test locally to see if I
> can
> reproduce the problem.
> 
> Can you open a jira issue so that we can enhance SpellCheckComponent to
> automatically build on firstSearcher in case of RAMDirectory based
> indices?
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/firstSearcher-and-newSearcher-events-tp19564163p19576665.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: firstSearcher and newSearcher events

2008-09-19 Thread oleg_gnatovskiy



Shalin Shekhar Mangar wrote:
> 
> On Fri, Sep 19, 2008 at 5:55 AM, oleg_gnatovskiy <
> [EMAIL PROTECTED]> wrote:
> 
>>
>> Hello. I am using the spellcheck component
>> (https://issues.apache.org/jira/browse/SOLR-572). Since the spell checker
>> index is kept in RAM, it gets erased every time the Solr server gets
>> restarted. I was thinking of using either the firstSearcher or the
>> newSearcher to reload the index every time Solr starts.
> 
> 
> This capability is already in SpellCheckComponent:
> 
> http://wiki.apache.org/solr/SpellCheckComponent#onCommit
> 
> 
>>
>> --
>> View this message in context:
>> http://www.nabble.com/firstSearcher-and-newSearcher-events-tp19564163p19564163.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 


Is there any way to do it for an external (file-based) dictionary?
-- 
View this message in context: 
http://www.nabble.com/firstSearcher-and-newSearcher-events-tp19564163p19575665.html
Sent from the Solr - User mailing list archive at Nabble.com.



firstSearcher and newSearcher events

2008-09-18 Thread oleg_gnatovskiy

Hello. I am using the spellcheck component
(https://issues.apache.org/jira/browse/SOLR-572). Since the spell checker
index is kept in RAM, it gets erased every time the Solr server gets
restarted. I was thinking of using either the firstSearcher or the
newSearcher to reload the index every time Solr starts. The events are
defined as follows:




true
external
true
piza






fast_warm
0
10




static firstSearcher warming query from solrconfig.xml



true
external
true
piza




However, the index does not load. When I checked the logs, I noticed the
following.
When the event runs, the log looks like this:

INFO: [] webapp=null path=null
params={spellcheck=true&q=piza&spellcheck.dictionary=external&spellcheck.build=true}
hits=0 status=0 QTime=1

A regular request looks like this:

INFO: [] webapp=/solr path=/select/
params={spellcheck=true&q=piza&spellcheck.dictionary=external&spellcheck.build=true}
hits=0 status=0 QTime=19459

I am guessing that the reason it doesn't work with the autowarm is that the
webapp is null. Does anyone have any ideas what I can do to load that index
in advance?
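
Since the regular request in the log above appears to do the build (QTime=19459), one possible workaround is to send that same request from a script once the server is up. The sketch below is an illustration only: it rebuilds the URL from the logged parameters, the host and port `localhost:8983` are an assumption, and `spellcheck_build_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def spellcheck_build_url(base_url, query, dictionary):
    """Build a select URL carrying the spellcheck parameters seen in the log."""
    params = [
        ("q", query),
        ("spellcheck", "true"),
        ("spellcheck.dictionary", dictionary),
        ("spellcheck.build", "true"),
    ]
    return base_url + "?" + urlencode(params)


# Host and port are assumed; the path and parameters come from the log above.
url = spellcheck_build_url(
    "http://localhost:8983/solr/select/", "piza", "external")
print(url)
# Sending it would be a plain GET, e.g. urllib.request.urlopen(url).
```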
-- 
View this message in context: 
http://www.nabble.com/firstSearcher-and-newSearcher-events-tp19564163p19564163.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Problem getting spelling suggestions to work

2008-05-19 Thread oleg_gnatovskiy

That's true, but that's not the problem. The problem is that you can't call
qt=spellchecker if you redefine /select in solrconfig.xml. I was wondering
how I could add qt functionality back.



Otis Gospodnetic wrote:
> 
> I haven't actually used this in a while, but are you asking the handler
> for spellchecking (q=pizzza) or are you asking it to rebuild the index
> (cmd=rebuild)?  Asking for both at the same time might not be the best
> thing.
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> - Original Message 
>> From: oleg_gnatovskiy <[EMAIL PROTECTED]>
>> To: solr-user@lucene.apache.org
>> Sent: Monday, May 19, 2008 9:01:14 PM
>> Subject: Problem getting spelling suggestions to work
>> 
>> 
>> Hello. I am having some trouble getting spelling suggestions to work. I
>> am
>> running the latest nightly build of Solr. The URL I am hitting is:
>> 
>> http://localhost:8983/solr/select/?q=pizzza&qt=spellchecker&cmd=rebuild
>> 
>> and the response I am getting is 
>> 
>> 
>> 
>> 
>> 0
>> 14
>> 
>> 
>> rebuild
>> pizzza
>> spellchecker
>> 
>> 
>> 
>> 
>> 
>> Which is obviously missing the suggestions field. The reason for that is
>> likely that I overrode the default definition of /select. My /select is
>> defined in the following way:
>> 
>> 
>> class="org.apache.solr.handler.component.SearchHandler">
>> 
>>   explicit
>> 
>> 
>>   collapse
>>   facet
>>   mlt
>>   highlight
>>   debug
>> 
>>   
>> The reason I am doing this, is that I want to replace the query component
>> with the collapse component.
>> 
>> Am I missing something that would make the qt parameter work? Any help
>> would
>> be appreciated.
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Problem-getting-spelling-suggestions-to-work-tp17331252p17331252.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Problem-getting-spelling-suggestions-to-work-tp17331252p17333756.html
Sent from the Solr - User mailing list archive at Nabble.com.



Problem getting spelling suggestions to work

2008-05-19 Thread oleg_gnatovskiy

Hello. I am having some trouble getting spelling suggestions to work. I am
running the latest nightly build of Solr. The URL I am hitting is:

http://localhost:8983/solr/select/?q=pizzza&qt=spellchecker&cmd=rebuild

and the response I am getting is 




0
14


rebuild
pizzza
spellchecker





Which is obviously missing the suggestions field. The reason for that is
likely that I overrode the default definition of /select. My /select is
defined in the following way:

 

  explicit


  collapse
  facet
  mlt
  highlight
  debug

  
The reason I am doing this is that I want to replace the query component
with the collapse component.

Am I missing something that would make the qt parameter work? Any help would
be appreciated.
-- 
View this message in context: 
http://www.nabble.com/Problem-getting-spelling-suggestions-to-work-tp17331252p17331252.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Field Grouping

2008-05-14 Thread oleg_gnatovskiy

Yes, that is the patch I am trying to get to work. It doesn't have a feature
for distributed search.


ryantxu wrote:
> 
> You may want to check "field collapsing"
> https://issues.apache.org/jira/browse/SOLR-236
> 
> There is a patch that works against 1.2, but the one for trunk needs  
> some work before it can work...
> 
> ryan
> 
> 
> On May 13, 2008, at 2:46 PM, oleg_gnatovskiy wrote:
>>
>> There is an XSLT example here:
>> http://wiki.apache.org/solr/XsltResponseWriter
>> , but it doesn't seem like that would work either... This example  
>> would only
>> do a group by for the current page. If I use Solr for pagination,  
>> this would
>> not work for me.
>>
>>
>> oleg_gnatovskiy wrote:
>>>
>>> But I don't want the search results to be ranked based on that  
>>> field. I
>>> only want all the documents with the same value grouped together...  
>>> The
>>> way my system is set up, most documents will have that field empty.  
>>> Thus,
>>> if I sort by it, those documents that have a value will bubble to the
>>> top...
>>>
>>>
>>>
>>> Yonik Seeley wrote:
>>>>
>>>> On Mon, May 12, 2008 at 9:58 PM, oleg_gnatovskiy
>>>> <[EMAIL PROTECTED]> wrote:
>>>>> Hello. I was wondering if there is a way to get solr to return  
>>>>> fields
>>>>> with
>>>>> the same value for a particular field together. For example I might
>>>>> want to
>>>>> have all the documents with exactly the same name field all  
>>>>> returned
>>>>> next to
>>>>> each other. Is this possible? Thanks!
>>>>
>>>> Sort by that field.  Since you can only sort by fields with a single
>>>> term at most (this rules out full-text fields), you might want to  
>>>> do a
>>>> copyField of the "name" field to something like a "name_s" field  
>>>> which
>>>> is of type string (which can be sorted on).
>>>>
>>>> -Yonik
>>>>
>>>>
>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/Field-Grouping-tp17199592p17215641.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Field-Grouping-tp17199592p17244589.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Field Grouping

2008-05-13 Thread oleg_gnatovskiy

There is an XSLT example here: http://wiki.apache.org/solr/XsltResponseWriter
, but it doesn't seem like that would work either... This example would only
do a group by for the current page. If I use Solr for pagination, this would
not work for me.


oleg_gnatovskiy wrote:
> 
> But I don't want the search results to be ranked based on that field. I
> only want all the documents with the same value grouped together... The
> way my system is set up, most documents will have that field empty. Thus,
> if I sort by it, those documents that have a value will bubble to the
> top...
> 
> 
> 
> Yonik Seeley wrote:
>> 
>> On Mon, May 12, 2008 at 9:58 PM, oleg_gnatovskiy
>> <[EMAIL PROTECTED]> wrote:
>>>  Hello. I was wondering if there is a way to get solr to return fields
>>> with
>>>  the same value for a particular field together. For example I might
>>> want to
>>>  have all the documents with exactly the same name field all returned
>>> next to
>>>  each other. Is this possible? Thanks!
>> 
>> Sort by that field.  Since you can only sort by fields with a single
>> term at most (this rules out full-text fields), you might want to do a
>> copyField of the "name" field to something like a "name_s" field which
>> is of type string (which can be sorted on).
>> 
>> -Yonik
>> 
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Field-Grouping-tp17199592p17215641.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Field Grouping

2008-05-12 Thread oleg_gnatovskiy

But I don't want the search results to be ranked based on that field. I only
want all the documents with the same value grouped together... The way my
system is set up, most documents will have that field empty. Thus, if I sort
by it, those documents that have a value will bubble to the top...
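
To make the requirement concrete, here is a rough client-side sketch (an assumption of this note, not a Solr feature) that keeps the relevance order, pulls documents with the same non-empty value next to the first one that has it, and leaves documents with an empty field where they are. `group_adjacent` and the sample documents are hypothetical, and it can only regroup the results already fetched.

```python
def group_adjacent(docs, field):
    """Reorder docs so equal non-empty values of `field` sit next to each other.

    A group appears at the rank of its first member. Docs with an empty or
    missing value keep their own position and are never merged.
    """
    groups = {}  # value -> list of docs sharing that value
    order = []   # output slots in rank order; each slot is a list of docs
    for doc in docs:
        value = doc.get(field)
        if not value:
            order.append([doc])
        elif value in groups:
            groups[value].append(doc)
        else:
            groups[value] = [doc]
            order.append(groups[value])
    return [doc for slot in order for doc in slot]


# Hypothetical result list, already in relevance order.
ranked = [
    {"id": 1, "name": "acme"},
    {"id": 2},
    {"id": 3, "name": "zeta"},
    {"id": 4, "name": "acme"},
]
print([d["id"] for d in group_adjacent(ranked, "name")])
```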



Yonik Seeley wrote:
> 
> On Mon, May 12, 2008 at 9:58 PM, oleg_gnatovskiy
> <[EMAIL PROTECTED]> wrote:
>>  Hello. I was wondering if there is a way to get solr to return fields
>> with
>>  the same value for a particular field together. For example I might want
>> to
>>  have all the documents with exactly the same name field all returned
>> next to
>>  each other. Is this possible? Thanks!
> 
> Sort by that field.  Since you can only sort by fields with a single
> term at most (this rules out full-text fields), you might want to do a
> copyField of the "name" field to something like a "name_s" field which
> is of type string (which can be sorted on).
> 
> -Yonik
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Field-Grouping-tp17199592p17201424.html
Sent from the Solr - User mailing list archive at Nabble.com.



Field Grouping

2008-05-12 Thread oleg_gnatovskiy

Hello. I was wondering if there is a way to get Solr to return documents with
the same value for a particular field together. For example I might want to
have all the documents with exactly the same name field all returned next to
each other. Is this possible? Thanks!
-- 
View this message in context: 
http://www.nabble.com/Field-Grouping-tp17199592p17199592.html
Sent from the Solr - User mailing list archive at Nabble.com.



MultiThreaded Document Loader?

2008-04-24 Thread oleg_gnatovskiy

Hello. I was wondering if Solr has some kind of a multi-threaded document
loader? I've been using post.sh (curl) to post documents to my Solr server,
and it's pretty slow. I know it should be pretty easy to write one up, but I
was just wondering if one already existed.
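
For illustration, a minimal sketch of what such a loader could look like using only the Python standard library. It is an assumption-laden example, not an existing Solr tool: `post_xml` and `load_batches` are hypothetical names, and the update URL in the usage comment is assumed.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request


def post_xml(update_url, xml_bytes):
    """POST one XML update message and return the HTTP status code."""
    request = urllib.request.Request(
        update_url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def load_batches(batches, post, workers=4):
    """Send every batch through post() on a pool of worker threads.

    Results come back in the same order as the input batches.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(post, batches))


# Usage sketch (not executed here; the URL is an assumption):
# statuses = load_batches(
#     batches, lambda b: post_xml("http://localhost:8983/solr/update", b))
```

Passing the posting function in as an argument keeps the threading logic testable without a running server. A commit would presumably still need to be sent once after all batches are loaded.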
-- 
View this message in context: 
http://www.nabble.com/MultiThreaded-Document-Loader--tp16853440p16853440.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: too many queries?

2008-04-16 Thread oleg_gnatovskiy

Oh ok. That makes sense. Thanks.

Otis Gospodnetic wrote:
> 
> Oleg, you can't explicitly say "N GB for index".  Wunder was just saying
> how much you can imagine how much RAM each piece might need and be happy
> with.
>  
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> - Original Message 
> From: oleg_gnatovskiy <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, April 16, 2008 2:05:23 PM
> Subject: Re: too many queries?
> 
> 
> Hello. I am having a similar problem as the OP. I see that you recommended
> setting 4GB for the index, and 2 for Solr. How do I allocate memory for
> the
> index? I was under the impression that Solr did not support a RAMIndex.
> 
> 
> Walter Underwood wrote:
>> 
>> Do it. 32-bit OS's went out of style five years ago in server-land.
>> 
>> I would start with 8GB of RAM. 4GB for your index, 2 for Solr, 1 for
>> the OS and 1 for other processes. That might be tight. 12GB would
>> be a lot better.
>> 
>> wunder
>> 
>> On 4/16/08 7:50 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote:
>> 
>>> In order to do that I have to change to a 64 bits OS so I can have more
>>> than
>>> 4 GB of RAM.Is there any way to see how long does it takes to Solr to
>>> warmup
>>> the searcher?
>>> 
>>> On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood
>>> <[EMAIL PROTECTED]>
>>> wrote:
>>> 
>>>> A commit every two minutes means that the Solr caches are flushed
>>>> before they even start to stabilize. Two things to try:
>>>> 
>>>> * commit less often, 5 minutes or 10 minutes
>>>> * have enough RAM that your entire index can fit in OS file buffers
>>>> 
>>>> wunder
>>>> 
>>>> On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote:
>>>> 
>>>>> So I counted the number if distinct values that I have for each field
>>>> that I
>>>>> want a facet on. In total it's around 100,000. I tried with a
>>>> filterCache
>>>>> of 120,000 but it seems like too much because the server went down. I
>>>> will
>>>>> try with less, around 75,000 and let you know.
>>>>> 
>>>>> How do you to partition the data to a static set and a dynamic set,
>>>>> and
>>>> then
>>>>> combining them at query time? Do you have a link to read about that?
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]>
>>>> wrote:
>>>>> 
>>>>>> On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote:
>>>>>> 
>>>>>>> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is
>>>>>>> 32
>>>>>>> bits).
>>>>>>> It is optimized twice a day, it takes around 15 minutes to optimize.
>>>>>>> The index is updated (commits) every two minutes. There are between
>>>>>>> 10
>>>>>>> and
>>>>>>> 100 inserts/updates every 2 minutes.
>>>>>>> 
>>>>>> 
>>>>>> Caching could help--you should definitely start there.
>>>>>> 
>>>>>> The commit every 2 minutes could end up being an unsurmountable
>>>> problem.
>>>>>>  You may have to partition your data into a large, mostly static set
>>>> and a
>>>>>> small dynamic set, combining the results at query time.
>>>>>> 
>>>>>> -Mike
>>>>>> 
>>>> 
>>>> 
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/too-many-queries--tp16690870p16727264.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/too-many-queries--tp16690870p16732932.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: too many queries?

2008-04-16 Thread oleg_gnatovskiy

Hello. I am having a similar problem as the OP. I see that you recommended
setting 4GB for the index, and 2 for Solr. How do I allocate memory for the
index? I was under the impression that Solr did not support a RAMIndex.


Walter Underwood wrote:
> 
> Do it. 32-bit OS's went out of style five years ago in server-land.
> 
> I would start with 8GB of RAM. 4GB for your index, 2 for Solr, 1 for
> the OS and 1 for other processes. That might be tight. 12GB would
> be a lot better.
> 
> wunder
> 
> On 4/16/08 7:50 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote:
> 
>> In order to do that I have to change to a 64 bits OS so I can have more
>> than
>> 4 GB of RAM.Is there any way to see how long does it takes to Solr to
>> warmup
>> the searcher?
>> 
>> On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood
>> <[EMAIL PROTECTED]>
>> wrote:
>> 
>>> A commit every two minutes means that the Solr caches are flushed
>>> before they even start to stabilize. Two things to try:
>>> 
>>> * commit less often, 5 minutes or 10 minutes
>>> * have enough RAM that your entire index can fit in OS file buffers
>>> 
>>> wunder
>>> 
>>> On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote:
>>> 
>>>> So I counted the number if distinct values that I have for each field
>>> that I
>>>> want a facet on. In total it's around 100,000. I tried with a
>>> filterCache
>>>> of 120,000 but it seems like too much because the server went down. I
>>> will
>>>> try with less, around 75,000 and let you know.
>>>> 
>>>> How do you to partition the data to a static set and a dynamic set, and
>>> then
>>>> combining them at query time? Do you have a link to read about that?
>>>> 
>>>> 
>>>> 
>>>> On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]>
>>> wrote:
>>>> 
>>>>> On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote:
>>>>> 
>>>>>> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is
>>>>>> 32
>>>>>> bits).
>>>>>> It is optimized twice a day, it takes around 15 minutes to optimize.
>>>>>> The index is updated (commits) every two minutes. There are between
>>>>>> 10
>>>>>> and
>>>>>> 100 inserts/updates every 2 minutes.
>>>>>> 
>>>>> 
>>>>> Caching could help--you should definitely start there.
>>>>> 
>>>>> The commit every 2 minutes could end up being an unsurmountable
>>> problem.
>>>>>  You may have to partition your data into a large, mostly static set
>>> and a
>>>>> small dynamic set, combining the results at query time.
>>>>> 
>>>>> -Mike
>>>>> 
>>> 
>>> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/too-many-queries--tp16690870p16727264.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Distributed Search

2008-04-09 Thread oleg_gnatovskiy



Yonik Seeley wrote:
> 
> On Wed, Apr 9, 2008 at 1:57 PM, oleg_gnatovskiy
> <[EMAIL PROTECTED]> wrote:
>>  Do you have any suggestions as to how we would be able to implement
>> chain
>>  collapse over the entire distributed index? Our collection is 27 GB, 15
>>  million documents. Do you think there is a way to optimize Solr
>> performance
>>  enough to not have to segment such a large collection?
> 
> What is the current performance bottleneck that is causing you to have
> to segment in the first place?
> 15M docs is often doable on a single box I think, but it depends
> heavily on what the queries are, what faceting is done, etc.
> 
> -Yonik
> 
> 

Well we are running some really heavy faceting, and searching up to 15
fields at a time for each query. The bottleneck was that a single query
either took 15 minutes, or died with a heap space error...

-- 
View this message in context: 
http://www.nabble.com/Distributed-Search-tp16577204p16595616.html



Re: Distributed Search

2008-04-09 Thread oleg_gnatovskiy

Do you have any suggestions as to how we would be able to implement chain
collapse over the entire distributed index? Our collection is 27 GB, 15
million documents. Do you think there is a way to optimize Solr performance
enough to not have to segment such a large collection?


Yonik Seeley wrote:
> 
> On Wed, Apr 9, 2008 at 2:00 AM, oleg_gnatovskiy
> <[EMAIL PROTECTED]> wrote:
>>  We are using the Chain Collapse patch as well. Will that not work over a
>>  distributed index?
> 
> Since there is no explicit distributed support for it, it would only
> collapse per-shard.
> 
> -Yonik
> 
> 
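To illustrate what "collapse per-shard" means in practice, here is a minimal
sketch in Python. It is not Solr code and not the Chain Collapse patch; the
"chain" field name, the documents, and the scores are all hypothetical. Each
shard collapses its own results, so the merged list can still contain more
than one document per chain unless a second collapse is run after merging.

```python
def collapse(docs, key):
    """Keep the first (highest-ranked) document for each value of `key`."""
    seen, kept = set(), []
    for doc in docs:
        if doc[key] not in seen:
            seen.add(doc[key])
            kept.append(doc)
    return kept

# Hypothetical per-shard results, already sorted by score within each shard.
shard_a = [{"id": 1, "chain": "dominos", "score": 9.0},
           {"id": 2, "chain": "dominos", "score": 7.0}]
shard_b = [{"id": 3, "chain": "dominos", "score": 8.0},
           {"id": 4, "chain": "papa", "score": 6.0}]

# Collapsing on each shard separately, then merging by score.
per_shard = collapse(shard_a, "chain") + collapse(shard_b, "chain")
merged = sorted(per_shard, key=lambda d: d["score"], reverse=True)

# A second collapse over the merged list removes the cross-shard duplicate.
globally_collapsed = collapse(merged, "chain")

print([d["id"] for d in merged])              # "dominos" appears twice
print([d["id"] for d in globally_collapsed])  # one document per chain
```

The second pass only fixes the result list; any per-group counts would still
have to be combined separately.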

-- 
View this message in context: 
http://www.nabble.com/Distributed-Search-tp16577204p16592826.html



Re: Nightly build compile error?

2008-04-09 Thread oleg_gnatovskiy



hossman wrote:
> 
> : 
> : Hello everyone. I downloaded the latest nightly build from
> : http://people.apache.org/builds/lucene/solr/nightly/. When I tried to
> : compile it, I got the following errors:
> : 
> : [javac] Compiling 189 source files to
> : /home/csweb/apache-solr-nightly/build/core
> : [javac]
> : /home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/admin/MultiCoreHandler.java:93:
> : cannot find symbol
> : [javac] symbol  : variable CREATE
> 
> I'm not sure how you managed to get that far ... because of some 
> refactoring that was done a little while back, the nightly builds don't 
> currently include all of the source, see SOLR-510.
> 
> The nightly builds do however already contain all the pre-built jars (and 
> war) that you need to run Solr ... if you want to compile from source, I 
> would just check out from subversion.
> 
> 
> 
> -Hoss
> 
> 
> 
Yup, that works.
-- 
View this message in context: 
http://www.nabble.com/Nightly-build-compile-error--tp16577739p16592725.html



Re: Distributed Search

2008-04-08 Thread oleg_gnatovskiy

We are using the Chain Collapse patch as well. Will that not work over a
distributed index?

swarag wrote:
> 
> Hi,
> I am trying to search through a distributed index using this link:
> 
> http://wil1devsch1.cs.tmcs:8983/select?shards=wil1devsch1.cs.tmcs:8983,wil1devsch1.cs.tmcs:8080&q=pizza
> 
> 
> However, it always gives me results from the index stored on 8983 and not
> from the one on 8080.
> Is there anything wrong in what I am doing?
> 

-- 
View this message in context: 
http://www.nabble.com/Distributed-Search-tp16577204p16580104.html



Nightly build compile error?

2008-04-08 Thread oleg_gnatovskiy

Hello everyone. I downloaded the latest nightly build from
http://people.apache.org/builds/lucene/solr/nightly/. When I tried to
compile it, I got the following errors:

[javac] Compiling 189 source files to
/home/csweb/apache-solr-nightly/build/core
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/admin/MultiCoreHandler.java:93:
cannot find symbol
[javac] symbol  : variable CREATE
[javac] location: class
org.apache.solr.common.params.MultiCoreParams.MultiCoreAction
[javac] if (action == MultiCoreAction.CREATE) {
[javac]  ^
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/admin/MultiCoreHandler.java:95:
cannot find symbol
[javac] symbol  : variable NAME
[javac] location: interface
org.apache.solr.common.params.MultiCoreParams
[javac]   dcore.init(params.get(MultiCoreParams.NAME),
[javac]^
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/admin/MultiCoreHandler.java:96:
cannot find symbol
[javac] symbol  : variable INSTANCE_DIR
[javac] location: interface
org.apache.solr.common.params.MultiCoreParams
[javac] params.get(MultiCoreParams.INSTANCE_DIR));
[javac]   ^
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/admin/MultiCoreHandler.java:99:
cannot find symbol
[javac] symbol  : variable CONFIG
[javac] location: interface
org.apache.solr.common.params.MultiCoreParams
[javac]   String opts = params.get(MultiCoreParams.CONFIG);
[javac]   ^
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/admin/MultiCoreHandler.java:103:
cannot find symbol
[javac] symbol  : variable SCHEMA
[javac] location: interface
org.apache.solr.common.params.MultiCoreParams
[javac]   opts = params.get(MultiCoreParams.SCHEMA);
[javac]^
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/admin/MultiCoreHandler.java:164:
unqualified enumeration constant name required
[javac]   case PERSIST: {
[javac]^
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/component/QueryComponent.java:356:
cannot find symbol
[javac] symbol  : class ShardFieldSortedHitQueue
[javac] location: class org.apache.solr.handler.component.QueryComponent
[javac]   ShardFieldSortedHitQueue queue = new
ShardFieldSortedHitQueue(sortFields, ss.getOffset() + ss.getCount());
[javac]   ^
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/component/QueryComponent.java:356:
cannot find symbol
[javac] symbol  : class ShardFieldSortedHitQueue
[javac] location: class org.apache.solr.handler.component.QueryComponent
[javac]   ShardFieldSortedHitQueue queue = new
ShardFieldSortedHitQueue(sortFields, ss.getOffset() + ss.getCount());
[javac]^
[javac]
/home/csweb/apache-solr-nightly/src/java/org/apache/solr/handler/component/QueryComponent.java:491:
cannot find symbol
[javac] symbol  : method
join(java.util.ArrayList,char)
[javac] location: class org.apache.solr.common.util.StrUtils
[javac]   sreq.params.add("ids", StrUtils.join(ids, ','));
[javac]  ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 9 errors

Am I doing something wrong?
-- 
View this message in context: 
http://www.nabble.com/Nightly-build-compile-error--tp16577739p16577739.html



Re: solr commit command questions

2008-04-04 Thread oleg_gnatovskiy

So, what is the point of the commit?

oleg_gnatovskiy wrote:
> 
> Hello. I was wondering what happens when an add command is done without a
> commit command. Is there any way to roll back?
> 

-- 
View this message in context: 
http://www.nabble.com/solr-commit-command-questions-tp16467824p16504441.html



solr commit command questions

2008-04-03 Thread oleg_gnatovskiy

Hello. I was wondering what happens when an add command is done without a
commit command. Is there any way to roll back?
-- 
View this message in context: 
http://www.nabble.com/solr-commit-command-questions-tp16467824p16467824.html



Replication of Segmented indexes

2008-03-26 Thread oleg_gnatovskiy

Hello, this is actually a repost of a question posed by Swarag. I don't think
he made the question quite clear, so let me give it a shot. It is known that
Solr has support for index replication, and it has support for index
segmentation. The question is, how would you use the replication tools with
a segmented index?
-- 
View this message in context: 
http://www.nabble.com/Replication-of-Segmented-indexes-tp16303343p16303343.html



Query Level Boosting

2008-03-11 Thread oleg_gnatovskiy

Hello. I was wondering if anyone knew a way to do query level boosting with
SolrJ. On the http client I could just do something like sku:123^2.3 which
would give the sku query a boost of 2.3.
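
For reference, the boost is part of the query string itself, so the same
text should work wherever a query string is accepted (in SolrJ, presumably
via SolrQuery.setQuery). A minimal sketch in Python of building such a query
and the matching HTTP request; the base URL is an assumption, and the helper
names are made up for illustration:

```python
from urllib.parse import urlencode

def boosted_clause(field, value, boost):
    """Return a Lucene-syntax clause with a boost, e.g. sku:123^2.3."""
    return f"{field}:{value}^{boost}"

def select_url(base_url, query, rows=10):
    """Build a /select URL; base_url is a hypothetical Solr endpoint."""
    return f"{base_url}/select?" + urlencode({"q": query, "rows": rows})

query = boosted_clause("sku", "123", 2.3)
url = select_url("http://localhost:8983/solr", query)

print(query)
print(url)
```

Note that ":" and "^" get percent-encoded in the URL; a client library
normally does this for you.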
-- 
View this message in context: 
http://www.nabble.com/Query-Level-Boosting-tp15995005p15995005.html



Re: Solr-J problem

2008-03-10 Thread oleg_gnatovskiy

Ah, so that is what that setRows method does. Would there be a problem with
setting it to Integer.MAX_VALUE since I don't know how many results I will
have in advance?
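
One alternative to a very large rows value is to page through the results
using the start and rows parameters (SolrJ's SolrQuery has setStart alongside
setRows). The sketch below shows only the paging loop, in Python; fetch_page
is a hypothetical stand-in for the real query call, and the fake index exists
only so the example runs on its own.

```python
def fetch_all(fetch_page, page_size=100):
    """Collect every document by calling fetch_page(start, rows) repeatedly.

    fetch_page is a hypothetical callable returning (num_found, docs),
    mirroring getNumFound() and the returned document list.
    """
    num_found, docs = fetch_page(0, page_size)
    results = list(docs)
    while len(results) < num_found:
        _, docs = fetch_page(len(results), page_size)
        if not docs:  # stop if the index shrank between requests
            break
        results.extend(docs)
    return results

# Fake backend with 25 documents, standing in for a real Solr query.
FAKE_INDEX = [f"doc{i}" for i in range(25)]

def fake_fetch(start, rows):
    return len(FAKE_INDEX), FAKE_INDEX[start:start + rows]

all_docs = fetch_all(fake_fetch, page_size=10)
print(len(all_docs))
```

Paging keeps each response small, at the cost of several round trips.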




Otis Gospodnetic wrote:
> 
> I don't have the sources in front of me, but isn't there a setRows(int)
> method that you can call before running the query?
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> ----- Original Message -----
> From: oleg_gnatovskiy <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Friday, March 7, 2008 9:07:18 PM
> Subject: Solr-J problem
> 
> 
> Hello. I just started using solrJ recently and ran into a problem. I
> execute
> the following line after creating a SolrQuery: SolrDocumentList
> solrResults
> = engine.query(solrQuery).getResults();. solrResults.size() is always 10,
> while solrResults.getNumFound() varies based on the query. My question is,
> how do I get access to the entire result set? Why do I only get a list of
> the first 10? Any help would be greatly appreciated.
> -- 
> View this message in context:
> http://www.nabble.com/Solr-J-problem-tp15910308p15910308.html
> 
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Solr-J-problem-tp15910308p15972409.html



Solr-J problem

2008-03-07 Thread oleg_gnatovskiy

Hello. I just started using solrJ recently and ran into a problem. I execute
the following line after creating a SolrQuery: SolrDocumentList solrResults
= engine.query(solrQuery).getResults();. solrResults.size() is always 10,
while solrResults.getNumFound() varies based on the query. My question is,
how do I get access to the entire result set? Why do I only get a list of
the first 10? Any help would be greatly appreciated.
-- 
View this message in context: 
http://www.nabble.com/Solr-J-problem-tp15910308p15910308.html



Re: Question regarding Solr ranking

2008-02-29 Thread oleg_gnatovskiy


Otis Gospodnetic wrote:
> 
> It's a little hard to read that message, but if I were you I'd go to the
> Solr admin page, analysis section, enter your query, and see what index
> and query time analyzers spit out.  I think that should at least give you
> some hints.
> 
> Otis 
> 
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
I am not really clear on what the analysis mode is supposed to give me. It
requires me to specify a field when I specify a query. What does that do?
Also, I don't see anything in the analyzer to explain the weighting of a
particular document.

Regardless, I have narrowed it down to my locRvwText field (defined as a
multivalued text field), which has a value that looks like this:
"Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... ". Solr is counting this as
20 hits, but I was under the impression that the
RemoveDuplicatesTokenFilterFactory should filter this result to have it
count as just 1 hit. Am I misunderstanding what
RemoveDuplicatesTokenFilterFactory does?
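
For what it's worth, as far as I understand the filter, it only drops a token
when a token with the same text already exists at the same position (a
position increment of 0), which typically happens after stemming or synonym
expansion. Repeated words at successive positions are kept. The Python sketch
below models that behaviour; it is not Solr code, and representing tokens as
(text, position_increment) pairs is a simplification.

```python
def remove_duplicates(tokens):
    """Drop tokens whose text was already seen at the same position."""
    out = []
    seen_at_position = set()
    for text, pos_inc in tokens:
        if pos_inc > 0:
            seen_at_position = set()  # moved on to a new position
        if text in seen_at_position:
            continue                  # same text, same position: drop it
        seen_at_position.add(text)
        out.append((text, pos_inc))
    return out

# 21 occurrences of "pizza", each at its own position: nothing is removed.
repeated = [("pizza", 1)] * 21
kept = remove_duplicates(repeated)

# Number of adjacent "pizza pizza" pairs, i.e. exact phrase matches.
phrase_freq = sum(1 for a, b in zip(kept, kept[1:]) if a[0] == b[0] == "pizza")

# Two identical tokens stacked at one position: the second is removed.
stacked = [("pizza", 1), ("pizza", 0), ("pizzeria", 1)]

print(len(kept), phrase_freq)
print(remove_duplicates(stacked))
```

Under this model, 21 consecutive "pizza" tokens produce 20 adjacent "pizza
pizza" pairs, which lines up with the phraseFreq=20.0 in the explain output.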
-- 
View this message in context: 
http://www.nabble.com/Question-regarding-Solr-ranking-tp15719752p15768743.html



Re: Companies Using Solr

2008-02-27 Thread oleg_gnatovskiy



Clay Webster wrote:
> 
> Hey Folks,
> 
> Reminder: http://wiki.apache.org/solr/PublicServers lists the sites using
> Solr.  The listing is a bit thin.  I know many people don't know about the
> list or don't have the time to add themselves to the list.  I'd like to be
> able to promote open sourcing more systems (like Solr) and this
> information
> would help show it is helping a large community.
> 
> Feel free to reply directly to me and I can add you.
> 
> Thanks.
> 
> --cw
> 
> Clay Webster
> Associate VP, Platform Infrastructure
> CNET, Inc. (Nasdaq:CNET)
> 
> 

How would you add to that list anyway? It's immutable.
-- 
View this message in context: 
http://www.nabble.com/Companies-Using-Solr-tp15617981p15721759.html



Re: Question regarding Solr ranking

2008-02-27 Thread oleg_gnatovskiy

Sorry about the previous message, I had some formatting issues. Below is the
actual message!

oleg_gnatovskiy wrote:
> 
> Hello everyone.
> 
> I've run into a weird problem with Solr's ranking engine. In a nutshell,
> the problem involves certain results getting EXTREMELY high rank scores.
> Here is an example:
> 
> locRvwText:"Pizza Pizza"^10 OR locName:"Pizza Pizza"^30
> 
> The way I understand it is that the locName part of the query should be
> boosted 3x more than the locRvwText.
> However, when running this query the first result is:
> 
> 10.8226
> Johnnie's New York Pizzeria
> 
> 
> Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
> Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
> Pizza... Pizza... Pizza... Pizza... Pizza...
> 
> 
> 
> 
>   
> 
> 10.8226 = (MATCH) product of:
>   21.6452 = (MATCH) sum of:
> 21.6452 = weight(locRvwText:"pizza pizza"^10.0 in 3792465), product
> of:
>   0.3354544 = queryWeight(locRvwText:"pizza pizza"^10.0), product of:
> 10.0 = boost
> 14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
> 0.0023249863 = queryNorm
>   64.52502 = fieldWeight(locRvwText:"pizza pizza" in 3792465), product
> of:
> 4.472136 = tf(phraseFreq=20.0)
> 14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
> 1.0 = fieldNorm(field=locRvwText, doc=3792465)
>   0.5 = coord(1/2)
> 
> 
> 
> 
> How come the phrase frequency for rvwText comes back as 20? The field
> rvwText is defined in the following way:
> 
>  required="false" multiValued="true"  omitNorms="true"/>
> 
> And my text fields are defined in the following way:
> 
> 
>   
> 
>   
>  ignoreCase="true" expand="true"/>
>  words="stopwords.txt"/>
>  generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0"/>
> 
>  protected="protwords.txt"/>
> 
>   
>   
> 
>  ignoreCase="true" expand="true"/>
>  words="stopwords.txt"/>
>  generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0"/>
> 
>  protected="protwords.txt"/>
> 
>   
> 
> 
> Forgive me if I am wrong, but shouldn't the
> RemoveDuplicatesTokenFilterFactory have the string "Pizza... Pizza...
> Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
> Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
> Pizza... Pizza... Pizza..." count as simply one Pizza?
> I'd appreciate any help I can get! 
> 
> Thanks!
> 

-- 
View this message in context: 
http://www.nabble.com/Question-regarding-Solr-ranking-tp15719752p15719834.html



Question regarding Solr ranking

2008-02-27 Thread oleg_gnatovskiy

Hello everyone.


I've run into a weird problem with Solr's ranking engine. In a nutshell, the
problem involves certain results getting EXTREMELY high rank scores. Here is
an example:


locRvwText:"Pizza Pizza"^10 OR locName:"Pizza Pizza"^30


The way I understand it is that the locName part of the query should be
boosted 3x more than the locRvwText.

However, when running this query the first result is:



10.8226
Johnnie's New York Pizzeria


Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza...





10.8226 = (MATCH) product of:
  21.6452 = (MATCH) sum of:
21.6452 = weight(locRvwText:"pizza pizza"^10.0 in 3792465), product of:
  0.3354544 = queryWeight(locRvwText:"pizza pizza"^10.0), product of:
10.0 = boost
14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
0.0023249863 = queryNorm
  64.52502 = fieldWeight(locRvwText:"pizza pizza" in 3792465), product
of:
4.472136 = tf(phraseFreq=20.0)
14.428232 = idf(locRvwText: pizza=8156 pizza=8156)
1.0 = fieldNorm(field=locRvwText, doc=3792465)
  0.5 = coord(1/2)
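
The numbers in the explain output above can be reproduced by hand, which
shows where the high score comes from. The Python sketch below assumes
Lucene's default similarity, where tf is the square root of the frequency;
every input value is copied from the explain output, and only the variable
names are mine.

```python
import math

# Values copied from the explain output.
boost = 10.0
idf = 14.428232
query_norm = 0.0023249863
phrase_freq = 20.0
field_norm = 1.0   # norms are omitted on this field
coord = 0.5        # 1 of the 2 query clauses matched

tf = math.sqrt(phrase_freq)                # tf(phraseFreq=20.0)
query_weight = boost * idf * query_norm    # queryWeight
field_weight = tf * idf * field_norm       # fieldWeight
score = coord * query_weight * field_weight

print(round(tf, 6))
print(round(query_weight, 7))
print(round(field_weight, 5))
print(round(score, 4))
```

The phrase frequency of 20 enters through tf, and since the field norm is
fixed at 1.0 there is no length normalization to offset it.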




How come the phrase frequency for rvwText comes back as 20? The field
rvwText is defined in the following way:




And my text fields are defined in the following way:




  








  
  







  



Forgive me if I am wrong, but shouldn't the
RemoveDuplicatesTokenFilterFactory have the string "Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza... Pizza...
Pizza... Pizza... Pizza..." count as simply one Pizza?

I'd appreciate any help I can get! 

Thanks!






-- 
View this message in context: 
http://www.nabble.com/Question-regarding-Solr-ranking-tp15719752p15719752.html


Re: Integrated Spellchecking

2008-02-15 Thread oleg_gnatovskiy



dsteiger wrote:
> 
> I've got a couple search components for automatic spell correction that
> I've been working on.
> 
> I've converted most of the SpellCheckerRequestHandler to a search
> component (hopefully will throw a 
> patch out soon for this).  Then another search component that will do auto
> correction for a query if 
> the search returns zero results.
> 
> We're hoping to see some performance improvements out of handling this in
> Solr instead of our Rails 
> service.
> 
> doug
> 
> 
> Ryan McKinley wrote:
>> Yes -- this is what search components are for!
>> 
>> Depending on where you put it in the chain, it could only return spell 
>> checked results if there are too few results (or the top score is below 
>> some threshold)
>> 
>> ryan
>> 
>> 
>> Grant Ingersoll wrote:
>>> Is it feasible to submit a query to any of the various handlers and 
>>> have it bring back results and spelling suggestions all in one 
>>> response?  Is this something the query components piece would handle, 
>>> assuming one exists for the spell checker?
>>>
>>> Thanks,
>>> Grant
>>>
> 
> 


So have you succeeded in implementing this patch? I'd definitely like to use
this functionality as a search suggestion.
-- 
View this message in context: 
http://www.nabble.com/Integrated-Spellchecking-tp14930232p15504125.html