Otis,

That's an interesting suggestion.  I'm curious about the thought
process behind it though - we currently don't have memory problems,
and in fact our max memory setting is below where it could be.

Does your suggestion imply that something could be gained by throwing
more memory at the problem?  If so, could you explain a little bit
about why?

Regards,
Steve

On Sat, Mar 28, 2009 at 6:31 PM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:
>
> OK, how about this trick then.  Do you really need the full string for
> sorting?  Could you get by (cheat) sorting only on the first N characters?
> If so, you could create a separate field for that (copyField will come in
> handy) and that should consume a little less memory.
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
>> From: Steve Conover <scono...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Saturday, March 28, 2009 1:13:04 AM
>> Subject: Re: optimization advice?
>>
>> String ;-) - we only allow sorting on string fields.
>>
>> On Fri, Mar 27, 2009 at 9:21 PM, Otis Gospodnetic
>> wrote:
>> >
>> > Steve,
>> >
>> > A field named "name" sounds like a free text field.  What is its type,
>> > string or text?  Fields you sort by should not be tokenized and should be
>> > indexed.  I have a hunch your name field is tokenized.
>> >
>> >
>> > Otis
>> > --
>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >
>> >
>> >
>> > ----- Original Message ----
>> >> From: Steve Conover
>> >> To: solr-user@lucene.apache.org
>> >> Sent: Friday, March 27, 2009 11:59:52 PM
>> >> Subject: Re: optimization advice?
>> >>
>> >> We sort by default on "name", which varies quite a bit (we're never
>> >> going to make sorting by field go away).
>> >>
>> >> The thing is, Solr has been pretty amazing across 1 million records.
>> >> Now that we've doubled the size of the dataset things are definitely
>> >> slower in a nonlinear way...I'm wondering what factors are involved
>> >> here.
>> >>
>> >> -Steve
>> >>
>> >> On Fri, Mar 27, 2009 at 6:58 PM, Otis Gospodnetic
>> >> wrote:
>> >> >
>> >> > OK, we are a step closer.  Sorting makes things slower.  What field(s)
>> >> > do you sort on, what are their types, and if there is a date in there,
>> >> > are the dates very granular, and if they are, do you really need them
>> >> > to be that precise?
>> >> >
>> >> >
>> >> > Otis
>> >> > --
>> >> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >> >
>> >> >
>> >> >
>> >> > ----- Original Message ----
>> >> >> From: Steve Conover
>> >> >> To: solr-user@lucene.apache.org
>> >> >> Sent: Friday, March 27, 2009 1:51:14 PM
>> >> >> Subject: Re: optimization advice?
>> >> >>
>> >> >> > Steve,
>> >> >> >
>> >> >> > Maybe you can tell us about:
>> >> >>
>> >> >> sure
>> >> >>
>> >> >> > - your hardware
>> >> >>
>> >> >> 2.5GB RAM, pretty modern virtual servers
>> >> >>
>> >> >> > - query rate
>> >> >>
>> >> >> Let's say a few queries per second max... < 4
>> >> >>
>> >> >> And in general the challenge is to get latency on any given query down
>> >> >> to something very low - we don't have to worry about a huge amount of
>> >> >> load at the moment.
>> >> >>
>> >> >> > - document cache and query cache settings
>> >> >>
>> >> >>
>> >> >>         class="solr.LRUCache"
>> >> >>         size="512"
>> >> >>         initialSize="512"
>> >> >>         autowarmCount="256"/>
>> >> >>
>> >> >>
>> >> >>         class="solr.LRUCache"
>> >> >>         size="512"
>> >> >>         initialSize="512"
>> >> >>         autowarmCount="0"/>
>> >> >>
>> >> >> > - your current response times
>> >> >>
>> >> >> This depends on the query.  For queries that involve a total record
>> >> >> count of < 1 million, we often see < 10ms response times, up to
>> >> >> 400-500ms in the worst case.  When we do a page one, sorted query on our
>> >> >> full record set of 2 million+ records, response times can get up into
>> >> >> 2+ seconds.
>> >> >>
>> >> >> > - any pain points, any slow query patterns
>> >> >>
>> >> >> Something that can't be emphasized enough is that we can't predict
>> >> >> what records people will want.  Almost every query is aimed at a
>> >> >> different set of records.
>> >> >>
>> >> >> -Steve
>> >> >
>> >> >
>> >
>> >
>
>
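
For reference, the copyField trick Otis describes in the quoted message
above might look roughly like the schema.xml fragment below.  This is only
a sketch under assumptions: the field name "name_sort" and the 20-character
limit are made up for illustration, the "string" type is assumed to be the
non-tokenized solr.StrField from the example schema, and the maxChars
attribute on copyField may not be available in older Solr releases, so
check the documentation for the version in use.

```xml
<!-- Hypothetical sort-only field: indexed so it can be sorted on,
     not stored because it is never returned in results. -->
<field name="name_sort" type="string" indexed="true" stored="false"/>

<!-- Copy at most the first 20 characters of "name" into "name_sort".
     The limit of 20 is an arbitrary example value. -->
<copyField source="name" dest="name_sort" maxChars="20"/>
```

Queries would then use sort=name_sort asc instead of sorting on name.
Names that share the same first 20 characters would no longer be ordered
by their remaining characters, which is the "cheat" being accepted.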
