I sure can't reproduce this on an 11M document Wikipedia dump.
I added the "text" from the Wiki dump 49 extra times (i.e. there
are 50 copies of the text field in each document), and pulled
back 12000 documents from my test machine (a MacBook Pro
from 3 years ago). I also debugged the code a bit and it *looks*
like it's doing the right thing; this is on the Feb 2 build. The point
of adding the text lots of times was just to make the seeking
from disk take longer assuming that was the problem.

I do admit, though, that the response times with lazy enabled/disabled
aren't all that different (4.7 and 6.8 respectively), even with rebooting
between setting lazy field loading on and off just to ensure I wasn't
hitting the OS cache.

One thing to note: if you do NOT specify any field
list (fl parameter), the whole document is loaded regardless
of whether lazy is enabled or not. Here's the fragment from
SolrIndexSearcher (around line 516 in the Feb 2 code):

    if(!enableLazyFieldLoading || fields == null) {
      d = getIndexReader().document(i);
    } else {
      final SetNonLazyFieldSelector visitor =
          new SetNonLazyFieldSelector(fields, getIndexReader(), i);
      getIndexReader().document(i, visitor);
      d = visitor.doc;
    }

The name "SetNonLazyFieldSelector" is kind of interesting here,
but it does the right thing as far as I can tell.
The code goes there just as expected with lazy field loading on/off.
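
To spell out what that condition means for the fl parameter, here's a
minimal sketch of just the decision the fragment above encodes. It does
not touch Solr or Lucene at all; the class name LazyLoadDecision and the
method usesLazyPath are made up for illustration, and "fields" simply
stands in for the parsed fl list (null when no fl was given).

```java
import java.util.Set;

public class LazyLoadDecision {

    // Hypothetical helper, not part of Solr: returns true only when the
    // quoted fragment would take the "else" branch (the field-selector path).
    // That requires lazy loading to be enabled AND a non-null field list.
    static boolean usesLazyPath(boolean enableLazyFieldLoading, Set<String> fields) {
        return enableLazyFieldLoading && fields != null;
    }

    public static void main(String[] args) {
        Set<String> fl = Set.of("id", "title");

        System.out.println(usesLazyPath(true, fl));    // prints true
        System.out.println(usesLazyPath(true, null));  // prints false: no fl, whole document loaded
        System.out.println(usesLazyPath(false, fl));   // prints false: lazy loading switched off
    }
}
```

So if the slow queries are being sent without fl, flipping the lazy
setting shouldn't change which branch runs.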

So I really haven't a clue. Despite your report that changing
lazy field loading had such a drastic effect, I'm starting to think
it's a red herring.....

I'd like to see the results of &debugQuery=on for the
slow and fast versions. Particularly down near the bottom
there's a section that looks something like this:
<double name="time">4082.0</double>
<lst name="prepare">
<double name="time">1.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">1.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>
(Pardon the indentation.) That shows the time spent in each component.
Sometimes this can be surprising.
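
If eyeballing that section for both runs gets tedious, something like
the following could pull the per-component times out of it. This is
only a sketch using the JDK's built-in XML parser: the class name
DebugTimingReader and the method componentTimes are hypothetical, and it
assumes you have already cut out one well-formed <lst> phase element
(such as the "prepare" one above) as a string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DebugTimingReader {

    // Hypothetical helper: given one phase element (e.g. <lst name="prepare">),
    // returns component name -> reported time, in document order.
    static Map<String, Double> componentTimes(String phaseXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(phaseXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed XML fragment", e);
        }

        Map<String, Double> times = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Only nested <lst> elements are components; the phase's own
            // <double name="time"> total is skipped here.
            if (child.getNodeType() != Node.ELEMENT_NODE || !"lst".equals(child.getNodeName())) {
                continue;
            }
            Element component = (Element) child;
            NodeList inner = component.getChildNodes();
            for (int j = 0; j < inner.getLength(); j++) {
                Node n = inner.item(j);
                if (n.getNodeType() == Node.ELEMENT_NODE
                        && "double".equals(n.getNodeName())
                        && "time".equals(((Element) n).getAttribute("name"))) {
                    times.put(component.getAttribute("name"),
                              Double.parseDouble(n.getTextContent().trim()));
                    break;
                }
            }
        }
        return times;
    }

    public static void main(String[] args) {
        // Trimmed copy of the "prepare" section shown above.
        String prepare =
              "<lst name=\"prepare\">"
            + "<double name=\"time\">1.0</double>"
            + "<lst name=\"org.apache.solr.handler.component.QueryComponent\">"
            + "<double name=\"time\">1.0</double></lst>"
            + "<lst name=\"org.apache.solr.handler.component.FacetComponent\">"
            + "<double name=\"time\">0.0</double></lst>"
            + "</lst>";

        componentTimes(prepare).forEach((name, time) -> System.out.println(name + " -> " + time));
    }
}
```

Running it over the same section from the slow and the fast response
and comparing the two maps should show which component the extra time
lands in.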

Yonik:
Would you hypothesize that lazy field loading could be that much
slower if a large fraction of fields were selected? Let's claim that
there were 100 fields and you specified 90 of them be loaded
and *had* enabled lazy field loading. I've got to assume that
there's extra work to load each field individually so might it
actually be more expensive to enable lazy field loading in that case?
Pure speculation on my part....

Best
Erick

On Fri, Feb 24, 2012 at 2:47 PM, naptowndev <naptowndev...@gmail.com> wrote:
> Obviously it'd be great if someone else was able to confirm this in their
> setup as well.
>
> But with different environments, payload sizes, etc., I'm not sure how
> easily it can be tested in other environments.
>
> On Fri, Feb 24, 2012 at 2:46 PM, Brian G <naptowndev...@gmail.com> wrote:
>
>> Erick -
>>
>> That is exactly what we are seeing.
>>
>> this is in our solrconfig.xml:
>> <enableLazyFieldLoading>false</enableLazyFieldLoading>
>>
>> and our response times have decreased drastically.  I'm on my 40th-ish
>> test today and the response times are still 10+ seconds faster on the
>> higher payload than they were when it was set to true.
>>
>> Smaller payloads are also about 2.5 seconds faster.
>>
>>
>> On Fri, Feb 24, 2012 at 1:38 PM, Erick Erickson [via Lucene] <
>> ml-node+s472066n377336...@n3.nabble.com> wrote:
>>
>>> Let me echo this back to see if I have it right, because it's *extremely*
>>> weird if I'm reading it correctly.
>>>
>>> In your solrconfig.xml file, you changed this line:
>>> <enableLazyFieldLoading>true</enableLazyFieldLoading>
>>> to this:
>>> <enableLazyFieldLoading>false</enableLazyFieldLoading>
>>>
>>> and your response time DECREASED? If you can confirm that
>>> I'm reading it right, I'll open up a JIRA.
>>>
>>> Best
>>> Erick
>>>
>>> On Fri, Feb 24, 2012 at 1:14 PM, naptowndev <[hidden email]> wrote:
>>>
>>> > I'm not sure what would constitute a low vs. high hit rate (and
>>> > eviction rate), so we've kept the setting at LRUCache instead of
>>> > FastCache for now.
>>> >
>>> > But I will say we did turn the LazyFieldLoading option off and wow - a
>>> > huge increase in performance on the newer nightly build we are using
>>> > (the one from Feb 2, 2012).
>>> >
>>> > The payload of 13.7 MB that was taking from anywhere around 15-17
>>> > seconds (with fastvectorhighlighter on) and 33+ seconds with FVH off
>>> > is now taking just about 3.2 seconds with FVH on.
>>> >
>>> > When we implement the wildcards for the fieldlist, thereby reducing the
>>> > payload down to 1.9MB, our average return time is around 875ms, down
>>> > from anywhere around 6-8 seconds before.
>>> >
>>> > Granted, I've only run about 20 tests (manually) at this point, so I'm
>>> > going to keep hitting at the server for a while with different queries
>>> > to see if anything gives, but at least at this point, it does appear
>>> > setting the lazyfieldloading to false has improved performance.
>>> >
>>> > It'd be ideal to figure out why that's the case, but that's a little
>>> > beyond my skill set at the moment.
>>> >
>>> > I'll let you guys know how results look as I proceed throughout the
>>> > day. (I've yet to run these tests against the 2010 build we were
>>> > comparing against - so I need to do that too)
>>> >
>>> > Please also let me know if you have any further suggestions.
>>> >
>>> > Thanks!
>>> >
>>> >
>>> > --
>>> > View this message in context:
>>> > http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3773310.html
>>> > Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>
>>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3773540.html
> Sent from the Solr - User mailing list archive at Nabble.com.
