Thanks again.

We're trying to get our ops team to install jconsole for us so we can take
a look at the GC stuff.

Your comment about the documentcache is intriguing for sure.

We just ran a couple of tests against the older 4.x build we have (the one
that's been returning quicker) and the newer build. In each test we bring
back the top 20 records with the largest 'payload', then reverse the sort to
bring back the top 20 with the smallest payload.  The highest payload was
16.3MB.  The previous version of Solr had a QTime of 507 ms (in its initial
run) and the overall time to client was 2.6 seconds.  The newer build, with
the same payload, had a QTime of 593 ms, but the overall time to receive was
28.2 seconds.

When we did the reverse (small payload, 37K), both instances had a QTime
of 1 ms, but the older version returned in 126 ms and the new version
returned in 192 ms.

I reran the tests (realizing some things may have been warmed or cached).
The large payload test yielded nearly the same results, whereas the
small payload test showed the older version returning in 11 ms and the newer
version returning in 6 ms.
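
To put numbers on the gap, here's a quick sketch that subtracts QTime from
the time seen by the client for the first runs above. QTime only covers the
search itself, so whatever is left over is roughly the time spent loading
stored fields, writing the response, and moving it over the wire. The
function and variable names are made up for illustration; the figures are
the ones from our tests.

```python
# Illustrative only: names are hypothetical, figures are from the first test runs.

def overhead_ms(qtime_ms, total_ms):
    """Time not accounted for by QTime (field loading, response writing, transfer)."""
    return total_ms - qtime_ms

runs = {
    # label: (QTime in ms, total time to client in ms)
    "old build, 16.3MB payload": (507, 2600),
    "new build, 16.3MB payload": (593, 28200),
    "old build, 37K payload": (1, 126),
    "new build, 37K payload": (1, 192),
}

for label, (qtime, total) in runs.items():
    print(f"{label}: {overhead_ms(qtime, total)} ms outside QTime")
```

For the large payload that works out to about 2.1 seconds outside QTime on
the older build versus about 27.6 seconds on the newer one, so the search
itself doesn't look like the slow part.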

I was told that Solr was re-indexed, as you mentioned, since things changed.

So I'm going to see what I can learn about the documentCache because
perhaps there is something different between versions.

Our current config for that is as follows:
<documentCache class="solr.LRUCache" size="15000"
               initialSize="15000" autowarmCount="0" />

It's the same for both instances.

And lazy field loading (enableLazyFieldLoading) is enabled for both instances.
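
As a rough sanity check against the rule of thumb you mentioned, (max docs
returned) * (possible number of simultaneous requests), here's a small
sketch using our cache size of 15000 and the 1200-record result set from the
earlier tests. The helper names are made up, and the figure of 20
simultaneous requests is purely an assumption for illustration, not
something we've measured.

```python
# Illustrative only: helper names are hypothetical; the sizing rule is the
# rule of thumb quoted in this thread, not an official formula.

def required_document_cache_size(max_docs_returned, simultaneous_requests):
    """Cache size suggested by (max docs returned) * (simultaneous requests)."""
    return max_docs_returned * simultaneous_requests

def requests_covered(cache_size, max_docs_returned):
    """How many simultaneous requests of that size a given cache size covers."""
    return cache_size // max_docs_returned

# Our configured size (15000) against the 1200-record result set.
print(requests_covered(15000, 1200))            # 12

# Assumed load of 20 simultaneous 1200-record requests (made-up number).
print(required_document_cache_size(1200, 20))   # 24000
```

So by that rule our current setting covers about 12 simultaneous requests of
1200 records each.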

Thanks,
Brian

On Thu, Feb 23, 2012 at 9:51 PM, Erick Erickson [via Lucene] <
ml-node+s472066n377156...@n3.nabble.com> wrote:

> It's still worth looking at the GC characteristics, there's a possibility
> that the newer build uses memory such that you're tripping over some
> threshold, but that's grasping at straws. I'd at least hook up jConsole
> for a sanity check...
>
> But if your QTimes are fast, the next thing that comes to mind is that
> you're spending (for some reason I can't name) more time gathering
> your fields off disk. Which, with 1,200 records is a possibility. Again,
> the "why" is a mystery. But you can do some triage by returning
> just a few fields to see if that's the issue.
>
> Wild stab: Did you re-index the data for your new version of Solr?
> The index format changed not too long ago, so it's at least possible.
> But why that would slow things down so much is another mystery
> but it's worth testing.
>
> Another wild bit would be your documentCache. Is it sized large enough?
> As I remember, the figure is (max docs returned) * (possible number of
> simultaneous requests), see:
> http://wiki.apache.org/solr/SolrCaching#documentCache
>
> Is there any chance that <enableLazyFieldLoading> is false
> in solrconfig.xml? That could account for it.
>
> But I'm afraid it's a matter of trying to remove stuff from your
> process until something changes because this is pretty
> surprising...
>
> Best
> Erick
>
> On Thu, Feb 23, 2012 at 4:44 PM, naptowndev <[hidden email]> wrote:
>
> > Erick -
> >
> > Thanks.  We've actually worked with Sematext to optimize the GC settings
> > and saw initial (and continued) performance boosts as a result...
> >
> > The situation we're seeing now has both versions of Solr running on the
> > same box under the same JVM, but we are undeploying one instance at a
> > time so as to prevent any outlying performance hits in the tests...
> >
> > So, that being said, both instances of Solr, on the same box, are running
> > under the optimized settings.  I'd assume if GC was impacting the
> > results of the newer version of Solr, we'd see a similar decrease in
> > performance on the older version.
> >
> > Aside from the QTime and other timings (highlight, etc.), which are all
> > faster in the new version, the overall response time/delivery of the
> > results is significantly slower under the new version.
> >
> > I've unfortunately exhausted my knowledge of Solr and what may or may
> > not have changed between the nightly builds.
> >
> > I do appreciate your insight and hope you'll continue to throw out some
> > ideas...and maybe someone else out there has seen these inconsistencies
> > as well.
> >
> > The last set of tests I ran consistently showed the older build of Solr
> > bringing back a result set of 13.1MB with 1200 records in 2.3 seconds,
> > whereas the newer build was bringing back the same result set in about
> > 17.4 seconds.  The catch is that the QTime and highlighting component
> > time in the newer version are faster than in the older version.
> >
> > Again, if you have any more ideas, let me know.
> >
> > Thanks!
> > Brian
> >
> > On Thu, Feb 23, 2012 at 11:51 AM, Erick Erickson [via Lucene]
> > <[hidden email]> wrote:
> >
> >> Ah, no, my mistake. The wildcards for the fl list won't matter re:
> >> maxBooleanClauses; I didn't read carefully enough.
> >>
> >> I assume that just returning a field or two doesn't slow down....
> >>
> >> But one possible culprit, especially since you say this kicks in after
> >> a while, is garbage collection. Here's an excellent intro:
> >>
> >>
> >>
> >> http://www.lucidimagination.com/blog/2011/03/27/garbage-collection-bootcamp-1-0/
> >>
> >> Especially look at the "getting a view into garbage collection"
> >> section and try specifying those options. The result should be that
> >> your solr log gets stats dumped every time GC kicks in. If this is a
> >> problem, look at the times in the logfile after your system slows
> >> down. You'll see a bunch of GC dumps that collect very little unused
> >> memory. You can also connect to the process using jConsole (should be
> >> in the Java distro) and watch the "memory" tab, especially after your
> >> server has slowed down. You can also connect jConsole remotely...
> >>
> >> This is just an experiment, but any time I see "and it slows down
> >> after ### minutes", GC is the first thing I think of.
> >>
> >>
> >> Best
> >> Erick
> >>
> >>
> >> On Thu, Feb 23, 2012 at 10:16 AM, naptowndev <[hidden email]> wrote:
> >>
> >> > Erick -
> >> >
> >> > Agreed, it is puzzling.
> >> >
> >> > What I've found is that it doesn't matter if I pass in wildcards for
> >> > the field list or not...but that the overall response time from the
> >> > newer builds of Solr that we've tested (e.g. 4.0.0.2012.02.16) is
> >> > slower than the older (4.0.0.2010.12.10.08.54.56) build.
> >> >
> >> > If I run the exact same query against those two cores, bringing back a
> >> > payload of just over 13MB (xml), the older build brings it back in
> >> > about 1.6 seconds and the newer build brings it back in about 8.4
> >> > seconds.
> >> >
> >> > Implementing the field list wildcard allows us to reduce the payload
> >> > in the newer build (not an option in the older build).  The payload is
> >> > reduced to 1.8MB but takes over 3.5 seconds to come back, as compared
> >> > to the full payload (13MB) in the older build at about 1.6 seconds.
> >> >
> >> > With everything else remaining the same
> >> > (machine/processors/memory/network and the code base calling Solr) it
> >> > seems to point to something in the newer builds that's causing the
> >> > slowdown, but I'm not intimate enough with Solr to be able to figure
> >> > that out.
> >> >
> >> > We are using &debugQuery=on in our tests to see timings and they
> >> > aren't showing any anomalies, so that makes it even more confusing.
> >> >
> >> > From a wildcard perspective, it's on the fl parameter... here's a
> >> > 'snippet' of part of our fl parameter for the query....
> >> >
> >> > &fl=id, CategoryGroupTypeID, MedicalSpecialtyDescription,
> >> > TermsMisspelled, DictionarySource, timestamp, Category_*_MemberReports,
> >> > Category_*_MemberReportRange, Category_*_NonMemberReports,
> >> > Category_*_Grade, Category_*_GradeDisplay, Category_*_GradeTier,
> >> > Category_*_ReportLocations, Category_*_ReportLocationCoordinates,
> >> > Category_*_coordinate, score
> >> >
> >> > Please note that that fl param is greatly reduced from our full
> >> > query; we have over 100 static fields and a slew of dynamic fields, but
> >> > that should give you an idea of how we are using wildcards.
> >> >
> >> > I'm not sure about the maxBooleanClauses...not being all that
> >> > familiar with Solr, does that apply to wildcards used in the fl list?
> >> >
> >> > Thanks!
> >> >
> >> > --
> >> > View this message in context:
> >> > http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3769995.html
> >> > Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >>
> >> ------------------------------
> >>  If you reply to this email, your message will be added to the
> discussion
> >> below:
> >>
> >>
> http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3770307.html
> >>
> >
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3770939.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3772837.html
Sent from the Solr - User mailing list archive at Nabble.com.
