[
https://issues.apache.org/jira/browse/SOLR-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178608#comment-13178608
]
Erick Erickson commented on SOLR-1931:
--------------------------------------
bq: why is it still 39 seconds?
Histograms and collecting the top N terms by frequency. Still gotta spin
through all the terms to collect either statistic. Take that bit out and the
response is less than 0.5 seconds.
39 seconds isn't bad at all for an index this size, and one can still specify
particular fields of interest if the index is more complex than this one. I can
probably be argued out of the importance of the histograms and top-N terms,
although it'll take a little doing. From my perspective this is really for
troubleshooting at a high level, and that information is valuable.
Besides, I *told* you I had to look it over after a while. I just saw something
horribly trivial that cuts it down to 15 seconds. There's a loop where, after
the histo stuff is collected, we test whether the current term frequency is
above the threshold of the already-collected items, and changing it from
    if (freq < tiq.minfreq) continue;
to, essentially,
    if (freq <= tiq.minfreq) continue;
means that the pathological case of inserting every last <uniqueKey> value into
the tracking priority queue doesn't happen. Siiigggh.
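
For anyone who wants to see why that one character matters, here is a minimal
sketch, not the code from the patch. The class TopTermsSketch, the method
countInsertions, and the local minFreq (standing in for tiq.minfreq) are all
made up for illustration, and a plain java.util.PriorityQueue stands in for
the real tracking queue. It only counts how many terms get pushed through a
bounded min-heap under each comparison.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class TopTermsSketch {

    /**
     * Counts how many frequencies are pushed into a bounded min-heap of size topN.
     * fixedTest == false uses the original test (freq < minFreq);
     * fixedTest == true uses the corrected test (freq <= minFreq).
     */
    static int countInsertions(int[] freqs, int topN, boolean fixedTest) {
        if (topN < 1) {
            throw new IllegalArgumentException("topN must be at least 1");
        }
        PriorityQueue<Integer> queue = new PriorityQueue<>(); // min-heap on frequency
        int minFreq = 0;      // hypothetical stand-in for tiq.minfreq
        int insertions = 0;
        for (int freq : freqs) {
            boolean skip = fixedTest ? freq <= minFreq : freq < minFreq;
            if (skip) {
                continue;     // cannot displace anything already collected
            }
            queue.add(freq);
            insertions++;
            if (queue.size() > topN) {
                queue.poll();            // evict the smallest collected frequency
                minFreq = queue.peek();  // the new smallest becomes the threshold
            }
        }
        return insertions;
    }

    public static void main(String[] args) {
        // A uniqueKey-like field: every term occurs in exactly one document.
        int[] freqs = new int[1000];
        Arrays.fill(freqs, 1);
        System.out.println("freq <  minFreq: " + countInsertions(freqs, 10, false));
        System.out.println("freq <= minFreq: " + countInsertions(freqs, 10, true));
    }
}
```

With 1,000 terms that all have frequency 1 and topN = 10, the sketch does
1,000 insertions with the original test and 11 with the corrected one. A term
that merely ties the current minimum can't improve the top N, so skipping it
loses nothing beyond which of the tied terms gets reported.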
Oh, and the patch I'll attach in a couple of minutes actually compiles. I half
cleaned up the stupid recordDocCount parameter by removing the definition, but
not getting it from the parameters. Fella has to go to sleep more often.
Also, this index is a little peculiar in that many of the fields have only a
very few values so YMMV.
> Schema Browser does not scale with large indexes
> ------------------------------------------------
>
> Key: SOLR-1931
> URL: https://issues.apache.org/jira/browse/SOLR-1931
> Project: Solr
> Issue Type: Improvement
> Components: web gui
> Affects Versions: 3.6, 4.0
> Reporter: Lance Norskog
> Assignee: Erick Erickson
> Priority: Minor
> Attachments: SOLR-1931-3x.patch, SOLR-1931-3x.patch,
> SOLR-1931-trunk.patch, SOLR-1931-trunk.patch
>
>
> The Schema Browser JSP by default causes the Luke handler to "scan the
> world". In large indexes this makes the UI useless.
> On an index with 64M documents and 8 GB of disk space, the Schema Browser took
> 6 minutes to open and hogged all disk I/O, making Solr useless.