[ https://issues.apache.org/jira/browse/SOLR-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178608#comment-13178608 ]
Erick Erickson commented on SOLR-1931:
--------------------------------------

bq: why is it still 39 seconds?

Histograms and collecting the top N terms by frequency. Still gotta spin through all the terms to collect either statistic. Take that bit out and the response is less than 0.5 seconds. 39 seconds isn't bad at all for an index this size, and one can still specify particular fields of interest if the index is more complex than this one.

I can probably be argued out of their importance although it'll take a little doing. This is really for, from my perspective, troubleshooting at a high level and that information is valuable.

Besides, I *told* you I had to look it over after a while. I just saw something horribly trivial that cuts it down to 15 seconds. There's a loop where, after the histo stuff is collected, we test to see if the current term frequency is above the threshold of the already-collected items.... and changing it from

    if (freq < tiq.minfreq) continue;

to, essentially,

    if (freq <= tiq.minfreq) continue;

means that the pathological case of inserting every last <uniqueKey> in the tracking priority queue doesn't happen. Siiigggh.

Oh, and the patch I'll attach in a couple of minutes actually compiles. I half cleaned up the stupid recordDocCount parameter by removing the definition, but not getting it from the parameters. Fella has to go to sleep more often.

Also, this index is a little peculiar in that many of the fields have only a very few values so YMMV.
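To illustrate why the one-character change matters, here is a minimal sketch, not the actual Luke handler code. It assumes a bounded min-heap of term frequencies; the names TopTermsSketch, TopFreqQueue, and countInsertions are hypothetical, and only the comparison against the queue's minimum frequency is taken from the comment above. For a field such as <uniqueKey>, where every term has a frequency of 1, the strict test lets every term through once the queue is full, while the non-strict test skips them.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical stand-in for the top-N tracking described in the comment;
// it is not the real Solr/Lucene class.
class TopTermsSketch {

    /** Bounded min-heap that keeps the topN largest frequencies seen so far. */
    static final class TopFreqQueue {
        private final PriorityQueue<Integer> heap = new PriorityQueue<>();
        private final int capacity;
        int minFreq = 0;     // plays the role of tiq.minfreq in the comment
        int insertions = 0;  // counts how often we paid for a heap insert

        TopFreqQueue(int capacity) {
            this.capacity = capacity;
        }

        void insert(int freq) {
            insertions++;
            heap.add(freq);
            if (heap.size() > capacity) {
                heap.poll(); // evict the smallest entry
            }
            if (heap.size() == capacity) {
                minFreq = heap.peek(); // threshold only matters once full
            }
        }
    }

    /**
     * Feeds every frequency through the queue and reports the number of
     * heap insertions. skipTies=false models "freq < minfreq",
     * skipTies=true models "freq <= minfreq".
     */
    static int countInsertions(int[] freqs, int topN, boolean skipTies) {
        TopFreqQueue tiq = new TopFreqQueue(topN);
        for (int freq : freqs) {
            boolean skip = skipTies ? freq <= tiq.minFreq : freq < tiq.minFreq;
            if (skip) {
                continue;
            }
            tiq.insert(freq);
        }
        return tiq.insertions;
    }

    public static void main(String[] args) {
        int[] uniqueKeyLike = new int[1000];
        Arrays.fill(uniqueKeyLike, 1); // every term occurs in exactly one document
        System.out.println("strict <  : " + countInsertions(uniqueKeyLike, 10, false));
        System.out.println("ties   <= : " + countInsertions(uniqueKeyLike, 10, true));
    }
}
```

With 1000 terms of frequency 1 and a queue of 10, the strict comparison performs 1000 insertions and the non-strict one performs 10. Terms that tie with the current minimum would not improve the top-N list anyway, so skipping them only changes which of the equally frequent terms are reported.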
> Schema Browser does not scale with large indexes
> ------------------------------------------------
>
>                 Key: SOLR-1931
>                 URL: https://issues.apache.org/jira/browse/SOLR-1931
>             Project: Solr
>          Issue Type: Improvement
>          Components: web gui
>   Affects Versions: 3.6, 4.0
>            Reporter: Lance Norskog
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-1931-3x.patch, SOLR-1931-3x.patch, SOLR-1931-trunk.patch, SOLR-1931-trunk.patch
>
>
> The Schema Browser JSP by default causes the Luke handler to "scan the world". In large indexes this makes the UI useless.
> On an index with 64m documents & 8gb of disk space, the Schema Browser took 6 minutes to open and hogged all disk I/O, making Solr useless.