[ 
https://issues.apache.org/jira/browse/SOLR-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178608#comment-13178608
 ] 

Erick Erickson commented on SOLR-1931:
--------------------------------------

bq: why is it still 39 seconds?

Histograms and collecting the top N terms by frequency. Still gotta spin 
through all the terms to collect either statistic. Take that bit out and the 
response is less than 0.5 seconds.

39 seconds isn't bad at all for an index this size, and one can still specify 
particular fields of interest if the index is more complex than this one. I can 
probably be argued out of their importance although it'll take a little doing. 
This is really for, from my perspective, troubleshooting at a high level and 
that information is valuable.

Besides, I *told* you I had to look it over after a while. I just saw something 
horribly trivial that cuts it down to 15 seconds. There's a loop where, after 
the histo stuff is collected, we test to see if the current term frequency is 
above the threshold of the already-collected items.... and changing it from

if (freq < tiq.minfreq) continue;
to, essentially, 
if (freq <= tiq.minfreq) continue;

means that the pathological case of inserting every last <uniqueKey> in the 
tracking priority queue doesn't happen. Siiigggh.
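
For anyone who wants to see why that one character matters, here is a minimal sketch, not the actual patch. The class name, method name, and the use of java.util.PriorityQueue are all assumptions made for illustration; only the comparison against a tracked minimum frequency comes from the change described above. It counts how many terms get pushed into a bounded top-N queue when every term has the same frequency, which is what a <uniqueKey> field looks like (each term appears in exactly one document).

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical sketch of a bounded "top N terms by frequency" collector.
// Not the Solr code; names and structure are assumed for illustration.
class TopTermsSketch {

    // Returns how many terms were pushed into the queue.
    // capacity is assumed to be at least 1.
    static int countInsertions(int[] freqs, int capacity, boolean skipTies) {
        // Min-heap: the head is the lowest frequency currently tracked.
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        int minFreq = 0; // stays 0 until the queue has overflowed once
        int insertions = 0;

        for (int freq : freqs) {
            // skipTies == false models "freq < minFreq",
            // skipTies == true models "freq <= minFreq".
            boolean skip = skipTies ? (freq <= minFreq) : (freq < minFreq);
            if (skip) {
                continue;
            }
            queue.add(freq);
            insertions++;
            if (queue.size() > capacity) {
                queue.poll();           // drop the lowest-frequency entry
                minFreq = queue.peek(); // new threshold for later terms
            }
        }
        return insertions;
    }

    public static void main(String[] args) {
        // Simulate a uniqueKey-like field: every term has frequency 1.
        int[] freqs = new int[1000];
        Arrays.fill(freqs, 1);

        System.out.println("with <  : " + countInsertions(freqs, 10, false));
        System.out.println("with <= : " + countInsertions(freqs, 10, true));
    }
}
```

With the strict comparison, a term that merely ties the current minimum is still inserted and then something gets evicted, so a field full of frequency-1 terms churns the queue once per term. With <=, ties are skipped as soon as the queue has filled.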

Oh, and the patch I'll attach in a couple of minutes actually compiles. I half 
cleaned up the stupid recordDocCount parameter by removing the definition, but 
not the code that gets it from the parameters. Fella has to go to sleep more often.

Also, this index is a little peculiar in that many of the fields have only a 
very few values so YMMV.


                
> Schema Browser does not scale with large indexes
> ------------------------------------------------
>
>                 Key: SOLR-1931
>                 URL: https://issues.apache.org/jira/browse/SOLR-1931
>             Project: Solr
>          Issue Type: Improvement
>          Components: web gui
>    Affects Versions: 3.6, 4.0
>            Reporter: Lance Norskog
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-1931-3x.patch, SOLR-1931-3x.patch, 
> SOLR-1931-trunk.patch, SOLR-1931-trunk.patch
>
>
> The Schema Browser JSP by default causes the Luke handler to "scan the 
> world". In large indexes this makes the UI useless.
> On an index with 64m documents & 8gb of disk space, the Schema Browser took 6 
> minutes to open and hogged all disk I/O, making Solr useless.


        
