[jira] [Commented] (SOLR-3443) Optimize hunspell dictionary loading with multiple cores
[ https://issues.apache.org/jira/browse/SOLR-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052878#comment-15052878 ]

Gus Heck commented on SOLR-3443:

I was working on something else and thinking about memory consistency when it occurred to me that this patch might need a couple of tweaks to the Dictionary class to ensure that its loading *happens before* any lookups... unless there is some point in the overall Solr initialization phase that ensures that request handling threads and the core initialization threads all lock and unlock the same monitor before requests are handled? Does that exist somewhere? Memory consistency seems like something that must have already been thought about... I will think more and look at it tonight.

In any case this should not affect the general resource sharing patch in SOLR-8349, unless I decide to add further _caveat emptor_ warnings to the javadoc :).

> Optimize hunspell dictionary loading with multiple cores
>
>                 Key: SOLR-3443
>                 URL: https://issues.apache.org/jira/browse/SOLR-3443
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Luca Cavanna
>         Attachments: SOLR-3443.patch, Screen Shot 2015-11-29 at 9.52.06 AM.png
>
> The Hunspell dictionary is loaded into memory. Each core using
> hunspell loads its own dictionary, even if all the cores are using the
> same dictionary files. As a result, the same dictionary is loaded into memory
> multiple times, once for each core. I think we should share those
> dictionaries between all cores in order to optimize memory usage. For
> example, if a dictionary takes 20MB of memory (this is what I
> measured) and you have 20 cores, you are going to use 400MB just for
> dictionaries, which doesn't seem like a good idea to me.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
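The happens-before concern raised in the comment above can be illustrated with a minimal Java sketch. It is not taken from the SOLR-3443 patch and does not use the real Lucene Dictionary class: PublishedDictionary, DictionaryHolder and HappensBeforeDemo are hypothetical names invented for illustration. The sketch assumes the dictionary is immutable once loaded, keeps its state in a final field, and publishes the reference through a volatile field, so that a thread reading the reference also sees the fully loaded contents.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a loaded dictionary; NOT the Lucene Dictionary class.
final class PublishedDictionary {
    // Final field, fully populated in the constructor and never modified afterwards.
    private final Map<String, String> stems;

    PublishedDictionary(Map<String, String> entries) {
        // Defensive copy so later changes to the caller's map cannot leak in.
        this.stems = Collections.unmodifiableMap(new HashMap<>(entries));
    }

    String lookup(String word) {
        return stems.get(word);
    }
}

// Hypothetical holder: the volatile write in publish() happens-before any
// later volatile read in get() that observes the published reference.
final class DictionaryHolder {
    private static volatile PublishedDictionary current;

    static void publish(PublishedDictionary dictionary) {
        current = dictionary;
    }

    static PublishedDictionary get() {
        return current;
    }
}

final class HappensBeforeDemo {
    // Loads on one thread ("core initialization") and looks up on the
    // calling thread ("request handling").
    static String loadThenLookup(String word) {
        Thread loader = new Thread(() -> {
            Map<String, String> entries = new HashMap<>();
            entries.put("walking", "walk");
            DictionaryHolder.publish(new PublishedDictionary(entries));
        });
        loader.start();

        // Wait until the reference becomes visible through the volatile read.
        PublishedDictionary dictionary;
        while ((dictionary = DictionaryHolder.get()) == null) {
            Thread.yield();
        }
        return dictionary.lookup(word);
    }

    public static void main(String[] args) {
        System.out.println(loadThenLookup("walking"));
    }
}
```

Other publication routes give the same kind of guarantee, for example storing the dictionary in a ConcurrentHashMap or handing it over under a lock that both threads acquire. Whether Solr's core initialization already provides such an edge is exactly the open question in the comment; the sketch does not answer it.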
[jira] [Commented] (SOLR-3443) Optimize hunspell dictionary loading with multiple cores
[ https://issues.apache.org/jira/browse/SOLR-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269573#comment-13269573 ]

Luca Cavanna commented on SOLR-3443:

The first thing I have in mind is a static map containing all loaded dictionaries, keyed by some kind of unique identifier, so that the same dictionary can be reused between cores. But my question is: is there a mechanism to share objects between cores in Solr? Is this the first time someone needs to share something between multiple cores? I'd like to hear your thoughts!
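The static-map idea above could look roughly like the following Java sketch. It is only an illustration under stated assumptions, not the attached patch: LoadedDictionary and SharedDictionaryCache are hypothetical names, the "unique identifier" is assumed to be built from the affix and dictionary file names, and the actual parsing of the Hunspell files is replaced by a placeholder. The LOADS counter exists only to show how often a load happens.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a parsed Hunspell dictionary; NOT the Lucene class.
final class LoadedDictionary {
    final String key;

    LoadedDictionary(String key) {
        this.key = key;
    }
}

// Hypothetical JVM-wide cache: one dictionary instance per identifier,
// shared by every core that asks for the same files.
final class SharedDictionaryCache {
    private static final Map<String, LoadedDictionary> CACHE = new ConcurrentHashMap<>();

    // Counts real loads; for demonstration only.
    static final AtomicInteger LOADS = new AtomicInteger();

    // Assumption: the file names identify a dictionary. A real key would
    // probably need resolved paths and any options that affect parsing.
    static String keyFor(String affixFile, String... dictionaryFiles) {
        return affixFile + "|" + String.join(",", dictionaryFiles);
    }

    // computeIfAbsent runs the loader at most once per key, even when
    // several cores initialize concurrently.
    static LoadedDictionary getOrLoad(String key) {
        return CACHE.computeIfAbsent(key, k -> {
            LOADS.incrementAndGet();
            return new LoadedDictionary(k); // placeholder for the expensive load
        });
    }

    public static void main(String[] args) {
        String key = keyFor("en_US.aff", "en_US.dic");
        LoadedDictionary core1 = getOrLoad(key);
        LoadedDictionary core2 = getOrLoad(key);
        System.out.println(core1 == core2);
        System.out.println(LOADS.get());
    }
}
```

A cache like this never releases its entries, so a dictionary stays in memory after the last core using it is unloaded. Reference counting or weak references would be needed to address that, and they are left out of the sketch.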
[jira] [Commented] (SOLR-3443) Optimize hunspell dictionary loading with multiple cores
[ https://issues.apache.org/jira/browse/SOLR-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269607#comment-13269607 ]

Chris Male commented on SOLR-3443:

Good suggestion Luca, this is a good way to save some memory. With LUCENE-2510 I'm going to move all the analysis factories into the analysis module, so we will need a way to share dictionaries across multiple Factory instances, not just Solr cores.