[jira] [Commented] (SOLR-3443) Optimize hunspell dictionary loading with multiple cores

2015-12-11 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15052878#comment-15052878
 ] 

Gus Heck commented on SOLR-3443:


I was working on something else and thinking about memory consistency when it 
occurred to me that this patch might need a couple of tweaks to the Dictionary 
class to ensure that its loading *happens before* any lookups, unless some 
point in the overall Solr initialization phase ensures that request handling 
threads and the core initialization threads all lock and unlock the same 
monitor before requests are handled. Does that exist somewhere? Memory 
consistency seems like something that must have already been thought about. 
I will think more and look at it tonight.

In any case this should not affect the general resource sharing patch in 
SOLR-8349 unless I decide to add further _caveat emptor_ warnings to the 
javadoc :).
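
To make the *happens before* concern concrete, here is a minimal sketch of one 
way a loaded dictionary could be safely published to request threads without 
relying on a shared monitor. It is only an illustration under assumptions: 
LoadedDictionary and DictionaryHolder are hypothetical names, not the Lucene 
Dictionary class and not code from the patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a loaded dictionary; not the Lucene Dictionary class. */
final class LoadedDictionary {
    // Final field: as long as `this` does not escape the constructor, any thread
    // that obtains a reference to the instance sees the map as it was populated here.
    private final Map<String, String> stems;

    LoadedDictionary(Map<String, String> entries) {
        // Defensive copy so later changes to `entries` cannot leak in.
        this.stems = Collections.unmodifiableMap(new HashMap<>(entries));
    }

    String lookup(String word) {
        return stems.get(word);
    }
}

/** Hypothetical holder that hands the dictionary from a loading thread to request threads. */
final class DictionaryHolder {
    // A write to a volatile field happens-before every later read of that field,
    // so a reader that sees a non-null reference also sees the completed load.
    private volatile LoadedDictionary dictionary;

    void load(Map<String, String> entries) {
        dictionary = new LoadedDictionary(entries);
    }

    String lookup(String word) {
        LoadedDictionary d = dictionary; // single volatile read
        if (d == null) {
            throw new IllegalStateException("dictionary not loaded yet");
        }
        return d.lookup(word);
    }
}
```

If the initialization code already passes through a common lock or a 
concurrent collection before requests are served, that would provide the same 
ordering and a holder like this would be unnecessary.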

> Optimize hunspell dictionary loading with multiple cores
>
> Key: SOLR-3443
> URL: https://issues.apache.org/jira/browse/SOLR-3443
> Project: Solr
> Issue Type: Improvement
> Reporter: Luca Cavanna
> Attachments: SOLR-3443.patch, Screen Shot 2015-11-29 at 9.52.06 AM.png
>
> The Hunspell dictionary is loaded into memory. Each core using Hunspell 
> loads its own dictionary, even if all the cores are using the same 
> dictionary files. As a result, the same dictionary is loaded into memory 
> multiple times, once for each core. I think we should share those 
> dictionaries between all cores in order to optimize memory usage. For 
> example, if a dictionary takes 20MB of memory (this is what I measured) 
> and you have 20 cores, you are going to use 400MB for dictionaries alone, 
> which doesn't seem like a good idea to me.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3443) Optimize hunspell dictionary loading with multiple cores

2012-05-07 Thread Luca Cavanna (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269573#comment-13269573
 ] 

Luca Cavanna commented on SOLR-3443:


The first thing I have in mind is a static map containing all loaded 
dictionaries, keyed by some kind of unique identifier, so that the same 
dictionary can be reused between cores.
But my question is: is there a mechanism to share objects between cores in 
Solr? Is this the first time someone has needed to share something between 
multiple cores?
I'd like to hear your thoughts!
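
A rough sketch of the static map idea might look like the following. The names 
SharedDictionaries and Dict are hypothetical, and the choice of identifier (for 
example, the dictionary and affix file paths joined together) is an assumption 
rather than something settled in this issue.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical JVM-wide registry of loaded dictionaries, keyed by an identifier. */
final class SharedDictionaries {

    /** Hypothetical stand-in for the parsed dictionary data. */
    static final class Dict {
        final String id;

        Dict(String id) {
            this.id = id;
        }
    }

    // Static, so every core whose classes come from the same class loader
    // sees the same map.
    private static final ConcurrentMap<String, Dict> LOADED = new ConcurrentHashMap<>();

    private SharedDictionaries() {}

    /**
     * Returns the dictionary registered under the given id, invoking the loader
     * only if no dictionary has been stored for that id yet.
     */
    static Dict get(String id, Function<String, Dict> loader) {
        // computeIfAbsent runs the loader at most once per key, even when
        // several cores ask for the same dictionary concurrently.
        return LOADED.computeIfAbsent(id, loader);
    }
}
```

Each core would then call something like SharedDictionaries.get(id, loader) 
instead of parsing the files itself. Note that a static map is only shared 
among classes loaded by the same class loader, and that this sketch never 
evicts entries, so a dictionary stays in memory after the last core using it 
is closed.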

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (SOLR-3443) Optimize hunspell dictionary loading with multiple cores

2012-05-07 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269607#comment-13269607
 ] 

Chris Male commented on SOLR-3443:

Good suggestion Luca, this is a good way to save some memory. With LUCENE-2510 
I'm going to move all the analysis factories into the analysis module, so we 
will need a way to share dictionaries across multiple Factory instances, not 
just Solr cores.
