[
https://issues.apache.org/jira/browse/SOLR-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481296#comment-13481296
]
Erick Erickson commented on SOLR-1293:
--------------------------------------
Well, I think this JIRA will finally get some action...
Jose:
The availability of any particular feature is best tracked by the JIRA
ticket itself. The "fix version" is usually the earliest _possible_ fix; the
code isn't really in the code line until the resolution is something like
"fixed".
All:
OK, I'm thinking along these lines. I've started implementation, but wanted to
open up the discussion in case I'm going down the wrong path.
Assumption:
1> For installations with multiple thousands of cores, provision has to be made
for some kind of administrative process, probably an RDBMS, that really
maintains this information.
So here's a brief outline of the approach I'm thinking about.
1> Add an additional optional attribute to the <cores> entry in solr.xml,
LRUCacheSize=# (what's a sensible default?).
2> Implement SOLR-1306, allow a data provider to be specified in solr.xml that
gives back core descriptions, something like: <coreDescriptorProvider
class="com.foo.FooDataProvider" [attr="val"]/> (don't quite know what attrs we
want, if any).
3> Add two optional attributes to individual <core> entries
a> sticky="true|false". Default to true. Any cores marked sticky would
never be aged out; essentially, treat them just as cores are treated now.
b> loadOnStartup="true|false", default to true.
4> So the process of getting a core would be something like:
a> Check the normal list, just like now. If a core was found, return it.
b> Check the LRU list. If a core was found, return it.
c> Ask the dataprovider (if defined) for the core descriptor. Create the
core and put it in the LRU list.
d> Remove any core entries over the LRU limit. Any hints on the right cache
to use? There's the Lucene LRUCache, ConcurrentLRUCache, and the LRUHashMap in
Lucene (which I can't find in any of the compiled jars...). I've got to close
the core as it's removed. It _looks_ like I can use ConcurrentLRUCache and
add a listener to close the core when it's removed from the list.
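Pulling points 1-3 together, a solr.xml along these lines is what I have in
mind (attribute names are as proposed above, nothing here is committed; the
provider class and core names are just placeholders):

```xml
<solr persistent="true">
  <!-- Point 1: optional LRU limit on transient cores; default still open -->
  <cores adminPath="/admin/cores" LRUCacheSize="100">
    <!-- Point 2: optional provider that supplies core descriptors on demand -->
    <coreDescriptorProvider class="com.foo.FooDataProvider"/>
    <!-- Point 3a: sticky core, never aged out; loaded at startup as today -->
    <core name="core0" instanceDir="core0" sticky="true" loadOnStartup="true"/>
    <!-- Non-sticky core: created on demand, eligible for LRU eviction -->
    <core name="core1" instanceDir="core1" sticky="false" loadOnStartup="false"/>
  </cores>
</solr>
```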
Processing-wise, in the usual case this would cost an extra check each time a
core was fetched. If <a> above failed, we would have to see if the dataprovider
was defined before returning null. I don't think that's onerous; the rest of
the costs would only be incurred when a dataprovider _did_ exist.
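The lookup order in 4a-4d might look roughly like this. This is a minimal,
single-threaded sketch, not Solr's actual API: the class, the Core and
CoreDescriptorProvider interfaces, and the use of an access-ordered
LinkedHashMap in place of ConcurrentLRUCache are all illustrative assumptions
(real code would need thread safety and a proper eviction listener):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the core-lookup order; names are illustrative only.
public class TransientCoreSketch {

    interface Core { void close(); }

    interface CoreDescriptorProvider {
        Core createCore(String name); // returns null if the core is unknown
    }

    private final Map<String, Core> permanentCores = new LinkedHashMap<>();
    private final CoreDescriptorProvider provider;
    private final Map<String, Core> lruCores;

    TransientCoreSketch(final int lruCacheSize, CoreDescriptorProvider provider) {
        this.provider = provider;
        // accessOrder=true gives LRU iteration order; entries over the
        // limit are closed and dropped, as in step d.
        this.lruCores = new LinkedHashMap<String, Core>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Core> eldest) {
                if (size() > lruCacheSize) {
                    eldest.getValue().close(); // close the core as it's aged out
                    return true;
                }
                return false;
            }
        };
    }

    Core getCore(String name) {
        Core core = permanentCores.get(name);   // a: the normal list
        if (core != null) return core;
        core = lruCores.get(name);              // b: the LRU list
        if (core != null) return core;
        if (provider == null) return null;      // no dataprovider defined
        core = provider.createCore(name);       // c: ask the dataprovider
        if (core != null) lruCores.put(name, core);
        return core;                            // d: eviction happens on put
    }
}
```

Note that the extra cost in the usual case is just the failed LRU-map probe in
step b; the provider is only consulted when both lists miss.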
But one design decision here is along these lines: what to do with persistence
and stickiness? Specifically, if the coreDescriptorProvider gives us a core
from, say, an RDBMS, should we allow that core to be persisted into the
solr.xml file if persist="true" is set in solr.xml? I'm thinking that we
can make this all work with maximum flexibility if we allow the
coreDescriptorProvider to tell us whether we should persist any core currently
loaded....
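One possible shape for that contract is below. Everything here is a
hypothetical sketch, not a committed API: the interface name, the
CoreDescription holder, and the shouldPersist method are all assumptions
about how the provider could own the persistence decision:

```java
// Hypothetical provider contract; names are illustrative only.
public interface CoreDescriptorProviderSketch {

    // Minimal description of a core: its name plus where its config/data live.
    final class CoreDescription {
        final String name;
        final String instanceDir;
        CoreDescription(String name, String instanceDir) {
            this.name = name;
            this.instanceDir = instanceDir;
        }
    }

    // Return the descriptor for a core, or null if this provider
    // doesn't know about it.
    CoreDescription getCoreDescription(String coreName);

    // The provider decides: a core backed by, say, an RDBMS would typically
    // return false so it is never written into solr.xml, even when
    // persist="true" is set.
    boolean shouldPersist(String coreName);
}
```

The point of pushing shouldPersist into the provider is that the container
doesn't have to guess where a core came from; the thing that sourced the
descriptor also decides whether solr.xml should record it.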
Anyway, I'll be fleshing this out over the next little while, anybody want to
weigh in?
Erick
> Support for large no:of cores and faster loading/unloading of cores
> -------------------------------------------------------------------
>
> Key: SOLR-1293
> URL: https://issues.apache.org/jira/browse/SOLR-1293
> Project: Solr
> Issue Type: New Feature
> Components: multicore
> Reporter: Noble Paul
> Fix For: 4.1
>
> Attachments: SOLR-1293.patch
>
>
> Solr, currently, is not very suitable for a large number of homogeneous cores
> where you require fast/frequent loading/unloading of cores. Usually a core
> is required to be loaded just to fire a search query or to index one
> document.
> The requirements of such a system are:
> * Very efficient loading of cores. Solr cannot afford to read, parse, and
> create Schema and SolrConfig objects for each core every time the core has to
> be loaded (SOLR-919, SOLR-920)
> * START/STOP a core. Currently it is only possible to unload a core (SOLR-880)
> * Automatic loading of cores. If a core is present but not loaded and
> a request comes for it, load it automatically before serving the request
> * As there are a large number of cores, they cannot all be kept loaded
> at all times. There has to be an upper limit beyond which we need to unload a
> few cores (probably the least recently used ones)
> * Automatic allotment of dataDir for cores. If the number of cores is too
> high, all the cores' dataDirs cannot live in the same dir. There is an upper
> limit on the number of dirs you can create in a unix dir w/o affecting
> performance
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]