[
https://issues.apache.org/jira/browse/SOLR-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496370#comment-13496370
]
Erick Erickson commented on SOLR-1306:
--------------------------------------
Well, the use case here is explicitly that the core information is kept in a
completely extra-solr repository (extra ZK too for that matter). Managing 100K
cores by moving directories around is non-trivial, especially since there will
probably be some system-of-record for where all the information lives anyway.
As it stands, this patch doesn't really affect the way Solr works OOB. It only
comes into play if the people implementing the provider _require_ it (and want
to implement the complexity).
But let me think about this a bit. Are you suggesting that the whole notion of
solr.xml be replaced by some kind of crawl/discovery process? Off the top of my
head, I can imagine a degenerate solr.xml that just lists one or more
directories. Then the load process consists of crawling those directories
looking for cores and loading them, possibly with some kind of configuration
files at the core level. For the 10s of K cores/machine case we don't want to
put the data in solrconfig.xml or anything like that; I'm thinking of something
very much simpler, on the order of a Java properties file. For now I've skipped
thinking about how to "find a core" or how that plays with using common
schemas, just to see whether this is along the lines of what you mean by
"getting meta-data closer to the index".
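To make the crawl/discovery idea concrete, here is a rough sketch, not Solr code. It assumes each core directory carries a small properties file; the file name "core.properties", the "name" key, and the class and method names are all made up for illustration. The degenerate solr.xml would only need to supply the list of root directories.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustration only: crawl root directories and treat every marker file as a core. */
public class CoreDiscoverySketch {

    // Hypothetical marker file name; nothing in the patch defines this.
    static final String MARKER = "core.properties";

    /** One discovered core: its directory plus the properties found next to it. */
    static final class DiscoveredCore {
        final Path instanceDir;
        final Properties props;

        DiscoveredCore(Path instanceDir, Properties props) {
            this.instanceDir = instanceDir;
            this.props = props;
        }

        /** Core name from the (assumed) "name" key, defaulting to the directory name. */
        String name() {
            return props.getProperty("name", instanceDir.getFileName().toString());
        }
    }

    /** Walks each root and returns one entry per marker file, in sorted path order. */
    static List<DiscoveredCore> discover(List<Path> roots) throws IOException {
        List<DiscoveredCore> cores = new ArrayList<>();
        for (Path root : roots) {
            List<Path> markers;
            try (Stream<Path> walk = Files.walk(root)) {
                markers = walk
                        .filter(p -> p.getFileName() != null
                                && p.getFileName().toString().equals(MARKER))
                        .sorted()
                        .collect(Collectors.toList());
            }
            for (Path marker : markers) {
                Properties props = new Properties();
                try (InputStream in = Files.newInputStream(marker)) {
                    props.load(in);
                }
                cores.add(new DiscoveredCore(marker.getParent(), props));
            }
        }
        return cores;
    }
}
```

With something like this, what is on disk is the only source of truth: a directory with a marker file is a core, and a directory without one is not.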
It does make the whole coordination issue a lot easier, though. You no longer
have the loose coupling between having core information in solr.xml and then
having to be sure the files/dirs corresponding to what's in solr.xml "just
happen" to map to what's actually on disk.... Moving something from one place
to another would consist of
1> shutting down the servers
2> moving the core directory from one server to another
3> starting up the servers again.
I can imagine doing this a bit differently...
1> copy the core from one server to another
2> issue an unload for the core on the source server
3> issue a create for the core on the dest server
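As a sketch of steps 2 and 3 of that second sequence, the snippet below only builds the CoreAdmin request URLs (action=UNLOAD with the core parameter, action=CREATE with name and instanceDir) and sends nothing. The host names, paths, and class and method names in it are hypothetical, and the copy in step 1 is assumed to happen out of band.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

/** Illustration only: builds the CoreAdmin URLs for moving a core, without sending them. */
public class CoreMoveSketch {

    /** URL-encodes a single query parameter value. */
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Step 2: unload the core on the source server. */
    static String unloadUrl(String solrBaseUrl, String core) {
        return solrBaseUrl + "/admin/cores?action=UNLOAD&core=" + enc(core);
    }

    /** Step 3: create the core on the destination server from the copied directory. */
    static String createUrl(String solrBaseUrl, String core, String instanceDir) {
        return solrBaseUrl + "/admin/cores?action=CREATE&name=" + enc(core)
                + "&instanceDir=" + enc(instanceDir);
    }

    /** The two requests, in the order they would be issued after the copy finishes. */
    static List<String> movePlan(String srcBaseUrl, String dstBaseUrl,
                                 String core, String dstInstanceDir) {
        return Arrays.asList(
                unloadUrl(srcBaseUrl, core),
                createUrl(dstBaseUrl, core, dstInstanceDir));
    }
}
```

For example, movePlan("http://src:8983/solr", "http://dst:8983/solr", "core42", "/var/solr/core42") yields the unload request for the source followed by the create request for the destination.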
There'd probably have to be some kind of background loading, but we're already
talking about parallelizing multicore loads...
From an admin perspective, the poor soul trying to maintain this all could
pretty easily enumerate where all the cores were just by asking each server
for a list of where things are.
Anyway, is this in the vicinity of "moving the metadata closer to the index"?
> Support pluggable persistence/loading of solr.xml details
> ---------------------------------------------------------
>
> Key: SOLR-1306
> URL: https://issues.apache.org/jira/browse/SOLR-1306
> Project: Solr
> Issue Type: New Feature
> Components: multicore
> Reporter: Noble Paul
> Assignee: Erick Erickson
> Fix For: 4.1
>
> Attachments: SOLR-1306.patch, SOLR-1306.patch, SOLR-1306.patch,
> SOLR-1306.patch
>
>
> Persisting and loading details from one xml is fine if the no:of cores are
> small and the no:of cores are few/fixed . If there are 10's of thousands of
> cores in a single box adding a new core (with persistent=true) becomes very
> expensive because every core creation has to write this huge xml.
> Moreover , there is a good chance that the file gets corrupted and all the
> cores become unusable . In that case I would prefer it to be stored in a
> centralized DB which is backed up/replicated and all the information is
> available in a centralized location.
> We may need to refactor CoreContainer to have a pluggable implementation
> which can load/persist the details . The default implementation should
> write/read from/to solr.xml . And the class should be pluggable as follows in
> solr.xml
> {code:xml}
> <solr>
> <dataProvider class="com.foo.FooDataProvider" attr1="val1" attr2="val2"/>
> </solr>
> {code}
> There will be a new interface (or abstract class ) called SolrDataProvider
> which this class must implement