On Tue, Jan 12, 2016 at 2:32 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 1/12/2016 6:05 AM, Tom Evans wrote:
>> Hi all, trying to move our Solr 4 setup to SolrCloud (5.4). Having
>> some problems with a DIH config that attempts to load an XML file and
>> iterate through the nodes in that file, it trys to load the file from
>> disk instead of from zookeeper.
>>
>> <entity
>>     dataSource="lookup_conf"
>>     rootEntity="false"
>>     name="lookups"
>>     processor="XPathEntityProcessor"
>>     url="lookup_conf.xml"
>>     forEach="/lookups/lookup">
>>
>> The file exists in zookeeper, adjacent to the data_import.conf in the
>> lookups_config conf folder.
>
> SolrCloud puts all the *config* for Solr into zookeeper, and adds a new
> abstraction for indexes (the collection), but other parts of Solr like
> DIH are not really affected. The entity processors in DIH cannot
> retrieve data from zookeeper. They do not know how.

That makes no sense whatsoever. DIH loads the data_import.conf from ZK
just fine, or is that provided to DIH from another module that does
know about ZK?

Either way, it is entirely sub-optimal to have SolrCloud store "all"
its configuration in ZK, but still require manually storing and
updating files on specific nodes in order to influence DIH. If a
server is mistakenly not updated, or manually modified locally on
disk, that node would start indexing documents differently than other
replicas, which sounds dangerous and scary!

If there is not a ZkFileDataSource, it shouldn't be too tricky to add
one... I'll see how much I dislike having config files on the host...

Cheers

Tom