Hi,

For our project, we need to store Solr collections on HDFS. While going
through the documentation, I found the Lucidworks guide (
https://doc.lucidworks.com/lucidworks-hdpsearch/3.0.0/Guide-Install-Manual.html#hdfs-specific-changes)
, which mentions that the Solr start script accepts a number of arguments
at startup. The example given is:

bin/solr start -c \
   -z 10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181/solr \
   -Dsolr.directoryFactory=HdfsDirectoryFactory \
   -Dsolr.lock.type=hdfs \
   -Dsolr.hdfs.home=hdfs://sandbox.hortonworks.com:8020/user/solr


What does it actually mean to pass directoryFactory settings to the Solr
start script? I thought the directory factory was a per-collection setting,
i.e. something we specify *only* in that collection's solrconfig.xml file.

When the above settings are passed to the start script, does that mean
that Solr will store the index of every newly created collection in HDFS?
And what happens if I upload a solrconfig.xml to ZK that contradicts this
and specifies NRTCachingDirectoryFactory instead? Given the above start
command, should / could I leave the directory factory section out of my
solrconfig.xml, on the assumption that collections will be stored on HDFS
*by default*?
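
To make the question concrete: the example configsets shipped with Solr
declare the directory factory with a ${property:default} placeholder rather
than a fixed class name, so my guess (which I would like confirmed) is that
the -D values only take effect where solrconfig.xml contains such a
placeholder. Below is a rough sketch of how one might check a given config;
the sample file contents and the check_directory_factory helper are
illustrative assumptions of mine, not something taken from the Lucidworks
guide.

```shell
#!/bin/sh
# Illustrative sample only: a one-line fragment modelled on the
# directoryFactory declaration in Solr's example configsets.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<config>
  <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
</config>
EOF

# Hypothetical helper: prints "placeholder" if the class attribute reads the
# solr.directoryFactory system property, "hard-coded" otherwise.
check_directory_factory() {
  if grep -q 'class="\${solr\.directoryFactory:' "$1"; then
    echo "placeholder"
  else
    echo "hard-coded"
  fi
}

result=$(check_directory_factory "$conf")
echo "$result"
rm -f "$conf"
```

If my understanding is right, a "hard-coded" result would mean the
collection ignores whatever -Dsolr.directoryFactory was passed at startup.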

I find this confusing and would appreciate the community's expert advice.

Thanks
