I prefer a single HDFS home since it definitely simplifies things. There's
no need to create a folder per node, or to add new folders when you add
nodes to the cluster. The replicas underneath will get their own folders
automatically. I also don't know whether autoAddReplicas or other types of
failovers behave correctly when nodes use different home folders.
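For illustration, a shared-home setup (Kyle's option b) would mean starting every node with the same solr.hdfs.home. This is only a sketch using the example namenode address from the original mail, not a tested command line for your cluster:

```shell
# Start a SolrCloud node backed by HDFS. Every node in the cluster
# gets the SAME solr.hdfs.home; each replica creates its own
# subdirectory under it, so no per-node paths are needed.
# (hdfs://my.hdfs:9000/solr is the example address from Kyle's mail.)
bin/solr start -c \
  -Dsolr.directoryFactory=HdfsDirectoryFactory \
  -Dsolr.lock.type=hdfs \
  -Dsolr.hdfs.home=hdfs://my.hdfs:9000/solr
```

Adding a third node is then just running the same command on the new host; nothing in HDFS has to be created ahead of time.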

I've run Solr on HDFS with the same basic configs as listed here:
https://risdenk.github.io/2018/10/23/apache-solr-running-on-apache-hadoop-hdfs.html

Kevin Risden


On Fri, Nov 2, 2018 at 1:19 PM lstusr 5u93n4 <lstusr...@gmail.com> wrote:

> Hi All,
>
> Here's a question that I can't find an answer to in the documentation:
>
> When configuring solr cloud with HDFS, is it best to:
>   a) provide a unique hdfs folder for each solr cloud instance
> or
>   b) provide the same hdfs folder to all solr cloud instances.
>
> So for example, if I have two solr cloud nodes, I can configure them either
> with:
>
>    node1: -Dsolr.hdfs.home=hdfs://my.hdfs:9000/solr/node1
>    node2: -Dsolr.hdfs.home=hdfs://my.hdfs:9000/solr/node2
>
> Or I could configure both nodes with:
>
>     -Dsolr.hdfs.home=hdfs://my.hdfs:9000/solr
>
> In the second option, all solr cloud nodes can "see" all index files from
> all other solr cloud nodes. Are there pros or cons to allowing all of the
> solr nodes to see all files in the collection?
>
> Thanks,
>
> Kyle
>
