We've been using Solr on HDFS for a while now, and I'm seeing an issue with
redundancy/reliability. If a server goes down, it will never recover on its
own when it comes back up, because of the stale lock files left behind in
HDFS. That Solr node has to be brought down manually, the lock files deleted,
and then brought back up. At that point, it appears to re-copy all of the
data for its replicas. If the index is large and new data is being indexed,
in some cases it never catches up; the replication retries over and over.
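For reference, the manual cleanup described above looks roughly like the sketch below. The HDFS home and core name are placeholder assumptions (they depend on your solr.hdfs.home setting and the replica's core name); the write.lock filename is the standard Lucene lock file, but verify the actual path on your cluster before deleting anything.

```shell
# Sketch of the manual recovery steps above: stop node, delete lock, restart.
# All paths and core names here are placeholder assumptions.

# Build the command that removes a replica's stale write.lock from HDFS,
# so it can be reviewed before being run against a live cluster.
lock_cleanup_cmd() {
  local hdfs_home="$1"   # e.g. the value of solr.hdfs.home
  local core="$2"        # e.g. collection1_shard1_replica1
  echo "hdfs dfs -rm ${hdfs_home}/${core}/data/index/write.lock"
}

# After stopping the affected Solr node:
lock_cleanup_cmd "hdfs://namenode:8020/solr" "collection1_shard1_replica1"
# ...run the printed command, then restart the node and let it rejoin.
```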
How can we make a reliable SolrCloud cluster on HDFS that can handle
servers coming and going?
Thank you!
-Joe
Solr on HDFS - Joe Obernberger