Great! Thank you!
I had a 4-shard setup with no replicas. The index size was 2.0 TB, stored
in HDFS, with each node holding approximately 500 GB of index. I added four
more shards on four other machines as replicas. One thing that happened
was that all 4 replicas ran out of memory for the HDFS block cache
(SnapPull failed: java.lang.RuntimeException: The max direct memory is
likely too low. Either increase it (by adding
-XX:MaxDirectMemorySize=<size>g -XX:+UseLargePages to your containers
startup args) or disable direct allocation using
solr.hdfs.blockcache.direct.memory.allocation=false in solrconfig.xml.
If you are putting the block cache on the heap, your java heap size
might not be large enough. Failed allocating)
I was using 160 slabs (20 GB of RAM). I dropped the config to 80
slabs and restarted the replicas. Two of the replicas came up OK, but
the other 2 have stayed in 'Recovering'. I stopped those two and
restarted them - now I have 3 OK, but one is still in Recovering.
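As a rough check on the slab numbers above, here is a minimal sketch of the arithmetic. It assumes Solr's default block cache geometry (16,384 blocks of 8 KiB per slab, i.e. 128 MiB per slab); the function names are made up for illustration and are not part of Solr.

```python
# Assumption: default HDFS block cache geometry in Solr,
# 16,384 blocks per slab at 8 KiB per block = 128 MiB per slab.
BLOCKS_PER_SLAB = 16384
BLOCK_SIZE_BYTES = 8192
SLAB_BYTES = BLOCKS_PER_SLAB * BLOCK_SIZE_BYTES


def block_cache_bytes(slab_count: int) -> int:
    """Memory the block cache tries to allocate for a given slab count."""
    return slab_count * SLAB_BYTES


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB for readability."""
    return num_bytes / 2**30


if __name__ == "__main__":
    for slabs in (160, 80):
        print(f"{slabs} slabs -> {to_gib(block_cache_bytes(slabs)):.0f} GiB")
```

With direct allocation enabled, the JVM's -XX:MaxDirectMemorySize would need to be at least this large, which is what the error message above is pointing at.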
Given that each replica does indexing as well, I was expecting the
amount of HDFS disk usage to double, but that has not happened. Once I
get the last replica to come up, I'll run some tests.
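The doubling I expected comes from simple arithmetic, sketched below using the figures from this message. The helper name is hypothetical, and treating "500G" as decimal gigabytes is an assumption. It counts logical index size only; HDFS block replication (dfs.replication) multiplies raw disk usage on top of this and is not modelled here.

```python
GB = 10**9  # assumption: decimal gigabytes
TB = 10**12


def expected_index_bytes(shards: int, bytes_per_shard: int, copies_per_shard: int) -> int:
    """Logical index size if every copy of every shard keeps its own full index."""
    return shards * bytes_per_shard * copies_per_shard


# 4 shards of roughly 500 GB each, first with a leader only, then leader plus one replica.
before = expected_index_bytes(4, 500 * GB, 1)
after = expected_index_bytes(4, 500 * GB, 2)

if __name__ == "__main__":
    print(f"leaders only:      {before / TB:.1f} TB")
    print(f"leaders + replica: {after / TB:.1f} TB")
```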
-Joe
On 2/26/2015 10:45 AM, Mark Miller wrote:
I’ll be working on this at some point:
https://issues.apache.org/jira/browse/SOLR-6237
- Mark
http://about.me/markrmiller
On Feb 25, 2015, at 2:12 AM, longsan <longsan...@sina.com> wrote:
We use HDFS as our Solr index storage and we have a really heavy update
load. We have run into many problems with the current leader/replica
solution. The replica side duplicates the indexing work, and data sync
between leader and replica is always a problem.
Since HDFS already provides replication at the data layer, could Solr
provide replication at the service layer only?
My thought is that the leader and the replica all bind to the same index
data directory. The leader builds the index for new requests, and the
replica just keeps its index version up to date with the leader (for
example, via a periodic soft commit?). If the leader is lost, the replica
takes over immediately.
Thanks for any suggestions on this idea.
--
View this message in context:
http://lucene.472066.n3.nabble.com/New-leader-replica-solution-for-HDFS-tp4188735.html
Sent from the Solr - User mailing list archive at Nabble.com.