Re: New leader/replica solution for HDFS

Joseph Obernberger Thu, 26 Feb 2015 10:07:46 -0800

Great!  Thank you!

I had a 4 shard setup - no replicas. Index size was 2.0TBytes stored in HDFS with each node having approximately 500G of index. I added four more shards on four other machines as replicas. One thing that happened was the 4 replicas all ran out of HDFS cache size:

(SnapPull failed: java.lang.RuntimeException: The max direct memory is likely too low. Either increase it (by adding -XX:MaxDirectMemorySize=<size>g -XX:+UseLargePages to your containers startup args) or disable direct allocation using solr.hdfs.blockcache.direct.memory.allocation=false in solrconfig.xml. If you are putting the block cache on the heap, your java heap size might not be large enough. Failed allocating)

I was using 160 slabs (20GBytes of RAM). I dropped the config to 80 slabs and restarted the replicas. Two of the replicas came up OK, but the other 2 have stayed in 'Recovering'. I stopped those two and restarted them - now I have 3 OK, but one is still in Recovering.
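
As a rough check on the slab numbers above, here is a minimal sketch of the arithmetic. It assumes the default HDFS block cache slab size of 128 MB (16384 blocks of 8 KB each), which is not stated in this thread; the function names are made up for illustration. Under that assumption, 160 slabs is 20 GB of direct memory and 80 slabs is 10 GB, so -XX:MaxDirectMemorySize would need to be at least that large per node.

```python
# Assumption: one slab = 16384 blocks x 8 KiB = 128 MiB (the default slab size).
# Adjust SLAB_BYTES if your block cache is configured differently.
SLAB_BYTES = 16384 * 8 * 1024


def block_cache_bytes(slab_count, slab_bytes=SLAB_BYTES):
    """Total memory the block cache tries to allocate for slab_count slabs."""
    return slab_count * slab_bytes


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 1024 ** 3


if __name__ == "__main__":
    for slabs in (160, 80):
        print(f"{slabs} slabs -> {to_gib(block_cache_bytes(slabs)):.1f} GiB")
```

If the block cache is allocated off-heap, the result is a lower bound for the direct memory limit; if it is moved onto the heap instead, the same amount has to fit in the Java heap alongside everything else.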

Given that each replica does indexing as well, I was expecting the amount of HDFS disk usage to double, but that has not happened. Once I get the last replica to come up, I'll run some tests.


-Joe

On 2/26/2015 10:45 AM, Mark Miller wrote:

I’ll be working on this at some point: 
https://issues.apache.org/jira/browse/SOLR-6237

- Mark

http://about.me/markrmiller

On Feb 25, 2015, at 2:12 AM, longsan <longsan...@sina.com> wrote:

We use HDFS as our Solr index storage and we have a really heavy update
load. We have run into many problems with the current leader/replica
solution. Index computation is duplicated on the replica side, and data
sync between leader and replica is always a problem.

As HDFS already provides data replication at the data layer, could Solr
provide just service layer replication?

My thought is that the leader and the replica all bind to the same index
data directory. The leader builds the index for new requests, and the
replica just keeps its index version up to date with the leader (such as
through a periodic soft commit?). If the leader is lost, the replica takes
over its duty immediately.

Thanks for any suggestions on this idea.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/New-leader-replica-solution-for-HDFS-tp4188735.html
Sent from the Solr - User mailing list archive at Nabble.com.
