bq: And the data sync between leader/replica is always a problem

Not quite sure what you mean by this. There shouldn't need to be any syncing in the sense of the index itself being replicated; the incoming documents should be sent to each node (and indexed to HDFS) as they come in.
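To illustrate the point that documents, not index files, are what travel between nodes, here is a toy sketch in Python. It is not Solr code: the `Node` class and its methods are made up for illustration, and the "index" is just a dictionary mapping terms to document ids.

```python
from collections import defaultdict


class Node:
    """Hypothetical stand-in for one Solr core; not a real Solr API."""

    def __init__(self, name):
        self.name = name
        self.index = defaultdict(set)  # term -> ids of docs containing it
        self.replicas = []             # only populated on the leader

    def index_locally(self, doc_id, text):
        # Every node does its own analysis and indexing work.
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def add(self, doc_id, text):
        # The leader indexes the raw document, then forwards the same raw
        # document (not its index files) to each replica.
        self.index_locally(doc_id, text)
        for replica in self.replicas:
            replica.index_locally(doc_id, text)


leader = Node("leader")
replica = Node("replica")
leader.replicas.append(replica)

leader.add(1, "HDFS index storage")
leader.add(2, "heavy update load")

# Each node built an equivalent index on its own, so the replica could keep
# serving these documents even if the leader disappeared.
print(replica.index["hdfs"] == leader.index["hdfs"])
```

The indexing work is duplicated on purpose in this sketch: nothing has to be copied from the leader's index for the replica to be complete.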
bq: There is duplicate index computing on Replica side.

Yes, that's the design of SolrCloud, explicitly to provide data safety. If you instead rely on the leader to index and somehow pull that indexed form to the replica, then you will lose data if the leader goes down before sending the indexed form.

bq: My thought is that the leader and the replica all bind to the same data index directory.

This is unsafe. They would both then try to _write_ to the same index, which can easily corrupt the index, and/or all but the first one to access the index would be locked out.

All that said, HDFS triple redundancy compounded with Solr leader/replica redundancy means a bunch of extra storage. You can turn the HDFS replication down to 1, but that has other implications (for one, HDFS itself would no longer keep a redundant copy of each block).

Best,
Erick

On Tue, Feb 24, 2015 at 11:12 PM, longsan <longsan...@sina.com> wrote:
> We use HDFS as our Solr index storage and we have a really heavy update
> load. We have run into many problems with the current leader/replica
> solution. There is duplicate index computing on the replica side. And the
> data sync between leader/replica is always a problem.
>
> As HDFS already provides data replication at the data layer, could Solr
> provide just service-layer replication?
>
> My thought is that the leader and the replica all bind to the same data
> index directory. The leader would build the index for new requests, and
> the replica would just keep its index version up to date with the leader
> (such as with a periodic soft commit?). If the leader is lost, the replica
> would take over immediately.
>
> Thanks for any suggestions on this idea.
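As a footnote to the storage point in the reply above, the arithmetic can be sketched in a few lines of Python. The numbers are assumptions for illustration only: one leader plus one replica per shard, the HDFS default block replication of 3, and a hypothetical 100 GB index.

```python
def physical_copies(solr_copies_per_shard, hdfs_replication):
    """Each Solr copy of a shard writes its own index, and HDFS then
    stores every block of that index hdfs_replication times."""
    return solr_copies_per_shard * hdfs_replication


index_size_gb = 100  # hypothetical size of one copy of the index
solr_copies = 2      # assumed: one leader plus one replica

for hdfs_replication in (3, 1):
    copies = physical_copies(solr_copies, hdfs_replication)
    print(f"HDFS replication {hdfs_replication}: "
          f"{copies} physical copies, {copies * index_size_gb} GB on disk")
```

With HDFS replication at 1, the total drops to the two copies Solr maintains, at the cost of HDFS no longer holding a spare copy of any block.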