Hi everyone,

Large-scale Solr installations often require cross data-center replication,
both to reduce access latency and for disaster recovery. In the past, users
have either designed their own solutions to deal with this or have tried to
rely on the now-deprecated CDCR.


It would be really good to have support for cross data-center replication
within Solr that is offered and supported by the community. This would
allow the effort around this shared problem to converge.


I’d like to propose a new solution based on my experiences at my day job.
The key points about this approach:

   1. It uses an external, configurable messaging system in the middle for
   the actual replication/mirroring.
   2. We would offer an abstraction and some default implementations, based
   on what we can support and what users really want. Kafka would be one
   example.
   3. It would live in a separate repository, allowing it to have its own
   release cadence. We shouldn’t have to release it with every Solr release,
   as the overlap is limited to SolrJ interactions.
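
To make points 1 and 2 a bit more concrete, here is a very rough sketch of
what such an abstraction might look like. This is only an illustration and
not a proposed design: the names MirroringQueue and InMemoryMirroringQueue
are hypothetical, updates are represented as plain strings, and the
in-memory class merely stands in for a real implementation such as one
backed by Kafka.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical abstraction over the external messaging system. */
interface MirroringQueue {
    /** Called on the source side to hand an update to the queue. */
    void submit(String update);

    /** Called on the target side to receive mirrored updates. */
    void subscribe(Consumer<String> handler);
}

/**
 * In-memory stand-in used only for illustration. A default implementation
 * (for example, one backed by Kafka) would implement the same interface.
 */
class InMemoryMirroringQueue implements MirroringQueue {
    private final List<Consumer<String>> handlers = new ArrayList<>();

    @Override
    public void submit(String update) {
        // Deliver the update to every subscribed consumer, in order.
        for (Consumer<String> handler : handlers) {
            handler.accept(update);
        }
    }

    @Override
    public void subscribe(Consumer<String> handler) {
        handlers.add(handler);
    }
}
```

In a sketch like this, the source cluster would only ever call submit(), and
a consumer in the target data center would apply each received update,
presumably through SolrJ.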


I’ll soon share a more detailed, evolving design document for everyone to
contribute to. I wanted to mention this now, as I’m starting to work on it
and would like to avoid parallel efforts towards the same end goal.

-- 
Anshum Gupta
