Re: [DISCUSS] Cross Data-Center Replication in Apache Solr

Marcus Eagan Sun, 06 Dec 2020 10:12:31 -0800

Looking forward to the doc. Thanks

Marcus


On Sun, Dec 6, 2020 at 05:47 Erick Erickson <[email protected]> wrote:

> Anshum:
>
> I know I’ve been recommending something like this to clients for a while.
> Do you think a call to the community for people who’ve already put
> something in the middle might net us some good info on the lurking
> gremlins? Mind you, “recommend” hasn’t actually involved me _doing_ it,
> so I don’t have any actual experience there…
>
> But yeah, absolutely +1 for something making this easier for clients...
>
> Erick
>
> > On Dec 5, 2020, at 11:43 AM, Ilan Ginzburg <[email protected]> wrote:
> >
> > That's an interesting initiative Anshum!
> >
> > I can see at least two different approaches here; your mention of SolrJ
> > seems to hint at the first one:
> > 1. Get the data as it comes from the client and fork it to local and
> >    remote data centers,
> > 2. Create an (asynchronous) stream replicating local data center data to
> >    remote.
> >
> > Option 1 is strongly consistent but adds latency and potential
> > blocking on the critical path.
> > Option 2 could look like remote PULL replicas. It might have lower impact
> > on the local data center but has to deal with the remote data center
> > always being somewhat behind. If the client application can handle that,
> > the performance and efficiency gain (as well as a simpler implementation?
> > It doesn't require another persistence layer) might be worth it...
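The two options above can be contrasted with a minimal sketch. Everything in it is hypothetical and for illustration only: `DataCenter`, `fork_write` and `AsyncReplicator` are made-up names, not Solr or SolrJ APIs, a dict stands in for an index, and an in-memory queue stands in for whatever replication stream would actually be used. It also ignores partial failures, which a real option 1 would have to handle to stay consistent.

```python
from collections import deque


class DataCenter:
    """Hypothetical stand-in for a Solr cluster: stores documents by id."""

    def __init__(self, name):
        self.name = name
        self.docs = {}

    def index(self, doc):
        self.docs[doc["id"]] = doc


def fork_write(doc, local, remote):
    """Option 1: write to both data centers before acknowledging the client."""
    local.index(doc)
    remote.index(doc)  # the remote round trip sits on the critical path


class AsyncReplicator:
    """Option 2: acknowledge after the local write, ship to remote later."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.pending = deque()

    def write(self, doc):
        self.local.index(doc)
        self.pending.append(doc)  # remote is now behind by len(pending) docs

    def drain(self):
        """Replay queued updates on the remote in order; return how many."""
        shipped = 0
        while self.pending:
            self.remote.index(self.pending.popleft())
            shipped += 1
        return shipped
```

With `fork_write` both data centers hold the document as soon as the call returns. With `AsyncReplicator.write` the remote only catches up once `drain()` runs, which is the "somewhat behind" window the client application would have to tolerate.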
> >
> > Ilan
> >
> > On Fri, Dec 4, 2020 at 5:24 PM Anshum Gupta <[email protected]> wrote:
> > Hi everyone,
> >
> > Large scale Solr installations often require cross data-center
> > replication, both to reduce access latency and for disaster recovery.
> > In the past, users have either designed their own solutions to deal with
> > this or have tried to rely on the now-deprecated CDCR.
> >
> > It would be really good to have support for cross data-center
> > replication within Solr that is offered and supported by the community.
> > This would allow the effort around this shared problem to converge.
> >
> > I’d like to propose a new solution based on my experiences at my day
> > job. The key points about this approach:
> >       • Uses an external, configurable messaging system in the middle
> >         for actual replication/mirroring.
> >       • We offer an abstraction and some default implementations based
> >         on what we can support and what users really want. An example
> >         here would be Kafka.
> >       • This would be a separate repository, allowing it to have its own
> >         release cadence. We shouldn’t have to release this with every
> >         Solr release, as the overlap is limited to SolrJ interactions.
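One way to picture the "abstraction plus default implementations" idea from the list above is the sketch below. It is only an illustration under stated assumptions: `MirrorQueue`, `InMemoryMirrorQueue` and `mirror` are invented names, not part of Solr, SolrJ or the proposed design, and the in-memory queue merely stands in for a real messaging system such as Kafka.

```python
from abc import ABC, abstractmethod
from collections import deque


class MirrorQueue(ABC):
    """Hypothetical abstraction over the messaging system in the middle."""

    @abstractmethod
    def publish(self, update):
        """Hand an update to the messaging system."""

    @abstractmethod
    def poll(self):
        """Return the next update, or None if nothing is waiting."""


class InMemoryMirrorQueue(MirrorQueue):
    """Toy default implementation; a Kafka-backed one would replace it."""

    def __init__(self):
        self._updates = deque()

    def publish(self, update):
        self._updates.append(update)

    def poll(self):
        return self._updates.popleft() if self._updates else None


def mirror(queue, apply_update):
    """Consumer side: drain the queue, applying each update to the remote.

    `apply_update` is any callable; in practice it would wrap a client
    call against the remote data center.
    """
    applied = 0
    while True:
        update = queue.poll()
        if update is None:
            return applied
        apply_update(update)
        applied += 1
```

Because `mirror` depends only on the `MirrorQueue` interface, swapping the messaging system means adding another subclass rather than touching the producer or consumer code.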
> >
> > I’ll share a more detailed and evolving design document soon for
> > everyone to contribute to, but I wanted to share this now as I’m
> > starting to work on it and want to avoid parallel efforts towards the
> > same end goal.
> >
> > --
> > Anshum Gupta
>
>
> --
Marcus Eagan
