[
https://issues.apache.org/jira/browse/SOLR-11652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265208#comment-16265208
]
Amrit Sarkar commented on SOLR-11652:
-------------------------------------
I had a chance to chat with [~erickerickson] and [~varunthacker] to discuss the
significance of "buffering" in CDC replication.
Motivation for buffering in CDCR, as listed on SOLR-11069 by Renaud:
_The original goal of the buffer on cdcr is to indeed keep indefinitely the
tlogs until the buffer is deactivated
(https://lucene.apache.org/solr/guide/7_1/cross-data-center-replication-cdcr.html#the-buffer-element).
This was useful for example during maintenance operations, to ensure that the
source cluster will keep all the tlogs until the target cluster is properly
initialised. In this scenario, one will activate the buffer on the source. The
source will start to store all the tlogs (and does not purge them). Once the
target cluster is initialised, and has registered a tlog pointer on the source,
one can deactivate the buffer on the source and the tlogs will start to be
purged once they are read by the target cluster._
What I understood looking at the code besides what Renaud explained:
_The buffer is always enabled on non-leader nodes of the source. In the source
DC, sync between leaders and followers is maintained by the buffer. If the
leader goes down and another node takes over, it uses bufferLog to determine
the current version point._
Essentially, buffering was introduced to remind the source that no updates have
been sent over, because the target is not ready or CDCR has not been started.
The LastProcessedVersion for the source is -1 when the buffer is enabled,
indicating that no updates have been forwarded and that it has to keep track of
all tlogs. Once the buffer is disabled, it starts to show the correct version
that has been replicated to the target.
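To make the effect of that -1 value concrete, here is a toy model of the purge
decision. It is only an illustration of the behaviour described above, not
Solr's actual code; the function names and the idea of tracking one "newest
version" per tlog are assumptions made for the sketch.

```python
from typing import List

# Value the source reports as LastProcessedVersion while the buffer is enabled.
BUFFERING_SENTINEL = -1


def reported_last_processed_version(buffer_enabled: bool,
                                    replicated_version: int) -> int:
    """Hypothetical helper: what the source reports for a given buffer state."""
    return BUFFERING_SENTINEL if buffer_enabled else replicated_version


def purgeable_tlogs(tlog_newest_versions: List[int],
                    last_processed: int) -> List[int]:
    """Return the tlogs (identified by their newest update version) that
    could be purged because the target has already read past them."""
    if last_processed == BUFFERING_SENTINEL:
        # Nothing is considered forwarded, so every tlog must be retained.
        return []
    return [v for v in tlog_newest_versions if v <= last_processed]


tlogs = [100, 200, 300]
print(purgeable_tlogs(tlogs, reported_last_processed_version(True, 200)))
print(purgeable_tlogs(tlogs, reported_last_processed_version(False, 200)))
```

With the buffer enabled nothing is eligible for purging, however far the target
has actually read; with it disabled, only tlogs at or below the replicated
version are eligible.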
In Solr 6.2, bootstrapping was introduced, which takes care of the above
use case well, i.e. the source is up and running and has already received a
bunch of updates / documents, and either we have not started CDCR or the target
has not been available until now. Whenever CDC replication is started
(action=START invoked), Bootstrap is called implicitly, which copies the entire
index folder (not tlogs) to the target. This is much faster and more effective
than the earlier setup, where all the updates from the beginning were sent to
the target linearly, in batches of the size defined in the cdcr config. This
earlier setup was achieved by buffering (the tlogs from the beginning).
Today, if we look at the current CDCR documentation page, buffering is
"disabled" by default in both source and target. We don't see any purpose
served by CDCR buffering, and it is quite an overhead considering it can take a
lot of heap space (tlog pointers) and retains tlogs on disk forever when
enabled. Also, today, even if we disable the buffer from the API on the source
after it was enabled at startup, tlogs are never purged on the leader nodes of
the source shards; refer to this Jira: SOLR-11652
We propose to make the default Buffer state "DISABLED" in the code
(CdcrBufferManager) and deprecate its APIs (ENABLE / DISABLE buffer). It will
still run implicitly for non-leader nodes on the source, and no user
intervention is required whatsoever.
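For reference, a minimal sketch of how the CDCR API calls mentioned in this
comment (START, DISABLEBUFFER, STATUS) can be issued over HTTP. The base URL
and the collection name "source_col" are placeholders, and the helper names
are hypothetical; only the /cdcr handler path and the action names come from
the CDCR documentation.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def cdcr_url(base_url: str, collection: str, action: str) -> str:
    """Build the URL for a CDCR API action on a collection."""
    query = urlencode({"action": action, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/cdcr?{query}"


def call_cdcr(base_url: str, collection: str, action: str,
              timeout: float = 10.0) -> str:
    """Send the request and return the raw response body.
    Needs a reachable Solr node; it is not called in this sketch."""
    with urlopen(cdcr_url(base_url, collection, action), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Placeholder host and collection; adjust to the actual source cluster.
for action in ("START", "DISABLEBUFFER", "STATUS"):
    print(cdcr_url("http://localhost:8983/solr", "source_col", action))
```

If the buffer APIs are deprecated as proposed, only the START and STATUS style
calls would remain relevant for users.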
> Cdcr TLogs doesn't get purged for Source collection Leader when Buffer is
> disabled from CDCR API
> ------------------------------------------------------------------------------------------------
>
> Key: SOLR-11652
> URL: https://issues.apache.org/jira/browse/SOLR-11652
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Amrit Sarkar
>
> CDCR transaction logs never get purged on the leader when the buffer is
> DISABLED from the CDCR API.
> Steps to reproduce:
> 1. Set up source and target collection clusters and START CDCR, with BUFFER
> ENABLED.
> 2. Index a bunch of documents into the source; make sure a decent number of
> tlogs (>20) have been generated.
> 3. Disable BUFFER via the API on the source and keep indexing.
> 4. Tlogs start to get purged on the follower nodes of the source, but the
> leader keeps accumulating them forever.
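One way to observe steps 2 and 4 is to count the tlog files in a core's tlog
directory on each node while indexing continues. The sketch below assumes the
files live in a "tlog" directory under the core's data directory and are named
with a "tlog." prefix; the path shown is a placeholder and the helper name is
hypothetical.

```python
from pathlib import Path


def count_tlogs(tlog_dir: str) -> int:
    """Count transaction log files (names starting with 'tlog.') in tlog_dir."""
    return sum(1 for p in Path(tlog_dir).glob("tlog.*") if p.is_file())


if __name__ == "__main__":
    # Placeholder path; point this at the tlog directory of a leader or
    # follower core on the source cluster.
    example_dir = "/var/solr/data/source_col_shard1_replica1/data/tlog"
    if Path(example_dir).is_dir():
        print(count_tlogs(example_dir))
```

Sampled periodically, the count should level off or drop on followers after
the buffer is disabled, while on the leader it keeps growing if the bug is
present.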