I must chime in to clarify something: in case 2, would the source cluster 
eventually start a log reader on its own? That is, would CDCR heal over 
time, or would manual action be required?

-----Original Message-----
From: Renaud Delbru [mailto:renaud@siren.solutions] 
Sent: Tuesday, June 14, 2016 4:51 AM
To: solr-user@lucene.apache.org
Subject: Re: Regarding CDCR SOLR 6

Hi Bharath,

The buffer is useful when you need to buffer updates on the source cluster 
before starting cdcr, if the source cluster might receive updates in the 
meantime and you want to be sure not to miss them.

To understand this better, you need to understand how cdcr cleans transaction 
logs. When cdcr is started (with the START action), it instantiates a log reader 
for each target cluster. The position of each log reader tells cdcr which 
transaction logs it can clean. If all the log readers are beyond a certain 
point, then cdcr can clean all the transaction logs up to that point.
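
As a rough illustration of that rule (a toy model only, not Solr's actual code; the function name, the integer log ids and the cluster names are all made up for the example), the logs cdcr may clean are those that the slowest log reader has already passed:

```python
def logs_cdcr_can_clean(tlog_ids, reader_positions):
    """Return the transaction log ids that every log reader has moved past.

    tlog_ids: ids of the transaction logs on disk, oldest first (hypothetical).
    reader_positions: maps each target cluster to the id of the log that its
    log reader is currently reading (hypothetical).
    """
    if not reader_positions:
        # No log reader exists, so cdcr holds nothing back in this toy model;
        # the normal update log cleaning procedure decides what is kept.
        return list(tlog_ids)
    # The slowest reader bounds what can be cleaned for all targets.
    slowest = min(reader_positions.values())
    return [tlog for tlog in tlog_ids if tlog < slowest]


if __name__ == "__main__":
    print(logs_cdcr_can_clean([1, 2, 3, 4, 5], {"dc2": 4, "dc3": 3}))
```

With one reader at log 4 and another at log 3, only logs 1 and 2 are behind every reader, so only those are eligible for cleaning.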

However, there might be cases when the source cluster will be up without any 
log readers instantiated:
1) The source cluster is started, but cdcr is not started yet
2) The source cluster is started, cdcr is started, but the target cluster was 
not accessible when cdcr was started. In this case, cdcr will not be able to 
instantiate a log reader for this cluster.

In these two scenarios, if updates are received by the source cluster, then 
they might be cleaned out from the transaction log as per the normal update log 
cleaning procedure.
That is where the buffer becomes useful. When you know that, while starting up 
your clusters and cdcr, you will be in one of these two scenarios, you can 
activate the buffer to be sure not to miss updates. Then, once the source and 
target clusters are properly up and cdcr replication is properly started, you 
can turn off this buffer.
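
A minimal sketch of that sequence, assuming the cdcr request handler is registered at /cdcr on the collection and accepts the ENABLEBUFFER, START and DISABLEBUFFER actions. The host, port and collection name are placeholders, and the helper functions are made up for the example. It only builds the URLs and sends nothing:

```python
from urllib.parse import urlencode


def cdcr_url(base_url, collection, action):
    """Build the URL for one cdcr action on a collection of the source cluster."""
    query = urlencode({"action": action})
    return f"{base_url.rstrip('/')}/{collection}/cdcr?{query}"


def startup_sequence(base_url, collection):
    """Return the cdcr calls in the order described above.

    1. ENABLEBUFFER: keep updates while no log reader exists.
    2. START: start replication, which instantiates the log readers.
    3. DISABLEBUFFER: only once replication is confirmed to be running,
       otherwise the transaction logs keep growing.
    """
    actions = ("ENABLEBUFFER", "START", "DISABLEBUFFER")
    return [cdcr_url(base_url, collection, action) for action in actions]


if __name__ == "__main__":
    # Placeholder host and collection name; adjust to your own setup.
    for url in startup_sequence("http://localhost:8983/solr", "mycollection"):
        print(url)
```

Each URL would be issued as a plain HTTP GET against the source cluster, checking action=STATUS between the steps before turning the buffer off.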

--
Renaud Delbru

On 14/06/16 06:41, Bharath Kumar wrote:
> Hi,
>
> I have set up cross data center replication using Solr 6. I want to 
> know why the buffer needs to be enabled on the source cluster. Even if 
> the buffer is not enabled, I am able to replicate the data between 
> source and target sites. What are the advantages of enabling the buffer 
> on the source site? If I enable the buffer, the transaction logs are 
> never deleted, and over a period of time we are running out of disk. 
> Can you please let me know why enabling the buffer is required?
>
