[ 
https://issues.apache.org/jira/browse/SOLR-11718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323498#comment-16323498
 ] 

Amrit Sarkar commented on SOLR-11718:
-------------------------------------

Updated the patch with Varun's recommendation: {{SOLR-11718-v3.patch}}. It also 
improves the documentation and tests.

One test method, 
{{CdcrReplicationHandlerTest}}::{{testReplicationWithBufferedUpdates}}, is 
currently failing with:

{code}
  [beaster] [00:04:50.322] FAILURE  353s | CdcrReplicationHandlerTest.testReplicationWithBufferedUpdates <<<
  [beaster]    > Throwable #1: java.lang.AssertionError: There are still nodes recoverying - waited for 330 seconds
  [beaster]    >        at __randomizedtesting.SeedInfo.seed([25F2AEF0CD93CBA3:F6FBFEEE88005734]:0)
  [beaster]    >        at org.junit.Assert.fail(Assert.java:93)
  [beaster]    >        at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:185)
  [beaster]    >        at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:140)
  [beaster]    >        at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:135)
  [beaster]    >        at org.apache.solr.cloud.cdcr.BaseCdcrDistributedZkTest.waitForRecoveriesToFinish(BaseCdcrDistributedZkTest.java:522)
  [beaster]    >        at org.apache.solr.cloud.cdcr.BaseCdcrDistributedZkTest.restartServer(BaseCdcrDistributedZkTest.java:563)
  [beaster]    >        at org.apache.solr.cloud.cdcr.CdcrReplicationHandlerTest.testReplicationWithBufferedUpdates(CdcrReplicationHandlerTest.java:228)
{code}

This method tests that when the leader is still receiving updates, a restarted 
follower buffers the updates and replays them while recovering. With buffering 
disabled, the follower stays in recovery and never becomes active, because 
indexing never stops and the follower is always some number of documents behind 
the leader. This is a typical situation in which we would instead wait for 
indexing to complete and then restart the follower, so that it fetches the index 
from the leader and becomes active.
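
To make the reasoning above concrete, here is a rough toy model of the situation. It uses no Solr classes; the class name, method name, parameters, and numbers are all hypothetical and chosen only for illustration. It assumes the follower recovers by repeatedly copying the leader's current index, with no buffering of updates that arrive during the copy.

```java
import java.util.OptionalInt;

/** Toy model only: no Solr classes are used, and every name here is hypothetical. */
public class FollowerRecoveryModel {

    /**
     * Simulates a follower that recovers by copying the leader's index,
     * without buffering updates that arrive during the copy.
     *
     * @param indexingStopsAtTick tick at which the leader stops receiving updates
     *                            (Integer.MAX_VALUE means indexing never stops)
     * @param docsPerTick         documents the leader indexes per tick while indexing
     * @param fetchTicks          ticks that one full index fetch takes
     * @param timeoutTicks        how long the caller is willing to wait
     * @return the tick at which the follower caught up, or empty on timeout
     */
    public static OptionalInt ticksUntilActive(int indexingStopsAtTick, int docsPerTick,
                                               int fetchTicks, int timeoutTicks) {
        long leaderDocs = 1000; // assumed: the leader already holds documents at restart
        int tick = 0;
        while (tick < timeoutTicks) {
            // The follower starts copying whatever the leader has right now.
            long snapshot = leaderDocs;
            for (int i = 0; i < fetchTicks; i++) {
                if (tick < indexingStopsAtTick) {
                    leaderDocs += docsPerTick; // the leader keeps indexing during the fetch
                }
                tick++;
            }
            // After the fetch, the follower holds only the snapshot it copied.
            if (snapshot == leaderDocs) {
                return OptionalInt.of(tick); // caught up: the follower can become active
            }
        }
        return OptionalInt.empty(); // still recovering when the wait ran out
    }
}
```

In this model, a call with indexing that never stops returns an empty result for any timeout, which mirrors the failure above. Once indexing stops, the next full fetch brings the follower level with the leader.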

I am still writing a smarter test for this under the current design, but it 
seems this scenario is no longer valid. Looking forward to your thoughts and 
recommendations.

> Deprecate CDCR Buffer APIs
> --------------------------
>
>                 Key: SOLR-11718
>                 URL: https://issues.apache.org/jira/browse/SOLR-11718
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: CDCR
>    Affects Versions: 7.1
>            Reporter: Amrit Sarkar
>             Fix For: master (8.0), 7.3
>
>         Attachments: SOLR-11718-v3.patch, SOLR-11718.patch, SOLR-11718.patch
>
>
> Kindly see the discussion on SOLR-11652.
> Today, according to the current CDCR documentation page, buffering is 
> "disabled" by default on both source and target. We don't see any purpose 
> served by CDCR buffering, and it is quite an overhead: when enabled, it can 
> take a lot of heap space (tlog pointers) and retains tlogs on disk forever. 
> Also, even if we disable the buffer via the API on the source, if it was 
> enabled at startup, tlogs are never purged on the leader nodes of the 
> source's shards; see SOLR-11652.


