[
https://issues.apache.org/jira/browse/SOLR-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662607#comment-16662607
]
Varun Thacker commented on SOLR-12057:
--------------------------------------
Hi Amrit,
Thanks for the patch! Here's some feedback on just the test case:
* CdcrWithDiffReplicaTypesTest -> CdcrReplicaTypeTest - Maybe this is enough
to convey the test intention?
* Some unused imports would need to be removed
* Any reason we're hardcoding StandardDirectoryFactory instead of letting the
test framework pick one?
* After CdcrTestsUtil.cdcrStart(cluster1SolrClient); do we need to sleep for 2
seconds? Looking at the usages of cdcrStart, some are followed by a 2s sleep
and some aren't.
* Can we simplify the variable naming in the loop below? It's adding a batch of
docs on each pass, right? "docs" is essentially how many batches of 100 docs we
will index. Maybe numBatches?
{code:java}
int docs = (TEST_NIGHTLY ? 100 : 10);
int numDocs_c1 = 0;
for (int k = 0; k < docs; k++) {
  req = new UpdateRequest();
  for (; numDocs_c1 < (k + 1) * 100; numDocs_c1++) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "cluster1_" + numDocs_c1);
    doc.addField("xyz", numDocs_c1);
    req.add(doc);
  }
  req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
  log.info("Adding " + docs + " docs with commit=true, numDocs=" + numDocs_c1);
  req.process(cluster1SolrClient);
}{code}
* It would be really cool if we pulled the meat of the test into a separate
method. The method would take two cloud Solr client objects (one for each
cluster). That way we could test all 3 replica types in the same place by
calling this method. Perhaps consolidate CdcrBidirectionalTest as well?
* I really like how this test checks all operations to make sure they work
correctly. Perhaps we could expand it to add a parent-child document and an
in-place update as well?
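
To make the naming and refactoring suggestions above a bit more concrete, here
is a rough, SolrJ-free sketch. Everything in it is an assumption for
illustration: ClusterClient, FakeCluster, buildBatches and
indexAndCountOnTarget are hypothetical names, and the fake cluster simply
forwards batches to its peer instead of going through CDCR. The real test would
pass the two CloudSolrClient objects and send UpdateRequests.

```java
import java.util.ArrayList;
import java.util.List;

class CdcrReplicaTypeSketch {

    static final int BATCH_SIZE = 100;

    /** Hypothetical stand-in for a cluster client; the real test would use CloudSolrClient. */
    interface ClusterClient {
        void index(List<String> ids);
        int numDocs();
    }

    /** In-memory fake that copies every batch to an optional peer (this is NOT real CDCR). */
    static class FakeCluster implements ClusterClient {
        private final List<String> ids = new ArrayList<>();
        private final ClusterClient peer; // null for a target cluster

        FakeCluster(ClusterClient peer) {
            this.peer = peer;
        }

        @Override
        public void index(List<String> batch) {
            ids.addAll(batch);
            if (peer != null) {
                peer.index(batch);
            }
        }

        @Override
        public int numDocs() {
            return ids.size();
        }
    }

    /** Builds numBatches batches of BATCH_SIZE ids each, numbered continuously across batches. */
    static List<List<String>> buildBatches(String idPrefix, int numBatches) {
        List<List<String>> batches = new ArrayList<>();
        int numDocs = 0;
        for (int batch = 0; batch < numBatches; batch++) {
            List<String> ids = new ArrayList<>();
            for (; numDocs < (batch + 1) * BATCH_SIZE; numDocs++) {
                ids.add(idPrefix + numDocs);
            }
            batches.add(ids);
        }
        return batches;
    }

    /** Shared test body: index into the source and return the doc count seen on the target. */
    static int indexAndCountOnTarget(ClusterClient source, ClusterClient target, int numBatches) {
        for (List<String> batch : buildBatches("cluster1_", numBatches)) {
            source.index(batch);
        }
        return target.numDocs();
    }

    public static void main(String[] args) {
        // One call per replica type; the real test would create the collections accordingly.
        for (String replicaType : new String[] {"NRT", "TLOG", "PULL"}) {
            FakeCluster target = new FakeCluster(null);
            FakeCluster source = new FakeCluster(target);
            int replicated = indexAndCountOnTarget(source, target, 10);
            System.out.println(replicaType + ": target has " + replicated + " docs");
        }
    }
}
```

With the body in one method, each replica type becomes a single call with its
own pair of clients, and numBatches makes it clear that the outer loop counts
batches of 100 rather than individual documents.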
> CDCR does not replicate to Collections with TLOG Replicas
> ---------------------------------------------------------
>
> Key: SOLR-12057
> URL: https://issues.apache.org/jira/browse/SOLR-12057
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: CDCR
> Affects Versions: 7.2
> Reporter: Webster Homer
> Priority: Major
> Attachments: SOLR-12057.patch, SOLR-12057.patch, SOLR-12057.patch,
> SOLR-12057.patch, cdcr-fail-with-tlog-pull.patch,
> cdcr-fail-with-tlog-pull.patch
>
>
> We created a collection using TLOG replicas in our QA clouds.
> We have a locally hosted solrcloud with 2 nodes, all our collections have 2
> shards. We use CDCR to replicate the collections from this environment to 2
> data centers hosted in Google cloud. This seems to work fairly well for our
> collections with NRT replicas. However the new TLOG collection has problems.
>
> The google cloud solrclusters have 4 nodes each (3 separate Zookeepers). 2
> shards per collection with 2 replicas per shard.
>
> We never see data show up in the cloud collections, but we do see tlog files
> show up on the cloud servers. I can see that all of the servers have cdcr
> started, buffers are disabled.
> The cdcr source configuration is:
>
> "requestHandler":{"/cdcr":{
>   "name":"/cdcr",
>   "class":"solr.CdcrRequestHandler",
>   "replica":[
>     {
>       "zkHost":"xxx-mzk01.sial.com:2181,xxx-mzk02.sial.com:2181,xxx-mzk03.sial.com:2181/solr",
>       "source":"b2b-catalog-material-180124T",
>       "target":"b2b-catalog-material-180124T"},
>     {
>       "zkHost":"yyyy-mzk01.sial.com:2181,yyyy-mzk02.sial.com:2181,yyyy-mzk03.sial.com:2181/solr",
>       "source":"b2b-catalog-material-180124T",
>       "target":"b2b-catalog-material-180124T"}],
>   "replicator":{
>     "threadPoolSize":4,
>     "schedule":500,
>     "batchSize":250},
>   "updateLogSynchronizer":{"schedule":60000}}}}
>
> The target configurations in the 2 clouds are the same:
> "requestHandler":{"/cdcr":{ "name":"/cdcr",
> "class":"solr.CdcrRequestHandler", "buffer":{"defaultState":"disabled"}}}
>
> All of our collections have a timestamp field, index_date. In the source
> collection all the records have a date of 2/28/2018, but the target
> collections have a latest date of 1/26/2018.
>
> I don't see cdcr errors in the logs, but we use logstash to search them, and
> we're still perfecting that.
>
> We have a number of similar collections that behave correctly. This is the
> only collection that is a TLOG collection. It appears that CDCR doesn't
> support TLOG collections.
>
> It looks like the data is getting to the target servers. I see tlog files
> with the right timestamps. Looking at the timestamps on the documents in the
> collection, none of the data appears to have been loaded. In the solr.log I see
> lots of /cdcr messages: action=LASTPROCESSEDVERSION,
> action=COLLECTIONCHECKPOINT, and action=SHARDCHECKPOINT
>
> no errors
>
> The target collections' autoCommit is set to 60000. I tried sending a commit
> explicitly, but it made no difference. CDCR is uploading data, but no new data
> appears in the collection.
>