[ https://issues.apache.org/jira/browse/SOLR-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662607#comment-16662607 ]
Varun Thacker commented on SOLR-12057:
--------------------------------------

Hi Amrit,

Thanks for the patch! Here's some feedback on just the test case:
* CdcrWithDiffReplicaTypesTest -> CdcrReplicaTypeTest - maybe this is enough to convey the test's intention?
* Some unused imports would need to be removed.
* Any reason we're hardcoding StandardDirectoryFactory instead of letting the test framework pick one?
* After CdcrTestsUtil.cdcrStart(cluster1SolrClient); do we need to sleep for 2 seconds? Looking at the usages of cdcrStart, some have a 2s sleep and some don't.
* Can we simplify the variable naming in this loop? It's adding a batch of docs, right? "docs" is essentially how many batches of 100 docs we will index - maybe numBatches?
{code:java}
int docs = (TEST_NIGHTLY ? 100 : 10);
int numDocs_c1 = 0;
for (int k = 0; k < docs; k++) {
  req = new UpdateRequest();
  for (; numDocs_c1 < (k + 1) * 100; numDocs_c1++) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "cluster1_" + numDocs_c1);
    doc.addField("xyz", numDocs_c1);
    req.add(doc);
  }
  req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
  log.info("Adding " + docs + " docs with commit=true, numDocs=" + numDocs_c1);
  req.process(cluster1SolrClient);
}
{code}
* It would be really cool if we pulled the meat of the test into a separate method that takes two cloud solr client objects (one for each cluster). That way we could test all 3 replica types in the same place by calling this method. Perhaps consolidate CdcrBidirectionalTest as well?
* I really like how this test checks all operations to make sure they work correctly. Perhaps we could expand it to add a parent-child document and an in-place update as well?
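To make the naming suggestion above concrete, here is a hedged sketch (not part of the attached patch) with numBatches replacing docs; makeBatchIds is a hypothetical helper introduced only for illustration, and the SolrJ indexing calls are reduced to comments so the skeleton compiles on its own:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchNaming {
    static final int BATCH_SIZE = 100;

    // Hypothetical helper (not in the patch): the doc ids that make up batch k.
    static List<String> makeBatchIds(String prefix, int batch) {
        List<String> ids = new ArrayList<>();
        for (int i = batch * BATCH_SIZE; i < (batch + 1) * BATCH_SIZE; i++) {
            ids.add(prefix + i);
        }
        return ids;
    }

    public static void main(String[] args) {
        int numBatches = 10; // the test uses TEST_NIGHTLY ? 100 : 10
        int numDocs = 0;
        for (int k = 0; k < numBatches; k++) {
            for (String id : makeBatchIds("cluster1_", k)) {
                // In the real test each id becomes a SolrInputDocument added to an
                // UpdateRequest, which is then committed against cluster1SolrClient.
                numDocs++;
            }
        }
        System.out.println(numDocs); // 10 batches of 100 docs
    }
}
```

With this shape, the extracted method suggested above could take the two cloud solr client instances plus the batch count as parameters, so one body could cover NRT, TLOG, and PULL runs.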
> CDCR does not replicate to Collections with TLOG Replicas
> ---------------------------------------------------------
>
>                 Key: SOLR-12057
>                 URL: https://issues.apache.org/jira/browse/SOLR-12057
>             Project: Solr
>          Issue Type: Bug
>   Security Level: Public (Default Security Level. Issues are Public)
>       Components: CDCR
> Affects Versions: 7.2
>         Reporter: Webster Homer
>         Priority: Major
>      Attachments: SOLR-12057.patch, SOLR-12057.patch, SOLR-12057.patch, SOLR-12057.patch, cdcr-fail-with-tlog-pull.patch, cdcr-fail-with-tlog-pull.patch
>
> We created a collection using TLOG replicas in our QA clouds.
> We have a locally hosted SolrCloud with 2 nodes; all our collections have 2 shards. We use CDCR to replicate the collections from this environment to 2 data centers hosted in Google Cloud. This works fairly well for our collections with NRT replicas, but the new TLOG collection has problems.
> The Google Cloud Solr clusters have 4 nodes each (3 separate ZooKeepers), with 2 shards per collection and 2 replicas per shard.
> We never see data show up in the cloud collections, but we do see tlog files show up on the cloud servers. I can see that all of the servers have cdcr started and buffers disabled.
> The cdcr source configuration is:
>
> "requestHandler":{"/cdcr":{
>     "name":"/cdcr",
>     "class":"solr.CdcrRequestHandler",
>     "replica":[
>       {
>         "zkHost":"xxx-mzk01.sial.com:2181,xxx-mzk02.sial.com:2181,xxx-mzk03.sial.com:2181/solr",
>         "source":"b2b-catalog-material-180124T",
>         "target":"b2b-catalog-material-180124T"},
>       {
>         "zkHost":"yyyy-mzk01.sial.com:2181,yyyy-mzk02.sial.com:2181,yyyy-mzk03.sial.com:2181/solr",
>         "source":"b2b-catalog-material-180124T",
>         "target":"b2b-catalog-material-180124T"}],
>     "replicator":{
>       "threadPoolSize":4,
>       "schedule":500,
>       "batchSize":250},
>     "updateLogSynchronizer":{"schedule":60000}}}}
>
> The target configurations in the 2 clouds are the same:
>
> "requestHandler":{"/cdcr":{
>     "name":"/cdcr",
>     "class":"solr.CdcrRequestHandler",
>     "buffer":{"defaultState":"disabled"}}}
>
> All of our collections have a timestamp field, index_date. In the source collection all the records have a date of 2/28/2018, but the target collections have a latest date of 1/26/2018.
>
> I don't see cdcr errors in the logs, but we use logstash to search them, and we're still perfecting that.
>
> We have a number of similar collections that behave correctly. This is the only collection that is a TLOG collection. It appears that CDCR doesn't support TLOG collections.
>
> It looks like the data is getting to the target servers: I see tlog files with the right timestamps, but looking at the timestamps on the documents in the collection, none of the data appears to have been loaded. In the solr.log I see lots of /cdcr messages with action=LASTPROCESSEDVERSION, action=COLLECTIONCHECKPOINT, and action=SHARDCHECKPOINT.
>
> No errors.
>
> The target collections' autoCommit is set to 60000. I tried sending a commit explicitly; no difference.
> cdcr is uploading data, but no new data appears in the collection.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org