In attempting to stress-test CDCR (running Solr 7.4), I am running into a couple of issues.
One is that the tlog files keep accumulating for some nodes in the CDCR system, particularly for the non-Leader nodes in the Source SolrCloud. No number of hard commits seems to cause any of these tlog files to be released. This can become a problem upon reboot: if there are hundreds of thousands of tlog files, Solr fails to start (complaining that there are too many open files).
The tlogs had been accumulating on all the nodes of the CDCR set of SolrClouds until I added these two lines to the solrconfig.xml file (for testing purposes, using numbers much lower than in the examples):
    <int name="numRecordsToKeep">5</int>
    <int name="maxNumLogsToKeep">2</int>
Since then, it is mostly the non-Leader nodes of the Source SolrCloud which accumulate tlog files (the Target SolrCloud does seem to clean up its tlog files, as does the Leader of the Source SolrCloud).
If I use ADDREPLICAPROP and REBALANCELEADERS to change which node is the Leader, and then start adding more data, the tlogs on the new Leader sometimes go away, but then the old Leader begins accumulating tlog files. I doubt that frequent reassignment of Leadership would be a practical solution.
The other issue is that some records never reach the Target. I have several times attempted to simulate a production environment by running several loops simultaneously, each of which inserts multiple records on each iteration. On several of these runs, I ended up with a dozen records on (both replicas of) the Source which never made it to (either replica of) the Target. The Target has thousands of records which were inserted before the missing records, and thousands which were inserted after them. All of these records, the replicated and the missing, were inserted by curl commands which differed only in the sequential numbers incorporated into the values being inserted.
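
For anyone who wants to reproduce the Source/Target comparison, here is a minimal sketch of how the missing records can be identified. It is not the exact script I ran. It assumes the uniqueKey field is named "id", that each collection is small enough to fetch in a single /select request, and that the base URLs and collection names passed in are placeholders for your own; the function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def select_ids_url(base_url, collection, rows=100000):
    """Build a /select URL that returns only the id field of every document.

    base_url is something like "http://localhost:8983/solr" (placeholder).
    """
    params = urlencode({"q": "*:*", "fl": "id", "rows": rows, "wt": "json"})
    return f"{base_url}/{collection}/select?{params}"


def fetch_ids(base_url, collection):
    """Fetch all ids from one collection (requires a reachable Solr node).

    Assumes the uniqueKey field is called "id".
    """
    with urlopen(select_ids_url(base_url, collection)) as resp:
        docs = json.load(resp)["response"]["docs"]
    return {doc["id"] for doc in docs}


def missing_on_target(source_ids, target_ids):
    """Return the ids present on the Source but absent from the Target, sorted."""
    return sorted(set(source_ids) - set(target_ids))
```

Calling missing_on_target(fetch_ids(source_url, collection), fetch_ids(target_url, collection)) after the insert loops have finished and replication has had time to catch up should list the records that never arrived.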
I also have a question regarding SOLR-13141. The 11/Feb/19 comment says that the fix for Solr 7.3 had a problem, and the header says "Affects Version/s: 7.5, 7.6". Does that indicate that Solr 7.4 is not affected?
Are there any suggestions?
Thanks