* Solr 7.7, SolrCloud with CDCR
* 3 replicas, 1 shard on both production and disaster recovery
Hi,

Last week I posted a question about tlogs: https://lucene.472066.n3.nabble.com/tlogs-are-not-deleted-td4451323.html#a4451430

I disabled the buffer based on the advice there, but the tlogs on "production" are still not being deleted (the tlogs on the "disaster recovery" nodes are cleaned up fine).

There is also another issue, which I suspect is related to the problem I posted about. The "disaster recovery" nodes are producing an enormous volume of logs: the log files grow at an incredibly fast rate with the messages below, and CPU usage on those nodes sits at 100% all day, every day (CPU usage on the "production" nodes is normal). It looks as if replication from production to disaster recovery is running, but it never finishes.

Is this high CPU usage on the disaster recovery nodes normal? And could the tlogs that are not being cleaned up on the production nodes be related to the high CPU usage on the DR nodes?

*<sample messages from the flood of logs on the disaster recovery nodes>*

2019-10-28 18:25:09.817 INFO (qtp404214852-90778) [c:test_collection s:shard1 r:core_node3 x:test_collection_shard1_replica_n1] o.a.s.c.S.Request [test_collection1_shard1_replica_n1] webapp=/solr path=/cdcr params={action=LASTPROCESSEDVERSION&wt=javabin&version=2} status=0 QTime=0
2019-10-28 18:25:09.817 INFO (qtp404214852-90778) [c:test_collection s:shard1 r:core_node3 x:test_collection_shard1_replica_n1] o.a.s.c.S.Request [test_collection2_shard1_replica_n1] webapp=/solr path=/cdcr params={action=LASTPROCESSEDVERSION&wt=javabin&version=2} status=0 QTime=0
2019-10-28 18:25:09.817 INFO (qtp404214852-90778) [c:test_collection s:shard1 r:core_node3 x:test_collection_shard1_replica_n1] o.a.s.c.S.Request [test_collection3_shard1_replica_n1] webapp=/solr path=/cdcr params={action=LASTPROCESSEDVERSION&wt=javabin&version=2} status=0 QTime=0
2019-10-28 18:18:11.729 INFO (cdcr-replicator-378-thread-1) [ ] o.a.s.h.CdcrReplicator Forwarded 0 updates to target test_collection1
2019-10-28 18:18:11.730 INFO (cdcr-replicator-282-thread-1) [ ] o.a.s.h.CdcrReplicator Forwarded 0 updates to target test_collection2
2019-10-28 18:18:11.730 INFO (cdcr-replicator-332-thread-1) [ ] o.a.s.h.CdcrReplicator Forwarded 0 updates to target test_collection3
...

*In the middle of the logs, I also see the following exception for some of the collections:*

2019-10-28 18:18:11.732 WARN (cdcr-replicator-404-thread-1) [ ] o.a.s.h.CdcrReplicator Failed to forward update request to target: collection_steps
java.lang.ClassCastException: java.lang.Long cannot be cast to java.util.List
    at org.apache.solr.update.CdcrUpdateLog$CdcrLogReader.getVersion(CdcrUpdateLog.java:732) ~[solr-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:46]
    at org.apache.solr.update.CdcrUpdateLog$CdcrLogReader.next(CdcrUpdateLog.java:635) ~[solr-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:46]
    at org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:77) ~[solr-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:46]
    at org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81) ~[solr-core-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:46]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) ~[solr-solrj-7.7.0.jar:7.7.0 8c831daf4eb41153c25ddb152501ab5bae3ea3d5 - jimczi - 2019-02-04 23:23:50]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
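For completeness, this is a small sketch of the CDCR API calls I use to double-check the buffer state and the per-target queues on the source (production) cluster. The host and collection name are placeholders for our setup, and the script only prints the curl commands rather than running them:

```shell
#!/bin/sh
# Placeholders: substitute your production (source) cluster host and
# the CDCR-enabled collection name.
SOLR_HOST="localhost:8983"
COLLECTION="test_collection"
CDCR_URL="http://$SOLR_HOST/solr/$COLLECTION/cdcr"

# 1. Check CDCR state on the SOURCE; the response should show the
#    buffer as "disabled" (buffer state is per cluster, so disabling
#    it only on the target does not free tlogs on the source).
echo "curl '$CDCR_URL?action=STATUS'"

# 2. Disable the buffer on the source if STATUS still reports it enabled.
echo "curl '$CDCR_URL?action=DISABLEBUFFER'"

# 3. Inspect the per-target queue sizes; as I understand it, the source
#    keeps tlogs until queued updates are forwarded, so a queue that
#    never drains pins the tlogs.
echo "curl '$CDCR_URL?action=QUEUES'"
```

STATUS, DISABLEBUFFER, and QUEUES are standard actions of the CDCR request handler in Solr 7.x; the commands must be run against the source cluster, not just the target.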