Hi all,

We recently upgraded to Solr 7.2.0 because it included CDCR bug fixes and
features that would finally let us make use of it (bi-directional syncing
was the big one). The first time we tried to implement CDCR we ran into all
kinds of errors, but this time we were able to get it mostly working.

The issue we are having now is that any time a document is deleted via
deleteById from a collection on the primary node, we are flooded with
"Invalid Number" errors followed by a random sequence of characters when
CDCR tries to sync the update to the backup site. This happens on all of
our collections whose id fields are defined as longs (in some of our
collections the ids are compound keys and are strings).

Here's a sample exception:

org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
from server at http://ip/solr/collection_shard1_replica_n1: Invalid
Number:  ]
-s
        at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:549)
        at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1012)
        at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:883)
        at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
        at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
        at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
        at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
        at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:945)
        at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:816)
        at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
        at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
        at org.apache.solr.handler.CdcrReplicator.sendRequest(CdcrReplicator.java:140)
        at org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:104)
        at org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


I'm scratching my head as to the cause of this. It looks as if CDCR is
trying to deleteById with the value "]", even though that is not the ID of
the document that was deleted from the primary. So I don't know if it is
pulling this from the wrong field somehow, or where that value is coming
from.
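
As a plain-Java illustration of why a long-typed id would reject a value
like "]" (this is only a sketch, not Solr's actual parsing code; the class
and method names are hypothetical):

```java
public class IdCheck {
    // Hypothetical helper: returns true if the raw id parses as a long,
    // false if parsing fails (the analogue of an "Invalid Number" error).
    static boolean isValidLongId(String rawId) {
        try {
            Long.parseLong(rawId);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidLongId("12345")); // a normal numeric id
        System.out.println(isValidLongId("]"));     // the value from the error
    }
}
```

A string-typed id would have no such parse step, so the same garbage value
would not necessarily produce an error there.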

I found this issue: https://issues.apache.org/jira/browse/SOLR-9394 which
looks related, but it doesn't appear to have any traction.

Has anyone else experienced this with CDCR, or does anyone have ideas as to
what could be causing it?

Thanks,

Chris
