I am trying to set up a once-daily cron job on the master node to delete old 
documents from the index based on our retention policy.

Each run deletes only one day's worth of data based on my schema, which is a 
couple of thousand docs and no more. This is a test cluster, and the doc 
counts and index size are not very high:

    Num Docs: 515727; Max Doc: 591322; Heap Memory Usage: -1; Deleted Docs: 75595

    Index                 Version        Gen    Size
    Master (Searching)    1548694802284  51396  969.28 MB
    Master (Replicable)   1548694802284  51396  -
    Slave (Searching)     1548694802284  51396  969.28 MB

Sometimes I notice that replication hangs. There are no errors, but the slave 
is trying to download a segments_* file (e.g. segments_1bnx7) and just sits 
there. Nothing is logged. Once it reaches this state I am unable to stop 
replication using abortfetch. Disabling polling works (the poll interval is 
set to 60 seconds), but that doesn't help. The only thing that helps is a 
service solr stop/start. Then the next poll works, and the slave's 
version/gen/size/doc count/deleted count match the master's.
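
The commands mentioned above (abortfetch, disabling polling) map to HTTP calls 
on the replication handler. A minimal sketch of how they can be issued and how 
the slave's state can be inspected while it is stuck, assuming a slave at 
http://localhost:8983/solr and a core named mycore (both are placeholders, not 
values from this cluster):

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to the real slave URL and core name.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
CORE="${CORE:-mycore}"

# Build the replication handler URL for a given command name.
replication_url() {
    printf '%s/%s/replication?command=%s&wt=json' "$SOLR_URL" "$CORE" "$1"
}

# Send one replication command; --max-time keeps the call itself from
# blocking forever if the handler does not answer.
replication_cmd() {
    curl -sS --max-time 30 "$(replication_url "$1")"
}

# Typical calls against the slave:
#   replication_cmd details       # current fetch status, files being downloaded
#   replication_cmd abortfetch    # ask the slave to abort the running fetch
#   replication_cmd disablepoll   # stop polling the master
#   replication_cmd enablepoll    # resume polling
```

Capturing the output of the details command during the hung state would show 
which file the slave believes it is still fetching.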

Not every delete cron execution hangs. The segments_* file being downloaded 
during the “hung” state is no longer available on the master; the master has 
already created a new segments_* file.

The cron job basically does this (min and max are a day range):
    DELETE="\"started:[${MINDATE} TO ${MAXDELDATE}]\""
    /opt/solr/bin/post -c <corename> -type application/json -out yes \
        -commit yes -d {delete:{query:"$DELETE"}}
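
For comparison, the same delete-by-query can be sent straight to the core's 
update handler with curl. This is only a sketch of an equivalent request, not 
what the cron job currently runs; the URL and core name are placeholders, and 
the helper names delete_payload and purge_range are made up for illustration:

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to the real master URL and core name.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
CORE="${CORE:-mycore}"

# Build the JSON body for a delete-by-query over a range on the "started" field.
# $1 = lower bound, $2 = upper bound (e.g. the MINDATE and MAXDELDATE values).
delete_payload() {
    printf '{"delete":{"query":"started:[%s TO %s]"}}' "$1" "$2"
}

# Post the delete to the update handler and commit in the same request.
purge_range() {
    curl -sS --max-time 300 \
        -H 'Content-Type: application/json' \
        --data-binary "$(delete_payload "$1" "$2")" \
        "$SOLR_URL/$CORE/update?commit=true"
}

# Usage from the cron script:
#   purge_range "$MINDATE" "$MAXDELDATE"
```

Building the body in one place makes it easy to echo the exact JSON that is 
sent when a run needs to be debugged.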

Any ideas?

Thanks.
Ravi
