Dawg,
I have a similar setup, and this is what works for me. Each document has a 
field that holds a timestamp, and every document added or updated in a run 
gets the identical timestamp value. When the run is complete and the current 
documents have been overwritten, I can easily delete everything that was not 
updated: those documents still carry a timestamp from a previous run.
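
A minimal sketch of that idea in Python, in case it helps. The field name 
run_ts and the helper names are made up for illustration, and I am assuming 
the field is a Solr date field. The code only builds the stamped documents and 
the JSON delete-by-query body; it does not talk to Solr.

```python
import json
from datetime import datetime, timezone


def run_timestamp(now=None):
    """Return one UTC timestamp string, in Solr's date format, for a whole run."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def stamp(docs, ts, field="run_ts"):
    """Return copies of the docs with the run timestamp added.

    `run_ts` is a hypothetical field name, not something from the thread.
    """
    return [dict(doc, **{field: ts}) for doc in docs]


def stale_delete_payload(ts, field="run_ts"):
    """Build a JSON delete-by-query body matching docs older than this run.

    The range uses '[' (inclusive, open lower bound) and '}' (exclusive upper
    bound), so documents stamped with exactly `ts` are kept.
    """
    query = field + ":[* TO " + ts + "}"
    return json.dumps({"delete": {"query": query}})


if __name__ == "__main__":
    ts = run_timestamp()
    print(stamp([{"id": "sku-1"}, {"id": "sku-2"}], ts))
    print(stale_delete_payload(ts))
```

The body would be posted to the core's update handler and followed by a 
commit, but only after the re-index has finished successfully. Documents that 
have no value in the field are not matched by the range, so they would need a 
separate cleanup.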
Cheers -- Rick


On October 31, 2017 7:54:22 AM EDT, "Emir Arnautović" 
<emir.arnauto...@sematext.com> wrote:
>Hi,
>There is a possibility that you ended up with documents with the same
>ID and that you are overwriting documents instead of writing new ones.
>
>In any case, I would suggest you change your approach in case you have
>enough disk space to keep two copies of indices:
>1. use an alias to read data, instead of the index name
>2. index data into a new index
>3. after verification (e.g. a quick check would be the number of docs),
>switch the alias to the new index
>4. keep the old index available in case you need to switch back
>5. before building the next index, delete the one from the previous day
>to free up space.
>
>In case you have updates during the day, you have to account for that
>as well: stop updating while indexing the new index, update both indices
>if you want to be able to switch back at any point, etc.
>
>HTH,
>Emir
>--
>Monitoring - Log Management - Alerting - Anomaly Detection
>Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
>> On 31 Oct 2017, at 11:20, o1webdawg <i...@smartweber.com> wrote:
>> 
>> I have an index with about a million documents.  It is the backend for a
>> shopping cart system.
>> 
>> Sometimes the inventory gets out of sync with Solr and the storefront
>> contains out-of-stock items.
>> 
>> So I set up a scheduled task on the server to run at 12am every morning to
>> delete the entire Solr index.
>> 
>> Then at 12:04am I run another scheduled task to re-index the SQL database
>> containing the inventory.
>> 
>> Well, today I checked it around 4am and only a fraction of the products
>> were in the Solr index.
>> 
>> However, it did not seem to be idle and reloading it showed lots of deleted
>> documents.
>> 
>> 
>> I open up the core and the deletes keep going up, max docs goes up, but the
>> total docs stays the same.
>> 
>> It's really confusing me what is happening at this point and why I am
>> viewing these numbers of docs.
>> 
>> My theory is that the 12am delete is still running 5 hours later at the same
>> time as the re-indexing.
>> 
>> That's the only way I can explain this really odd behavior with my limited
>> knowledge.
>> 
>> Is my theory realistic and could the delete still be running?
>> 
>> 
>> 
>> 
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 
