I am guessing your Enterprise system deletes/updates tables in the RDBMS, and your SOLR instance indexes that data. In addition, your front end interacts with both SOLR and the RDBMS. At the front-end level, when a search sent to SOLR returns primary keys, you can check your database for those primary keys before committing the output to end users.
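That front-end check can be sketched roughly as follows. This is an illustrative sketch, not real SOLR or database API code: `filter_stale_hits` and the PK collections are assumed names, standing in for the keys returned by a SOLR query and a lookup against the live table.

```python
def filter_stale_hits(solr_pks, live_db_pks):
    """Keep only the SOLR hits whose primary keys still exist in the RDBMS.

    solr_pks:     primary keys returned by the SOLR search, in rank order
    live_db_pks:  primary keys confirmed present in the database
    """
    live = set(live_db_pks)  # O(1) membership checks
    return [pk for pk in solr_pks if pk in live]

# Example: SOLR still returns PK 7, but that row was deleted from the table.
print(filter_stale_hits([3, 7, 12], [3, 12, 99]))
```

In practice the `live_db_pks` set would come from a single `SELECT ... WHERE pk IN (...)` against the page of keys SOLR returned, so the check costs one indexed query per search.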

To remove records from an index... the best-performing setup is Master-Slave SOLR instances: remove data from the Master SOLR, then commit/synchronize with the Slave nightly (when traffic is lowest). SOLR won't be in sync with the database between syncs, but you can always retrieve PKs from SOLR, check the database for those PKs, and 'filter' the output...
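Since no delete history is kept, the nightly job on the Master could compute deletions by diffing the two key sets. A minimal sketch, assuming you can enumerate both the table's primary keys and the IDs stored in the index (`pks_to_delete` is a hypothetical helper, not a SOLR API):

```python
def pks_to_delete(index_pks, db_pks):
    """Primary keys still present in the SOLR index but gone from the table."""
    return sorted(set(index_pks) - set(db_pks))

# Rows 2 and 4 were deleted from the table; the index still has them.
print(pks_to_delete([1, 2, 3, 4], [1, 3]))
```

Each returned PK would then be removed from the Master (one delete per ID, or a single delete-by-query), followed by a commit that the Slave picks up on the nightly sync.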

--
Thanks,

Fuad Efendi
416-993-2060(cell)
Tokenizer Inc.
==============
http://www.linkedin.com/in/liferay


Quoting sundar shankar <[EMAIL PROTECTED]>:

Hi,
We have an index of courses (about 4 million docs in prod) and a nightly job that picks up newly added courses and updates the index accordingly. There is another Enterprise system that shares the same table and could delete data from it too.

I just want to know what the best practice would be to find deleted records and remove them from my index. Unfortunately for us, we don't maintain a history of the deleted records, and that's a big bane.

Please do advise on what might be the best way to implement this.

-Sundar

