That should be "flag it in a boolean column". --wunder
On 9/25/08 11:51 AM, "Walter Underwood" <[EMAIL PROTECTED]> wrote:

> This will cause the result counts to be wrong and the "deleted" docs
> will stay in the search index forever.
>
> Some approaches for incremental update:
>
> * full sweep garbage collection: fetch every ID in the Solr DB and
>   check whether that exists in the source DB, then delete the ones
>   that don't exist.
>
> * mark for deletion: change the DB to leave the record but flag it
>   as deleted in a boolean row, then delete from Solr all deleted
>   items in the source DB. The items marked for deletion can be
>   deleted from the source DB at a later time.
>
> * indexer scratchpad DB: a database used by the indexing code which
>   shows all the IDs currently in the index, usually with a last modified
>   time. This is similar to the full sweep, but may be much faster with
>   a dedicated DB. This can get arbitrarily fancy. Web spiders work like this.
>
> wunder
>
> On 9/25/08 10:08 AM, "Fuad Efendi" <[EMAIL PROTECTED]> wrote:
>
>> I am guessing your Enterprise system deletes/updates tables in RDBMS,
>> and your SOLR indexes that data. Additionally to that, you have
>> front-end interacting with SOLR and with RDBMS. At front-end level, in
>> case of a search sent to SOLR returning primary keys for data, you may
>> check your database using primary keys returned by SOLR before
>> committing output to end users.
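
A rough sketch of how the first two approaches quoted above might pick the IDs to remove from the index. Everything named here is an assumption for illustration: the `docs` table, its `deleted` column, and the hard-coded `index_ids` list standing in for whatever IDs you fetch from Solr. The actual delete request to Solr is left out.

```python
import sqlite3


def ids_to_delete_full_sweep(index_ids, source_ids):
    """Full sweep: IDs that are in the index but no longer in the source DB."""
    return sorted(set(index_ids) - set(source_ids))


def ids_to_delete_marked(conn):
    """Mark for deletion: IDs whose (hypothetical) `deleted` column is set."""
    rows = conn.execute("SELECT id FROM docs WHERE deleted = 1 ORDER BY id")
    return [row[0] for row in rows]


# Hypothetical source table with a boolean `deleted` column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id TEXT PRIMARY KEY, deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)", [("a", 0), ("b", 1), ("c", 0)]
)

# Stand-in for the list of IDs currently in the search index.
index_ids = ["a", "b", "c", "d"]
source_ids = [row[0] for row in conn.execute("SELECT id FROM docs")]

sweep = ids_to_delete_full_sweep(index_ids, source_ids)  # "d" is gone from the DB
marked = ids_to_delete_marked(conn)                      # "b" is flagged as deleted
```

The full sweep compares every ID on both sides, so it costs more as the index grows. The flag query only touches flagged rows, but it depends on the application never hard-deleting a record before the indexer has seen the flag.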