Using DIH, it is not possible to query Solr to find out which items have been deleted.
You must maintain a list of deleted row IDs in the DB, or just flag the rows as deleted.
--Noble

On Wed, Mar 18, 2009 at 2:46 PM, Giovanni De Stefano
<giovanni.destef...@gmail.com> wrote:
> Hello Paul,
>
> thank you for your reply.
>
> The UPDATE in fact works fine: I only had to update the CREATION_TIME on the
> DB :-)
>
> Regarding the deletedPkQuery, I understand it has to return the primary keys
> that should be removed from the index (because they have been removed from
> the DB) but I don't have any "deleted" flag on the DB.
>
> Basically the deletedPkQuery should be something like "select URI *
> from_the_current_index* where URI is not in (select URI from TEST)"
>
> That is returning a subset of primary keys currently in the index and that
> are not in the DB anymore. Is this possible?
>
> I am no DB expert...so ANY tip is very welcome!
>
> Thanks,
> Giovanni
>
>
> On 3/18/09, Noble Paul നോബിള് नोब्ळ् <noble.p...@gmail.com> wrote:
>>
>> are you sure your schema.xml has a <uniqueKey> field to UPDATE docs.
>>
>> to remove deleted docs you must have deletedPkQuery attribute in the root
>> entity
>>
>> On Tue, Mar 17, 2009 at 8:48 PM, Giovanni De Stefano
>> <giovanni.destef...@gmail.com> wrote:
>> > Hello all,
>> >
>> > I have a table TEST in an Oracle DB with the following columns: URI
>> > (varchar), CONTENT (varchar), CREATION_TIME (date).
>> >
>> > The primary key both in the DB and Solr is URI.
>> >
>> > Here is my data-config.xml:
>> >
>> > <dataConfig>
>> >   <dataSource
>> >     driver="oracle.jdbc.driver.OracleDriver"
>> >     url="jdbc:oracle:thin:@localhost:1521/XE"
>> >     user="username"
>> >     password="password"
>> >   />
>> >   <document name="Test">
>> >     <entity
>> >       name="test_item"
>> >       pk="URI"
>> >       query="select URI,CONTENT from TEST"
>> >       * deltaQuery="select URI,CONTENT from TEST where
>> >       TO_CHAR(CREATION_TIME,'YYYY-MM-DD HH:MI:SS') >
>> >       '${dataimporter.last_index_time}'" *
>> >     >
>> >       <field column="URI" name="uri"/>
>> >       <field column="CONTENT" name="content"/>
>> >     </entity>
>> >   </document>
>> > </dataConfig>
>> >
>> > The problem is that anytime I perform a delta-import, the index keeps being
>> > populated as if new documents were added. In other words, I am not able to
>> > UPDATE an existing document or REMOVE a document that is not anymore in the
>> > DB.
>> >
>> > What am I missing? How should I specify my deltaQuery?
>> >
>> > Thanks a lot in advance!
>> >
>> > Giovanni
>> >
>>
>>
>>
>> --
>> --Noble Paul
>>
>

--
--Noble Paul
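
For reference, a minimal sketch of the "flag them as deleted" option from the reply at the top, applied to the entity from the original post. The DELETED column (0 = live, 1 = flagged) is an assumption: it does not exist in the TEST table as posted, and rows would have to be flagged rather than physically removed for this to work. The only DIH addition is the deletedPkQuery attribute on the root entity, which returns the primary keys to remove from the index.

```xml
<!-- Sketch only. DELETED is a hypothetical NUMBER(1) column added to TEST
     (0 = live, 1 = flagged as deleted). All other names are taken from the
     original post. -->
<entity
    name="test_item"
    pk="URI"
    query="select URI,CONTENT from TEST where DELETED = 0"
    deltaQuery="select URI,CONTENT from TEST where DELETED = 0 and
                TO_CHAR(CREATION_TIME,'YYYY-MM-DD HH24:MI:SS') >
                '${dataimporter.last_index_time}'"
    deletedPkQuery="select URI from TEST where DELETED = 1">
  <!-- Column-to-field mappings are unchanged from the original config. -->
  <field column="URI" name="uri"/>
  <field column="CONTENT" name="content"/>
</entity>
```

The sketch uses HH24 instead of HH in the TO_CHAR mask, because Oracle's HH is a 12-hour field and a string comparison against a 24-hour timestamp would misorder afternoon times; this assumes last_index_time is written in 24-hour form. The other option mentioned in the reply, a separate table of deleted row IDs, would only change deletedPkQuery to select URI from that (equally hypothetical) table.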