It is not possible for DIH to query Solr for the indexed keys and work out
which items have been deleted from the DB.

You must either maintain the IDs of deleted rows in the DB or flag rows as
deleted instead of removing them.
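
A minimal sketch of the first option, assuming a hypothetical TEST_DELETED
table (columns URI and DELETED_TIME) that your application or a DB trigger
fills whenever a row is removed from TEST. Those table and column names are
made up for illustration; only deletedPkQuery and
${dataimporter.last_index_time} come from DIH. Keep your existing deltaQuery
alongside it:

```xml
<entity
    name="test_item"
    pk="URI"
    query="select URI,CONTENT from TEST"
    deletedPkQuery="select URI from TEST_DELETED
                    where DELETED_TIME &gt; TO_DATE('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"
>
  <!-- TEST_DELETED and DELETED_TIME are hypothetical; adapt to your schema. -->
  <!-- With a (hypothetical) flag column instead, the query could be:
       select URI from TEST where DELETED = 1 -->
  <field column="URI" name="uri"/>
  <field column="CONTENT" name="content"/>
</entity>
```

On a delta-import, DIH should run deletedPkQuery and remove from the index the
docs whose keys it returns.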

--Noble



On Wed, Mar 18, 2009 at 2:46 PM, Giovanni De Stefano
<giovanni.destef...@gmail.com> wrote:
> Hello Paul,
>
> thank you for your reply.
>
> The UPDATE in fact works fine: I only had to update the CREATION_TIME on the
> DB :-)
>
> Regarding the deletedPkQuery, I understand it has to return the primary keys
> that should be removed from the index (because they have been removed from
> the DB) but I don't have any "deleted" flag on the DB.
>
> Basically the deletedPkQuery should be something like "select URI *
> from_the_current_index* where URI is not in (select URI from TEST)"
>
> That is, it would return the subset of primary keys that are currently in the
> index but are no longer in the DB. Is this possible?
>
> I am no DB expert...so ANY tip is very welcome!
>
> Thanks,
> Giovanni
>
>
> On 3/18/09, Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@gmail.com> wrote:
>>
>> Are you sure your schema.xml has a <uniqueKey> field? It is needed to
>> UPDATE docs.
>>
>> To remove deleted docs, you must have a deletedPkQuery attribute on the root
>> entity.
>>
>> On Tue, Mar 17, 2009 at 8:48 PM, Giovanni De Stefano
>> <giovanni.destef...@gmail.com> wrote:
>> > Hello all,
>> >
>> > I have a table TEST in an Oracle DB with the following columns: URI
>> > (varchar), CONTENT (varchar), CREATION_TIME (date).
>> >
>> > The primary key both in the DB and Solr is URI.
>> >
>> > Here is my data-config.xml:
>> >
>> > <dataConfig>
>> >  <dataSource
>> >    driver="oracle.jdbc.driver.OracleDriver"
>> >    url="jdbc:oracle:thin:@localhost:1521/XE"
>> >    user="username"
>> >    password="password"
>> >  />
>> >  <document name="Test">
>> >    <entity
>> >        name="test_item"
>> >        pk="URI"
>> >        query="select URI,CONTENT from TEST"
>> > *        deltaQuery="select URI,CONTENT from TEST where
>> > TO_CHAR(CREATION_TIME,'YYYY-MM-DD HH:MI:SS') >
>> > '${dataimporter.last_index_time}'" *
>> >    >
>> >      <field column="URI" name="uri"/>
>> >      <field column="CONTENT" name="content"/>
>> >    </entity>
>> >  </document>
>> > </dataConfig>
>> >
>> > The problem is that anytime I perform a delta-import, the index keeps being
>> > populated as if new documents were added. In other words, I am not able to
>> > UPDATE an existing document or REMOVE a document that is no longer in the
>> > DB.
>> >
>> > What am I missing? How should I specify my deltaQuery?
>> >
>> > Thanks a lot in advance!
>> >
>> > Giovanni
>> >
>>
>>
>>
>> --
>> --Noble Paul
>>
>



-- 
--Noble Paul
