Hello Paul,

Thank you for your feedback. I will ask for an expiration date to be added to
the DB and run a process that updates the index accordingly.
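
For the archives, here is a rough sketch of how the entity might look once
such a column exists. This is untested and rests on assumptions:
EXPIRATION_DATE is a hypothetical column name (it is not in TEST today),
expired rows are assumed to stay in the table so that deletedPkQuery can still
select them, and deltaImportQuery is assumed to be available (as far as I know
it requires Solr 1.4 or later). The deltaQuery compares dates with TO_DATE and
a 24-hour format instead of comparing strings.

```xml
<dataConfig>
  <dataSource
    driver="oracle.jdbc.driver.OracleDriver"
    url="jdbc:oracle:thin:@localhost:1521/XE"
    user="username"
    password="password"
  />
  <document name="Test">
    <!-- EXPIRATION_DATE is a hypothetical column used only for illustration. -->
    <!-- deltaQuery returns the keys of rows changed since the last import. -->
    <!-- deltaImportQuery loads the full row for each of those keys. -->
    <!-- deletedPkQuery returns the keys to remove from the index. -->
    <entity
        name="test_item"
        pk="URI"
        query="select URI,CONTENT from TEST"
        deltaQuery="select URI from TEST where CREATION_TIME &gt; TO_DATE('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"
        deltaImportQuery="select URI,CONTENT from TEST where URI = '${dataimporter.delta.URI}'"
        deletedPkQuery="select URI from TEST where EXPIRATION_DATE &lt; SYSDATE"
    >
      <field column="URI" name="uri"/>
      <field column="CONTENT" name="content"/>
    </entity>
  </document>
</dataConfig>
```

With something like this, a delta-import should update changed rows and drop
expired ones, provided uri is declared as the <uniqueKey> in schema.xml.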

Cheers,
Giovanni


On 3/18/09, Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@gmail.com> wrote:
>
> It is not possible to query Solr from DIH to find out which items have
> been deleted.
>
> You must maintain the IDs of the deleted rows in the DB, or just flag the
> rows as deleted.
>
> --Noble
>
>
>
> On Wed, Mar 18, 2009 at 2:46 PM, Giovanni De Stefano
> <giovanni.destef...@gmail.com> wrote:
> > Hello Paul,
> >
> > thank you for your reply.
> >
> > The UPDATE in fact works fine: I only had to update the CREATION_TIME on
> > the DB :-)
> >
> > Regarding the deletedPkQuery, I understand it has to return the primary
> > keys that should be removed from the index (because they have been removed
> > from the DB) but I don't have any "deleted" flag on the DB.
> >
> > Basically the deletedPkQuery should be something like "select URI *
> > from_the_current_index* where URI is not in (select URI from TEST)"
> >
> > That is, it would return the subset of primary keys that are currently in
> > the index but no longer in the DB. Is this possible?
> >
> > I am no DB expert...so ANY tip is very welcome!
> >
> > Thanks,
> > Giovanni
> >
> >
> > On 3/18/09, Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@gmail.com> wrote:
> >>
> >> Are you sure your schema.xml has a <uniqueKey> field? It is needed to
> >> UPDATE docs.
> >>
> >> To remove deleted docs you must have a deletedPkQuery attribute in the
> >> root entity.
> >>
> >> On Tue, Mar 17, 2009 at 8:48 PM, Giovanni De Stefano
> >> <giovanni.destef...@gmail.com> wrote:
> >> > Hello all,
> >> >
> >> > I have a table TEST in an Oracle DB with the following columns: URI
> >> > (varchar), CONTENT (varchar), CREATION_TIME (date).
> >> >
> >> > The primary key both in the DB and Solr is URI.
> >> >
> >> > Here is my data-config.xml:
> >> >
> >> > <dataConfig>
> >> >  <dataSource
> >> >    driver="oracle.jdbc.driver.OracleDriver"
> >> >    url="jdbc:oracle:thin:@localhost:1521/XE"
> >> >    user="username"
> >> >    password="password"
> >> >  />
> >> >  <document name="Test">
> >> >    <entity
> >> >        name="test_item"
> >> >        pk="URI"
> >> >        query="select URI,CONTENT from TEST"
> >> >        deltaQuery="select URI,CONTENT from TEST where
> >> > TO_CHAR(CREATION_TIME,'YYYY-MM-DD HH:MI:SS') >
> >> > '${dataimporter.last_index_time}'"
> >> >    >
> >> >      <field column="URI" name="uri"/>
> >> >      <field column="CONTENT" name="content"/>
> >> >    </entity>
> >> >  </document>
> >> > </dataConfig>
> >> >
> >> > The problem is that anytime I perform a delta-import, the index keeps
> >> > being populated as if new documents were added. In other words, I am
> >> > not able to UPDATE an existing document or REMOVE a document that is no
> >> > longer in the DB.
> >> >
> >> > What am I missing? How should I specify my deltaQuery?
> >> >
> >> > Thanks a lot in advance!
> >> >
> >> > Giovanni
> >> >
> >>
> >>
> >>
> >> --
> >> --Noble Paul
> >>
> >
>
>
>
> --
> --Noble Paul
>
