Re: Updating last_modified field when using DIH

2010-11-05 Thread Juan Manuel Alvarez
Stefan, Ephraim. Thanks for the answers!!!
I am finding Solr to be a useful product, but definitely the community
is what makes it a great product!
So far everyone has been very helpful. Thanks!

Cheers!
Juan M.

On Wed, Nov 3, 2010 at 9:13 AM, Ephraim Ofir ephra...@icq.com wrote:
 Also, your deltaImportQuery should be:
 deltaImportQuery='SELECT * FROM Entities WHERE
 ent_id=${dataimporter.delta.id}'

 Otherwise you're just importing the ids and not the rest of the data.

 If performance is important to you, you might also want to check out
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com%3E

 Ephraim Ofir


 -----Original Message-----
 From: Stefan Matheis [mailto:matheis.ste...@googlemail.com]
 Sent: Wednesday, November 03, 2010 12:58 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Updating last_modified field when using DIH

 Juan,

 that's correct .. solr will not touch your database, that's part of your
 application-code. solr uses an updated timestamp (which is available
 through dataimporter.last_index_time).

 so, imagine the following situation: solr import runs every 10 minutes. the
 last run was at 11:00, your entity gets updated at 11:03, and the next
 solr-run at 11:10 will detect this as changed and import the entity. when it
 runs again at 11:20, no entity will match the delta-query because solr will
 ask for a modification_date > 11:10 (the last solr-run at this time).

 you'll only need to update the last_modified field (in your application)
 when the entity is changed and you want solr to (re-)index your data.

 HTH,
 Stefan

 On Tue, Nov 2, 2010 at 7:35 PM, Juan Manuel Alvarez
 naici...@gmail.com wrote:

 Hello everyone!

 I would like to ask you a question about DIH and delta import.

 I am trying to sync Solr with a PostgreSQL database and I have a field
 ent_lastModified of type timestamp without time zone.

 Here is my xml file:

 <dataConfig>
    <dataSource name="jdbc" driver="org.postgresql.Driver"
        url="jdbc:postgresql://host" user="XXX" password="XXX" readOnly="true"
        autoCommit="false"
        transactionIsolation="TRANSACTION_READ_COMMITTED"
        holdability="CLOSE_CURSORS_AT_COMMIT"/>
    <document>
        <entity name='myEntity' dataSource='jdbc' pk='id'
                query=' SELECT * FROM Entities'
                deltaImportQuery='SELECT ent_id AS id FROM
 Entities WHERE ent_id=${dataimporter.delta.id}'
                deltaQuery=' SELECT ent_id AS id FROM Entities WHERE
 ent_lastModified &gt; &#39;${dataimporter.last_index_time}&#39;'
                >
        </entity>
    </document>
 </dataConfig>

 Full-import works fine, but when I run a delta-import, I get the
 corresponding records but the ent_lastModified field stays the same,
 so if I run another delta-import, the same records are retrieved.

 I have read all the documentation at
 http://wiki.apache.org/solr/DataImportHandler but I could not find an
 update query for the last_modified field and Solr does not seem to
 do this automatically.
 I have also tried naming the field last_modified as in the example,
 but its value remains unchanged after a delta-import.

 Can anyone point me in the right direction?

 Thanks in advance!
 Juan M.




Re: Updating last_modified field when using DIH

2010-11-03 Thread Stefan Matheis
Juan,

that's correct .. solr will not touch your database, that's part of your
application-code. solr uses an updated timestamp (which is available
through dataimporter.last_index_time).

so, imagine the following situation: solr import runs every 10 minutes. the
last run was at 11:00, your entity gets updated at 11:03, and the next
solr-run at 11:10 will detect this as changed and import the entity. when it
runs again at 11:20, no entity will match the delta-query because solr will
ask for a modification_date > 11:10 (the last solr-run at this time).

you'll only need to update the last_modified field (in your application)
when the entity is changed and you want solr to (re-)index your data.
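
To make the timeline concrete, here is a rough sketch of what the deltaQuery from the original post would expand to on each run. The date (2010-11-03) and the 'yyyy-MM-dd HH:mm:ss' timestamp format are assumptions for illustration, and the entity is only a fragment (query and deltaImportQuery are left out):

```xml
<!-- Illustration only: dates and timestamp format are assumed.

     Run at 11:10 (previous run at 11:00), a row modified at 11:03 matches:
       ... WHERE ent_lastModified > '2010-11-03 11:00:00'

     Run at 11:20 (previous run at 11:10), the same row no longer matches:
       ... WHERE ent_lastModified > '2010-11-03 11:10:00'
-->
<entity name='myEntity' dataSource='jdbc' pk='id'
        deltaQuery='SELECT ent_id AS id FROM Entities
                    WHERE ent_lastModified &gt; &#39;${dataimporter.last_index_time}&#39;'>
</entity>
```

The point is that only the value substituted for ${dataimporter.last_index_time} changes between runs; ent_lastModified in the database is never written by the import.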

HTH,
Stefan




RE: Updating last_modified field when using DIH

2010-11-03 Thread Ephraim Ofir
Also, your deltaImportQuery should be:
deltaImportQuery='SELECT * FROM Entities WHERE
ent_id=${dataimporter.delta.id}'

Otherwise you're just importing the ids and not the rest of the data.
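
As a sketch, this is roughly how the entity from the original post might look with that change applied. Everything except the deltaImportQuery is taken from the posted config; the line breaks inside the attribute values are only for readability and are an assumption about formatting, not a requirement:

```xml
<!-- deltaQuery returns only the ids of the changed rows;
     deltaImportQuery then fetches the full row for each of those ids. -->
<entity name='myEntity' dataSource='jdbc' pk='id'
        query='SELECT * FROM Entities'
        deltaQuery='SELECT ent_id AS id FROM Entities
                    WHERE ent_lastModified &gt; &#39;${dataimporter.last_index_time}&#39;'
        deltaImportQuery='SELECT * FROM Entities
                          WHERE ent_id=${dataimporter.delta.id}'>
</entity>
```

The deltaQuery still aliases ent_id AS id so that ${dataimporter.delta.id} resolves against the pk declared on the entity.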

If performance is important to you, you might also want to check out
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com%3E

Ephraim Ofir

