Re: Updating last_modified field when using DIH
Stefan, Ephraim,

Thanks for the answers!!!
I am finding Solr to be a useful product, but definitely the community is
what makes it a great product! So far everyone has been very helpful.

Thanks! Cheers!
Juan M.

On Wed, Nov 3, 2010 at 9:13 AM, Ephraim Ofir wrote:
> Also, your deltaImportQuery should be:
> deltaImportQuery='SELECT * FROM "Entities" WHERE
> "ent_id"=${dataimporter.delta.id}'
>
> Otherwise you're just importing the ids and not the rest of the data.
>
> If performance is important to you, you might also want to check out
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com%3E
>
> Ephraim Ofir
RE: Updating last_modified field when using DIH
Also, your deltaImportQuery should be:

deltaImportQuery='SELECT * FROM "Entities" WHERE "ent_id"=${dataimporter.delta.id}'

Otherwise you're just importing the ids and not the rest of the data.

If performance is important to you, you might also want to check out
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c9f8b39cb3b7c6d4594293ea29ccf438b01702...@icq-mail.icq.il.office.aol.com%3E

Ephraim Ofir

-----Original Message-----
From: Stefan Matheis [mailto:matheis.ste...@googlemail.com]
Sent: Wednesday, November 03, 2010 12:58 PM
To: solr-user@lucene.apache.org
Subject: Re: Updating last_modified field when using DIH

Juan,

that's correct .. solr will not touch your database, that's part of your
application-code. solr uses an updated timestamp (which is available
through dataimporter.last_index_time).

so, imagine the following situation, solr import runs every 10 minutes ..
last run at 11:00, your entity gets updated at 11:03, next solr-run at 11:10
will detect this as changed, import the entity and run again at 11:20 ..
then, no entity will match the delta-query because solr will ask for a
modification_date > 11:10 (last solr-run at this time).

you'll only need to update the last_modified field (in your application)
when the entity is changed and you want solr to (re-)index your data.

HTH,
Stefan
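To illustrate Ephraim's point about the two queries, here is a minimal sketch in Python. It uses an in-memory SQLite table instead of Solr and PostgreSQL, so it only models the idea and is not DIH itself. The "ent_name" column, the sample rows, and the helper names delta_ids and import_row are made up for illustration; only the two SQL statements follow the ones in this thread.

```python
import sqlite3

# Stand-in for the PostgreSQL table (assumption: one extra data column, "ent_name").
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Entities" ('
    '"ent_id" INTEGER PRIMARY KEY, "ent_name" TEXT, "ent_lastModified" TEXT)'
)
conn.executemany(
    'INSERT INTO "Entities" VALUES (?, ?, ?)',
    [
        (1, "alpha", "2010-11-03 10:55:00"),
        (2, "beta", "2010-11-03 11:03:00"),
    ],
)


def delta_ids(conn, last_index_time):
    """Step 1 (deltaQuery): find only the ids of rows changed since the last run."""
    cur = conn.execute(
        'SELECT "ent_id" AS "id" FROM "Entities" WHERE "ent_lastModified" > ?',
        (last_index_time,),
    )
    return [row[0] for row in cur]


def import_row(conn, select_list, ent_id):
    """Step 2 (deltaImportQuery): fetch the row for one id as a column -> value dict."""
    cur = conn.execute(
        f'SELECT {select_list} FROM "Entities" WHERE "ent_id" = ?', (ent_id,)
    )
    columns = [d[0] for d in cur.description]
    return dict(zip(columns, cur.fetchone()))


changed_ids = delta_ids(conn, "2010-11-03 11:00:00")

# Selecting only the id (as in the original config) yields nothing else to index.
ids_only = import_row(conn, '"ent_id" AS "id"', changed_ids[0])

# Selecting * (as Ephraim suggests) yields the full row.
full_row = import_row(conn, "*", changed_ids[0])

print(changed_ids, ids_only, full_row)
```

With the id-only select list the second step returns a single "id" column, while SELECT * returns every column of the changed row, which is what has to reach the index.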
Re: Updating last_modified field when using DIH
Juan,

that's correct .. solr will not touch your database, that's part of your
application-code. solr uses an updated timestamp (which is available
through dataimporter.last_index_time).

so, imagine the following situation, solr import runs every 10 minutes ..
last run at 11:00, your entity gets updated at 11:03, next solr-run at 11:10
will detect this as changed, import the entity and run again at 11:20 ..
then, no entity will match the delta-query because solr will ask for a
modification_date > 11:10 (last solr-run at this time).

you'll only need to update the last_modified field (in your application)
when the entity is changed and you want solr to (re-)index your data.

HTH,
Stefan

On Tue, Nov 2, 2010 at 7:35 PM, Juan Manuel Alvarez wrote:
> Hello everyone!
>
> I would like to ask you a question about DIH and delta import.
>
> I am trying to sync Solr with a PostgreSQL database and I have a field
> "ent_lastModified" of type "timestamp without time zone".
>
> Here is my xml file:
>
> url="jdbc:postgresql://host" user="XXX" password="XXX" readOnly="true"
> autoCommit="false"
> transactionIsolation="TRANSACTION_READ_COMMITTED"
> holdability="CLOSE_CURSORS_AT_COMMIT"/>
>
> query=' SELECT * FROM Entities'
> deltaImportQuery='SELECT "ent_id" AS "id" FROM
> "Entities" WHERE "ent_id"=${dataimporter.delta.id}"'
> deltaQuery=' SELECT "ent_id" AS "id" FROM "Entities" WHERE
> "ent_lastModified" > '${dataimporter.last_index_time}''
> >
>
> Full-import works fine, but when I run a delta-import I get the
> corresponding records while "ent_lastModified" stays the same, so if
> I run another delta-import, the same records are retrieved.
>
> I have read all the documentation at
> http://wiki.apache.org/solr/DataImportHandler but I could not find an
> update query for the "last_modified" field and Solr does not seem to
> do this automatically.
> I have also tried to name the field "last_modified" as in the example,
> but its value remains unchanged after a delta-import.
>
> Can anyone point me in the right direction?
>
> Thanks in advance!
> Juan M.
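Stefan's 11:00 / 11:03 / 11:10 / 11:20 timeline can be traced with a small Python model. This is a simplification and not DIH code: run_delta_import and at are hypothetical helpers, the table is a plain dict, and the model simply takes the time of each run as the next last_index_time. The final run at 11:30 is an extra step added here to show a later change being picked up again.

```python
from datetime import datetime


def at(hhmm):
    """Build a timestamp on 2010-11-03 from an 'HH:MM' string (helper for this sketch)."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2010, 11, 3, hour, minute)


def run_delta_import(rows, last_index_time, now):
    """Model one delta run.

    rows maps entity id -> last-modified timestamp (maintained by the application).
    Returns the ids whose timestamp is newer than last_index_time, plus the
    timestamp that the next run will compare against.
    """
    changed = sorted(i for i, modified in rows.items() if modified > last_index_time)
    return changed, now


rows = {1: at("10:30")}
last_index_time = at("11:00")      # last import ran at 11:00

rows[1] = at("11:03")              # the application changes the entity at 11:03

first, last_index_time = run_delta_import(rows, last_index_time, at("11:10"))
second, last_index_time = run_delta_import(rows, last_index_time, at("11:20"))

rows[1] = at("11:25")              # a later change, again stamped by the application
third, last_index_time = run_delta_import(rows, last_index_time, at("11:30"))

print(first, second, third)
```

The database timestamp never has to be reset after an import in this model: the comparison value moves forward with each run, so a row is matched again only when the application stamps a newer modification time on it.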
Updating last_modified field when using DIH
Hello everyone!

I would like to ask you a question about DIH and delta import.

I am trying to sync Solr with a PostgreSQL database and I have a field
"ent_lastModified" of type "timestamp without time zone".

Here is my xml file:

url="jdbc:postgresql://host" user="XXX" password="XXX" readOnly="true"
autoCommit="false"
transactionIsolation="TRANSACTION_READ_COMMITTED"
holdability="CLOSE_CURSORS_AT_COMMIT"/>

query=' SELECT * FROM Entities'
deltaImportQuery='SELECT "ent_id" AS "id" FROM
"Entities" WHERE "ent_id"=${dataimporter.delta.id}"'
deltaQuery=' SELECT "ent_id" AS "id" FROM "Entities" WHERE
"ent_lastModified" > '${dataimporter.last_index_time}''
>

Full-import works fine, but when I run a delta-import I get the
corresponding records while "ent_lastModified" stays the same, so if
I run another delta-import, the same records are retrieved.

I have read all the documentation at
http://wiki.apache.org/solr/DataImportHandler but I could not find an
update query for the "last_modified" field and Solr does not seem to
do this automatically.
I have also tried to name the field "last_modified" as in the example,
but its value remains unchanged after a delta-import.

Can anyone point me in the right direction?

Thanks in advance!
Juan M.