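The SolrEntityProcessor would be a top-level entity. You would do a query like this: q=*:*&sort=timestamp desc&rows=1&fl=timestamp (Solr separates the sort field and direction with a space, URL-encoded as sort=timestamp+desc, not with a comma). This gives you one data item: the timestamp of the last item added to the index.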
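
For illustration only, here is a minimal Python sketch of that "latest timestamp" request, made outside the DataImportHandler. It only shows which parameters go to Solr's select handler and how the single value comes back. The base URL, the field name `timestamp`, the use of the JSON response writer (`wt=json`), and both function names are assumptions for the example, not part of the SOLR-1499 patch.

```python
import json
from urllib.parse import urlencode


def latest_timestamp_url(base_url, field="timestamp"):
    """Build a select URL that returns only the newest value of `field`.

    `base_url` and `field` are hypothetical; adjust them to your core and schema.
    """
    params = {
        "q": "*:*",               # match every document
        "sort": f"{field} desc",  # newest first; note the space, not a comma
        "rows": 1,                # only the top document is needed
        "fl": field,              # return just the timestamp field
        "wt": "json",             # JSON response writer
    }
    return f"{base_url}/select?{urlencode(params)}"


def extract_latest_timestamp(response_body, field="timestamp"):
    """Pull the timestamp out of a Solr JSON response, or None if the index is empty."""
    docs = json.loads(response_body)["response"]["docs"]
    return docs[0][field] if docs else None
```

You would fetch the URL with whatever HTTP client you prefer and pass the response body to `extract_latest_timestamp`. An empty index yields `None`, which the caller could treat as "do a full import".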
With this, the JDBC sub-entity would build a query that selects all rows with a timestamp >= this latest timestamp. It will not be easy to put this together, but it is possible :)

Good luck!

Lance

On Mon, Jan 24, 2011 at 2:04 AM, btucker <btuc...@mintel.com> wrote:
>
> Thank you for your response.
>
> In what way is 'timestamp' not perfect?
>
> I've looked into the SolrEntityProcessor and added a timestamp field to our
> index.
> However, I'm struggling to work out a query to get the max value of the
> timestamp field. Also, does the SolrEntityProcessor entity appear before the
> root entity, or does it wrap around the root entity?
>
> On 22 January 2011 07:24, Lance Norskog-2 [via Lucene] <
> ml-node+2307215-627680969-326...@n3.nabble.com> wrote:
>
>> The timestamp thing is not perfect. You can instead do a search
>> against Solr and find the latest timestamp in the index. SOLR-1499
>> allows you to search against Solr in the DataImportHandler.
>>
>> On Fri, Jan 21, 2011 at 2:27 AM, btucker <[hidden email]> wrote:
>>
>> > Hello
>> >
>> > We've just started using solr to provide search functionality for our
>> > application with the DataImportHandler performing a delta-import every 1
>> > fired by crontab, which works great; however, it does occasionally miss
>> > records that are added to the database while the delta-import is running.
>> >
>> > Our data-config.xml has the following queries in its root entity:
>> >
>> > query="SELECT id, date_published, date_created, publish_flag FROM Item
>> >   WHERE id > 0
>> >   AND record_type_id=0
>> >   ORDER BY id DESC"
>> > preImportDeleteQuery="SELECT item_id AS Id FROM
>> >   gnpd_production.item_deletions"
>> > deletedPkQuery="SELECT item_id AS id FROM gnpd_production.item_deletions
>> >   WHERE deletion_date >=
>> >   SUBDATE('${dataimporter.last_index_time}', INTERVAL 5 MINUTE)"
>> > deltaImportQuery="SELECT id, date_published, date_created, publish_flag
>> >   FROM Item WHERE id > 0
>> >   AND record_type_id=0
>> >   AND id=${dataimporter.delta.id}
>> >   ORDER BY id DESC"
>> > deltaQuery="SELECT id, date_published, date_created, publish_flag FROM Item
>> >   WHERE id > 0
>> >   AND record_type_id=0
>> >   AND sys_time_stamp >=
>> >   SUBDATE('${dataimporter.last_index_time}', INTERVAL 1 MINUTE)
>> >   ORDER BY id DESC">
>> >
>> > I think the problem I'm having comes from the way solr stores the
>> > last_index_time in conf/dataimport.properties, as stated on the wiki:
>> >
>> > "When delta-import command is executed, it reads the start time stored in
>> > conf/dataimport.properties. It uses that timestamp to run delta queries and
>> > after completion, updates the timestamp in conf/dataimport.properties."
>> >
>> > To me this seems to indicate that any records with a time-stamp between
>> > when the dataimport starts and ends will be missed, as the last_index_time
>> > is set to when it completes the import.
>> >
>> > This doesn't seem quite right to me. I would have expected the
>> > last_index_time to refer to when the dataimport was last STARTED, so that
>> > there were no gaps in the timestamps covered.
>> >
>> > I changed the deltaQuery of our config to include the SUBDATE by INTERVAL 1
>> > MINUTE statement to alleviate this problem, but it only covers cases
>> > where the delta-import takes less than a minute.
>> >
>> > Any ideas as to how this can be overcome, other than increasing the
>> > INTERVAL to something larger?
>> >
>> > Regards
>> >
>> > Barry Tucker
>> > --
>> > View this message in context:
>> > http://lucene.472066.n3.nabble.com/Delta-Import-occasionally-missing-records-tp2300877p2300877.html
>> > Sent from the Solr - User mailing list archive at Nabble.com.
>>
>> --
>> Lance Norskog
>> [hidden email]
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Delta-Import-occasionally-missing-records-tp2300877p2318572.html

--
Lance Norskog
goks...@gmail.com