The SolrEntityProcessor would be the top-level entity. You would do a
query like this: &sort=timestamp+desc&rows=1&fl=timestamp. This gives
you one data item: the timestamp of the last item added to the index.

With this, the JDBC sub-entity would create a query that chooses all
rows with a timestamp >= this latest timestamp. It will not be easy to
put this together, but it is possible :)
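
Here is a rough, untested sketch of how the two entities might nest in
data-config.xml. It rests on several assumptions: the attribute names
(url, query, sort, rows, fl) follow the SolrEntityProcessor as it was
later released, and the SOLR-1499 patch may name or support them
differently (sort in particular); the JDBC driver, connection URL,
credentials and Solr URL are placeholders; and the index field is assumed
to be called timestamp. The outer entity is marked rootEntity="false" so
that rows from the JDBC entity, not the single Solr row, become the
indexed documents.

```xml
<dataConfig>
  <!-- Placeholder JDBC settings: driver, url, user and password are assumptions -->
  <dataSource name="db" type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/gnpd_production"
              user="..." password="..."/>
  <document>
    <!-- Outer entity: ask Solr for the single newest timestamp in the index.
         rootEntity="false" makes the nested JDBC entity the document root. -->
    <entity name="latest" processor="SolrEntityProcessor" rootEntity="false"
            url="http://localhost:8983/solr"
            query="*:*" sort="timestamp desc" rows="1" fl="timestamp">
      <!-- Inner entity: select every row at or after that timestamp.
           The date pattern is an assumption; match it to your database. -->
      <entity name="item" dataSource="db"
              query="SELECT id, date_published, date_created, publish_flag
                     FROM Item
                     WHERE id &gt; 0
                       AND record_type_id=0
                       AND sys_time_stamp &gt;= '${dataimporter.functions.formatDate(latest.timestamp, 'yyyy-MM-dd HH:mm:ss')}'"/>
    </entity>
  </document>
</dataConfig>
```

The formatDate call is there because Solr returns a date value rather than
a SQL literal; any time-zone adjustment depends on how sys_time_stamp is
stored. If the index is empty, the outer entity returns no rows and the
inner query never runs, so an empty index would need to be loaded some
other way.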

Good luck!

Lance

On Mon, Jan 24, 2011 at 2:04 AM, btucker <btuc...@mintel.com> wrote:
>
> Thank you for your response.
>
> In what way is 'timestamp' not perfect?
>
> I've looked into the SolrEntityProcessor and added a timestamp field to our
> index.
> However, I'm struggling to work out a query to get the max value of the
> timestamp field.
> Also, does the SolrEntityProcessor entity appear before the root entity, or
> does it wrap around the root entity?
>
> On 22 January 2011 07:24, Lance Norskog-2 [via Lucene]
> <ml-node+2307215-627680969-326...@n3.nabble.com> wrote:
>
>> The timestamp thing is not perfect. You can instead do a search
>> against Solr and find the latest timestamp in the index. SOLR-1499
>> allows you to search against Solr in the DataImportHandler.
>>
>> On Fri, Jan 21, 2011 at 2:27 AM, btucker <[hidden email]> wrote:
>>
>> >
>> > Hello
>> >
>> > We've just started using solr to provide search functionality for our
>> > application with the DataImportHandler performing a delta-import every 1
>> > fired by crontab, which works great, however it does occasionally miss
>> > records that are added to the database while the delta-import is running.
>> >
>> > Our data-config.xml has the following queries in its root entity:
>> >
>> > query="SELECT id, date_published, date_created, publish_flag FROM Item
>> >        WHERE id > 0
>> >        AND record_type_id=0
>> >        ORDER BY id DESC"
>> > preImportDeleteQuery="SELECT item_id AS Id FROM
>> >        gnpd_production.item_deletions"
>> > deletedPkQuery="SELECT item_id AS id FROM gnpd_production.item_deletions
>> >        WHERE deletion_date >=
>> >        SUBDATE('${dataimporter.last_index_time}', INTERVAL 5 MINUTE)"
>> > deltaImportQuery="SELECT id, date_published, date_created, publish_flag
>> >        FROM Item WHERE id > 0
>> >        AND record_type_id=0
>> >        AND id=${dataimporter.delta.id}
>> >        ORDER BY id DESC"
>> > deltaQuery="SELECT id, date_published, date_created, publish_flag FROM Item
>> >        WHERE id > 0
>> >        AND record_type_id=0
>> >        AND sys_time_stamp >=
>> >        SUBDATE('${dataimporter.last_index_time}', INTERVAL 1 MINUTE)
>> >        ORDER BY id DESC">
>> >
>> > I think the problem I'm having comes from the way Solr stores the
>> > last_index_time in conf/dataimport.properties, as stated on the wiki:
>> >
>> > "When delta-import command is executed, it reads the start time stored in
>> > conf/dataimport.properties. It uses that timestamp to run delta queries and
>> > after completion, updates the timestamp in conf/dataimport.properties."
>> >
>> > To me this seems to indicate that any records with a timestamp between
>> > when the dataimport starts and ends will be missed, as the last_index_time
>> > is set to when it completes the import.
>> >
>> > This doesn't seem quite right to me. I would have expected the
>> > last_index_time to refer to when the dataimport was last STARTED, so that
>> > there were no gaps in the timestamps covered.
>> >
>> > I changed the deltaQuery of our config to include the SUBDATE by INTERVAL
>> > 1 MINUTE statement to alleviate this problem, but it only covers times
>> > when the delta-import takes less than a minute.
>> >
>> > Any ideas as to how this can be overcome, other than increasing the
>> > INTERVAL to something larger?
>> >
>> > Regards
>> >
>> > Barry Tucker
>> > --
>> > View this message in context:
>> > http://lucene.472066.n3.nabble.com/Delta-Import-occasionally-missing-records-tp2300877p2300877.html
>> > Sent from the Solr - User mailing list archive at Nabble.com.
>> >
>>
>>
>>
>> --
>> Lance Norskog
>> [hidden email]
>
> Mintel International Group Ltd | 18-19 Long Lane | London EC1A 9PL UK
> Registered in England: Number 1475918. | VAT Number: GB 232 9342 72
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Delta-Import-occasionally-missing-records-tp2300877p2318572.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Lance Norskog
goks...@gmail.com
