[ 
https://issues.apache.org/jira/browse/CONNECTORS-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904582#action_12904582
 ] 

Karl Wright commented on CONNECTORS-41:
---------------------------------------

I looked at this in some detail yesterday.  The prime implementation option is 
to add notification methods to IOutputConnector, so that job events get 
reported to the connector when the job is being terminated.  The issue in this 
case is going to be how exactly to handle ServiceInterruption exceptions that 
occur at the time of the notification into the connector.  This is not 
hypothetical because in the Solr case a notification may well fail, or it may 
take a very long time (many minutes).  Usually, the possibility of an extended 
interaction argues for an additional state in the database.

It looks like it will not be possible to delay the change of the job status, 
since that takes place in a transaction.  Were a delay possible, a failed 
notification could simply leave the job in the "running" state, and a retry 
would naturally occur until the commit succeeded.  But that doesn't look 
possible given the transaction structure.

An alternative (non-notification) method of handling a commit request would 
require the commit to take place as part of the output connector's poll() 
method.  This is a little better to work with because the poll() method will 
naturally retry in any case.  The issue here is that there would be no 
*guarantee* of a commit taking place at all, since it isn't part of the 
connector contract that the connection must continue to exist for any period of 
time, which I think would violate the spirit of this ticket.

If explicit notification takes place, we could just report any error, and 
forget about it, rather than keeping the job alive for a retry.  That, too, 
would mean that a commit was not guaranteed to occur during the job's lifecycle.

The final alternative, which would seemingly work, would involve there being 
two job shutdown states - one prior to notification, and the second after 
notification.  The first state would be entered based on the current shutdown 
logic.  The second state would be entered only after the notification had been 
successful.  Thus, the notification *could* be called more than once, if there 
were errors, or if the crawler were shut down and restarted before the state 
transition was completed.  The extra state would also allow the job's 
pre-notification status to be shown in the crawler UI.
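
To make the two-state idea concrete, here is a minimal sketch in Java.  It is 
only an illustration under assumptions: the class, the state names, the 
EndOfJobNotifier interface and the TransientFailure exception are all 
hypothetical stand-ins (the last one for ServiceInterruption), not existing 
framework code, and the real state would live in the database rather than in 
a field.

```java
// Hypothetical sketch; none of these names exist in the framework.
class JobShutdownSketch {
    // ACTIVE -> AWAITING_NOTIFICATION (first shutdown state)
    //        -> INACTIVE (second state, entered only after a successful notification)
    enum JobState { ACTIVE, AWAITING_NOTIFICATION, INACTIVE }

    /** Stand-in for a transient failure such as a ServiceInterruption. */
    static class TransientFailure extends Exception {
        TransientFailure(String message) { super(message); }
    }

    /** Stand-in for a notification hook on the output connector. */
    interface EndOfJobNotifier {
        void notifyJobEnded() throws TransientFailure;
    }

    // Assumption: in a real implementation this would be a database column.
    private JobState state = JobState.ACTIVE;

    JobState getState() { return state; }

    /** First transition: what the current shutdown logic would do in its transaction. */
    void enterShutdown() {
        if (state == JobState.ACTIVE) {
            state = JobState.AWAITING_NOTIFICATION;
        }
    }

    /**
     * Second transition, attempted outside the shutdown transaction.
     * Returns true if the notification succeeded and the job is now INACTIVE.
     * On failure the job stays in AWAITING_NOTIFICATION, so the call can be
     * repeated; the notifier may therefore be invoked more than once.
     */
    boolean attemptNotification(EndOfJobNotifier notifier) {
        if (state != JobState.AWAITING_NOTIFICATION) {
            return false;
        }
        try {
            notifier.notifyJobEnded();
            state = JobState.INACTIVE;
            return true;
        } catch (TransientFailure e) {
            return false; // leave the state unchanged so a later retry picks it up
        }
    }
}
```

Because the state only advances after notifyJobEnded() returns normally, a 
crawler restart between the two transitions simply finds the job still in the 
first shutdown state and tries again.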

Because of the potential time delay of a commit, it is probably best for the 
first-to-second shutdown state transition to be handled by a separate thread, 
or family of threads.
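
As a rough illustration of the separate-thread approach, the sketch below 
retries a notification attempt on its own thread until it succeeds.  The 
NotificationWorker class and its parameters are hypothetical, and the fixed 
retry delay is an assumption; the attempt is passed in as a plain 
BooleanSupplier so the example stays self-contained.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not existing framework code.
class NotificationWorker {
    /**
     * Starts a daemon thread that calls attempt until it returns true,
     * sleeping retryDelayMillis between failed attempts, then runs onSuccess.
     */
    static Thread start(BooleanSupplier attempt, long retryDelayMillis, Runnable onSuccess) {
        Thread worker = new Thread(() -> {
            try {
                while (!attempt.getAsBoolean()) {
                    // A slow or failed commit only delays this thread,
                    // not the threads running the rest of the shutdown logic.
                    Thread.sleep(retryDelayMillis);
                }
                onSuccess.run();
            } catch (InterruptedException e) {
                // Interrupted (e.g. crawler shutdown): stop retrying; the job
                // would remain in its pre-notification state for next startup.
                Thread.currentThread().interrupt();
            }
        }, "job-notification-worker");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }
}
```

A family of such threads could each take jobs that are in the first shutdown 
state; how the jobs would be divided among them is left open here.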


> Add hooks to output connectors for receiving event notifications, 
> specifically job start, job end, etc.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-41
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-41
>             Project: Apache Connectors Framework
>          Issue Type: Improvement
>          Components: Framework core
>            Reporter: Karl Wright
>            Priority: Minor
>
> Currently there is no logic that informs an output connection of a job start, 
> end, deletion, or other activity.  While this would seem to have little to do 
> with an output connector, this feature has been requested by Jack Krupansky 
> as a potential way of deciding when to tell Solr to commit documents, rather 
> than leave it up to Solr's configuration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
