Re: [RT] A Batch jobs API to complement the existing Jobs API

2016-04-20 Thread Stefan Egli
Hi,

On 19/04/16 14:11, "Bertrand Delacretaz"  wrote:

>Ian mentioned to me that some queuing systems can play this role by
>having a two-phase "take message from queue" mechanism IIUC, where a
>job executor has to confirm success before a message is considered
>consumed.

Yes, that's a standard JMS feature: you can either opt to use JMS
transactions or the client-acknowledge mode (the latter being sort of a
simplified 'receive transaction').
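
In JMS terms that means creating the session with
Session.CLIENT_ACKNOWLEDGE and calling Message.acknowledge() only once the
job has succeeded. To keep the example free of any broker dependency, here
is a rough plain-Java sketch of the same two-phase idea. The AckQueueSketch
class and its method names are made up for illustration and are not part of
JMS or Sling:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory queue illustrating "receive, then acknowledge". Not JMS. */
final class AckQueueSketch {
    private final Deque<String> ready = new ArrayDeque<>();
    private final Map<Long, String> inFlight = new HashMap<>();
    private long nextDeliveryId = 0;

    void send(String message) {
        ready.addLast(message);
    }

    /** Phase 1: hand out a message but keep it until acknowledged. Returns -1 if empty. */
    long receive() {
        String message = ready.pollFirst();
        if (message == null) {
            return -1;
        }
        long id = nextDeliveryId++;
        inFlight.put(id, message);
        return id;
    }

    String body(long deliveryId) {
        return inFlight.get(deliveryId);
    }

    /** Phase 2, success: the executor confirms and the message is consumed. */
    void acknowledge(long deliveryId) {
        inFlight.remove(deliveryId);
    }

    /** Phase 2, failure: the message goes back to the head of the queue. */
    void recover(long deliveryId) {
        String message = inFlight.remove(deliveryId);
        if (message != null) {
            ready.addFirst(message);
        }
    }

    /** Messages that are either waiting or delivered but not yet acknowledged. */
    int pending() {
        return ready.size() + inFlight.size();
    }

    public static void main(String[] args) {
        AckQueueSketch queue = new AckQueueSketch();
        queue.send("process-asset-1");
        long first = queue.receive();
        queue.recover(first);            // simulated executor failure
        long second = queue.receive();   // the same message is delivered again
        System.out.println(queue.body(second));
        queue.acknowledge(second);
        System.out.println(queue.pending());
    }
}
```

Note that this gives at-least-once delivery: if the executor finishes the
work but fails before acknowledging, the message is delivered again.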

Cheers,
Stefan




Re: [RT] A Batch jobs API to complement the existing Jobs API

2016-04-19 Thread Bertrand Delacretaz
Hi,

On Mon, Apr 18, 2016 at 4:20 PM, Timothée Maret
 wrote:
> ...One aspect that may differentiate asset processing from the general job
> use case is the delivery guarantee offered for jobs. Some use cases would
> be fine with "at-least-once" or "maybe" delivery, whereas others will
> need "exactly-once" delivery

It's not only about delivering the job submission message; in some
cases it's the whole job execution that needs to be exactly once.

Designing (or configuring if using the SLING-5646 stuff) a batch job
engine with relatively relaxed requirements about message delivery
(like at least once for the job submission message) might make it much
simpler to implement it in a scalable way.

Of course there are some cases where exactly once *execution* of batch
jobs is required, but then the whole execution chain should be
considered, maybe using a distinct distributed consensus service to
coordinate the whole execution chain.

Ian mentioned to me that some queuing systems can play this role by
having a two-phase "take message from queue" mechanism IIUC, where a
job executor has to confirm success before a message is considered
consumed.

My point is that in some cases the whole job execution chain has to be
considered for exactly once or at least once semantics.

-Bertrand


Re: [RT] A Batch jobs API to complement the existing Jobs API

2016-04-19 Thread Bertrand Delacretaz
Hi,

On Mon, Apr 18, 2016 at 4:04 PM, Ian Boston  wrote:
> ...I am not certain how a Callable will work with a distributed
> implementation,...

Ok, what shape a batch job takes is not very important at this stage,
I get your point.

...
>>   void registerBatchEventListener(BatchEventsListener bleh,
>>       JobId... restrictToSpecificJobIds);
>> }
>
> IIUC this is a bit of an anti-pattern with OSGi. AFAIK the Whiteboard
> pattern is preferred...

Makes sense. And it makes the API even simpler.

> ...Is there a reason that you think the implementation under SLING-5646 won't
> support the batch use case ?...

I suppose it would work, like this:
1. To submit a job, send it to a Queue that's appropriately configured
2. Subscribe to a Topic to receive events about the job's execution
3. All good if a DONE message is received on that topic
4. If an ERROR message is received, or if timeout, act accordingly

That certainly works but requires some non-obvious conventions, as
opposed to an API like the one I suggested, which is narrower and more
self-explanatory.
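
To make the four-step convention concrete, here is a rough plain-Java
sketch. The in-memory queue and the per-job futures are stand-ins for a
real Queue and Topic, and every name in it (QueueTopicConventionSketch,
Outcome, awaitOutcome) is made up for illustration, not taken from
SLING-5646:

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class QueueTopicConventionSketch {
    enum Outcome { DONE, ERROR, TIMEOUT }

    // Stand-in for the job Queue.
    private final BlockingQueue<String> jobQueue = new LinkedBlockingQueue<>();
    // Stand-in for Topic subscriptions: one pending future per job id.
    private final Map<String, CompletableFuture<Outcome>> subscribers =
            new ConcurrentHashMap<>();

    /** Steps 1 and 2: subscribe first so no status message is missed, then enqueue. */
    CompletableFuture<Outcome> submit(String jobId) {
        CompletableFuture<Outcome> status = new CompletableFuture<>();
        subscribers.put(jobId, status);
        jobQueue.add(jobId);
        return status;
    }

    /** Executor side: take the next job id, or null if the queue is empty. */
    String takeNextJob() {
        return jobQueue.poll();
    }

    /** Executor side: publish a status message on the "topic". */
    void publish(String jobId, Outcome outcome) {
        CompletableFuture<Outcome> status = subscribers.remove(jobId);
        if (status != null) {
            status.complete(outcome);
        }
    }

    /** Steps 3 and 4: wait for DONE or ERROR, treating silence as TIMEOUT. */
    static Outcome awaitOutcome(CompletableFuture<Outcome> status, long timeoutMillis) {
        try {
            return status.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return Outcome.TIMEOUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Outcome.TIMEOUT;
        } catch (ExecutionException e) {
            return Outcome.ERROR;
        }
    }

    public static void main(String[] args) {
        QueueTopicConventionSketch mom = new QueueTopicConventionSketch();
        CompletableFuture<Outcome> status = mom.submit("job-1");
        mom.publish(mom.takeNextJob(), Outcome.DONE);
        System.out.println(awaitOutcome(status, 100));
    }
}
```

The caller has to know to subscribe before submitting and to treat a
missing message as a failure, which is the kind of convention I mean.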

But maybe we can leave those kinds of details to users of a job
execution system based on SLING-5646.

-Bertrand


Re: [RT] A Batch jobs API to complement the existing Jobs API

2016-04-18 Thread Ian Boston
Hi,

On 18 April 2016 at 15:20, Timothée Maret  wrote:

> Hi,
>
>
> > Is there a reason that you think the implementation under SLING-5646
> won't
> > support the batch use case ?
> >
> >
> >
> One aspect that may differentiate asset processing from the general job
> use case is the delivery guarantee offered for jobs. Some use cases would
> be fine with "at-least-once" or "maybe" delivery, whereas others will
> need "exactly-once" delivery.
>
> Ideally, each use case could select the level of guarantee it needs.
>
> @Ian, the MoM API at SLING-5646 and its existing ActiveMQ impl at  do not
> seem to clearly define the delivery guarantees. Does that mean the jobs are
> meant to be delivered with "maybe" guarantees (and have the application
> layer handle the ack+re-send)?
>

The MoM API leaves that aspect to configuration of the MoM provider on a
Queue by Queue basis.

For the Queues used by the SLING-5646 Jobs API, the style of delivery is
still abstract. If you wanted to submit a Job that was guaranteed to be
processed once and once only, then you would send it to a queue configured
to deliver once and once only. If you wanted to submit a job with "maybe"
or "at-least-once" semantics, then you would configure the queue
appropriately.

There is nothing in the SLING-5646 Jobs implementation that dictates which
style of queue has to be used, although there is an expectation that for
standard Jobs processing a once and once only queue would be used.

The MoM API could be used by things other than Jobs which is why it became
an API in the PoC.

The default implementation of the MoM API in the PoC is based on ActiveMQ.
IIUC it is possible to set up queues via configuration, although I have
not verified that. If it is not, some code in the PoC MoM API impl bundle
might need to look up what type of queue is required, and select a different
code path.

Best Regards
Ian


>
> Regards,
>
> Timothee
>


Re: [RT] A Batch jobs API to complement the existing Jobs API

2016-04-18 Thread Timothée Maret
Hi,


> Is there a reason that you think the implementation under SLING-5646 won't
> support the batch use case ?
>
>
>
One aspect that may differentiate asset processing from the general job
use case is the delivery guarantee offered for jobs. Some use cases would
be fine with "at-least-once" or "maybe" delivery, whereas others will
need "exactly-once" delivery.

Ideally, each use case could select the level of guarantee it needs.

@Ian, the MoM API at SLING-5646 and its existing ActiveMQ impl at  do not
seem to clearly define the delivery guarantees. Does that mean the jobs are
meant to be delivered with "maybe" guarantees (and have the application
layer handle the ack+re-send)?

Regards,

Timothee


Re: [RT] A Batch jobs API to complement the existing Jobs API

2016-04-18 Thread Ian Boston
Hi,

On 18 April 2016 at 14:51, Bertrand Delacretaz 
wrote:

> Hi,
>
> I chatted with Ian last week about making our jobs engine more
> scalable (based on his SLING-5646 work) and I think a dead simple API
> for batch jobs might be useful, alongside our existing
> org.apache.sling.event Jobs API.
>
> Our jobs API provides fine-grained synchronous control on the jobs
> that it executes, like stopping jobs, querying the engine for job
> states etc.
>
> That's useful for small jobs and page approval workflows, but it's
> hard to implement in a scalable and distributed way.
>
> Heavy jobs like digital asset processing, for example, do not need
> such fine grained control.
>
> To execute such jobs in a scalable way, a "fire and almost forget"
> scenario should work fine: submit a job to process an asset, subscribe
> to an events stream about its status, check the latest status received
> after a time T and, if it's not done, consider it failed and submit it
> again. Make the jobs themselves idempotent for robustness if needed.
>
> I think this would be useful alongside our existing jobs engine, with
> an API that can be as simple as this:
>
>   interface BatchEngine {
>     /* Options can be relative priority, preferences for which node
>        executes the job, etc. */
>     JobId submit(Callable job, Map options);
>   }
>

I am not certain how a Callable will work with a distributed
implementation, as the Callable implies that an Object implementing
Callable is provided in the submit essentially binding the Job to the JVM
where it was submitted.
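
One way around that, sketched here under my own assumptions, is to submit a
serializable description of the job (a type name plus plain properties)
and let each node map the type name to the code it has installed locally.
JobDescriptorSketch, JobDescriptor and HANDLERS are hypothetical names, not
part of SLING-5646, and Java serialization merely stands in for whatever
wire format a real implementation would use:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class JobDescriptorSketch {
    /** Hypothetical wire format: a job type name plus plain string properties. */
    static final class JobDescriptor implements Serializable {
        private static final long serialVersionUID = 1L;
        final String type;
        final HashMap<String, String> properties;

        JobDescriptor(String type, Map<String, String> properties) {
            this.type = type;
            this.properties = new HashMap<>(properties);
        }
    }

    /** Each node keeps its own registry mapping a job type to the code that runs it. */
    static final Map<String, Function<Map<String, String>, String>> HANDLERS =
            new HashMap<>();

    /** Submitting side: only data crosses the wire, never a Callable instance. */
    static byte[] toBytes(JobDescriptor job) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(job);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Receiving side: rebuild the descriptor and run the locally registered handler. */
    static String executeOnReceivingNode(byte[] wire) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(wire))) {
            JobDescriptor job = (JobDescriptor) in.readObject();
            return HANDLERS.get(job.type).apply(job.properties);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HANDLERS.put("thumbnail", props -> "thumbnail for " + props.get("path"));
        Map<String, String> props = new HashMap<>();
        props.put("path", "/content/example.png");
        byte[] wire = toBytes(new JobDescriptor("thumbnail", props));
        System.out.println(executeOnReceivingNode(wire));
    }
}
```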



>
>   interface BatchEventsSource {
>     /** If restrictToSpecificJobIds is not null the last known state
>         of these jobs is resent, if available */
>     void registerBatchEventListener(BatchEventsListener bleh,
>         JobId... restrictToSpecificJobIds);
>   }
>

IIUC this is a bit of an anti-pattern with OSGi. AFAIK the Whiteboard
pattern is preferred. I recently refactored the new Jobs implementation
(SLING-5646) to eliminate listener registration in favour of whiteboard
style registration.
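
In OSGi the listener would simply be published as a service and the event
source would look up services of that type, so the source needs no
register method at all. The plain-Java sketch below only illustrates that
inversion: Registry is a made-up stand-in for the OSGi service registry,
and the listener signature is simplified to two strings instead of a
BatchEvent:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class WhiteboardSketch {
    /** Simplified listener; the proposal passes a BatchEvent instead. */
    interface BatchEventsListener {
        void onEvent(String jobId, String status);
    }

    /** Stand-in for the OSGi service registry (the "whiteboard"). */
    static final class Registry {
        private final List<BatchEventsListener> services = new CopyOnWriteArrayList<>();

        /** Publishes a listener and returns a handle that unregisters it. */
        Runnable register(BatchEventsListener listener) {
            services.add(listener);
            return () -> services.remove(listener);
        }

        List<BatchEventsListener> current() {
            return services;
        }
    }

    /** The source has no register method; it reads the whiteboard on each dispatch. */
    static final class BatchEventsSource {
        private final Registry registry;

        BatchEventsSource(Registry registry) {
            this.registry = registry;
        }

        void dispatch(String jobId, String status) {
            for (BatchEventsListener listener : registry.current()) {
                listener.onEvent(jobId, status);
            }
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        BatchEventsSource source = new BatchEventsSource(registry);
        Runnable unregister =
                registry.register((jobId, status) -> System.out.println(jobId + " " + status));
        source.dispatch("job-1", "DONE");
        unregister.run();
        source.dispatch("job-2", "DONE");   // nobody is listening any more
    }
}
```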



>
>   interface BatchEventsListener {
>     void onEvent(BatchEvent beh);
>   }
>
>   class BatchEvent {
>     JobId getJobId();
>     JobStatus getStatus();
>     String getInfo();
>   }
>
> WDYT?
>


Is there a reason that you think the implementation under SLING-5646 won't
support the batch use case ?


Best Regards
Ian



>
> We might also use an existing API if there's a good one, but I think
> we don't need more than the above.
>
> -Bertrand
>


[RT] A Batch jobs API to complement the existing Jobs API

2016-04-18 Thread Bertrand Delacretaz
Hi,

I chatted with Ian last week about making our jobs engine more
scalable (based on his SLING-5646 work) and I think a dead simple API
for batch jobs might be useful, alongside our existing
org.apache.sling.event Jobs API.

Our jobs API provides fine-grained synchronous control on the jobs
that it executes, like stopping jobs, querying the engine for job
states etc.

That's useful for small jobs and page approval workflows, but it's
hard to implement in a scalable and distributed way.

Heavy jobs like digital asset processing, for example, do not need
such fine grained control.

To execute such jobs in a scalable way, a "fire and almost forget"
scenario should work fine: submit a job to process an asset, subscribe
to an events stream about its status, check the latest status received
after a time T and, if it's not done, consider it failed and submit it
again. Make the jobs themselves idempotent for robustness if needed.
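
A rough sketch of that client-side loop, with the waiting reduced to a
comment so that it stays deterministic. All names here
(FireAndAlmostForgetSketch, Engine, onDone) are invented for the example
and the Engine interface is only a stand-in for a real batch engine:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class FireAndAlmostForgetSketch {
    enum Status { SUBMITTED, DONE }

    /** Hypothetical stand-in for the batch engine; it may silently lose a submission. */
    interface Engine {
        void submit(String jobId, String assetPath, FireAndAlmostForgetSketch client);
    }

    /** Latest status received per job id from the events stream. */
    final Map<String, Status> latestStatus = new ConcurrentHashMap<>();
    final Set<String> processedAssets = ConcurrentHashMap.newKeySet();

    /** Events stream callback. Adding to a set keeps a repeated job harmless. */
    void onDone(String jobId, String assetPath) {
        processedAssets.add(assetPath);
        latestStatus.put(jobId, Status.DONE);
    }

    /** Returns the number of submissions needed, or -1 if every attempt timed out. */
    int process(String assetPath, Engine engine, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String jobId = assetPath + "#" + attempt;
            latestStatus.put(jobId, Status.SUBMITTED);
            engine.submit(jobId, assetPath, this);
            // A real client would wait for time T here before checking.
            if (latestStatus.get(jobId) == Status.DONE) {
                return attempt;
            }
            // Not done after T: consider it failed and submit again.
        }
        return -1;
    }

    public static void main(String[] args) {
        FireAndAlmostForgetSketch client = new FireAndAlmostForgetSketch();
        int[] calls = {0};
        // This engine loses the first submission and completes the second.
        Engine flaky = (jobId, asset, c) -> {
            if (++calls[0] > 1) {
                c.onDone(jobId, asset);
            }
        };
        System.out.println(client.process("/content/example.png", flaky, 3));
    }
}
```

Because a job that was only slow, not lost, may still complete after being
resubmitted, the job body has to tolerate running twice.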

I think this would be useful alongside our existing jobs engine, with
an API that can be as simple as this:

  interface BatchEngine {
    /* Options can be relative priority, preferences for which node
       executes the job, etc. */
    JobId submit(Callable job, Map options);
  }

  interface BatchEventsSource {
    /** If restrictToSpecificJobIds is not null the last known state
        of these jobs is resent, if available */
    void registerBatchEventListener(BatchEventsListener bleh,
        JobId... restrictToSpecificJobIds);
  }

  interface BatchEventsListener {
    void onEvent(BatchEvent beh);
  }

  class BatchEvent {
    JobId getJobId();
    JobStatus getStatus();
    String getInfo();
  }

WDYT?

We might also use an existing API if there's a good one, but I think
we don't need more than the above.

-Bertrand