Re: [RT] A Batch jobs API to complement the existing Jobs API
Hi,

On 19/04/16 14:11, "Bertrand Delacretaz" wrote:
> Ian mentioned to me that some queuing systems can play this role by
> having a two-phase "take message from queue" mechanism IIUC, where a
> job executor has to confirm success before a message is considered
> consumed.

Yes, that's a standard JMS feature: you can either opt to use JMS transactions or the client-acknowledge mode (the latter being a sort of simplified 'receive transaction').

Cheers,
Stefan
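The two-phase consume that Stefan maps to JMS client-acknowledge mode can be sketched independently of any broker. This is a minimal in-memory illustration, not real JMS: a received message stays "in flight" until the consumer acknowledges it, and recover() redelivers unacknowledged messages, which is roughly the behaviour a JMS session gives you in CLIENT_ACKNOWLEDGE mode.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch (not real JMS) of client-acknowledge semantics:
// a message taken from the queue is only gone for good once the
// consumer calls acknowledge(); recover() redelivers in-flight ones.
class AckQueue<T> {
    private final Deque<T> pending = new ArrayDeque<>();
    private final Deque<T> inFlight = new ArrayDeque<>();

    void send(T msg) { pending.addLast(msg); }

    /** Phase one of the two-phase "take": the message moves to
        in-flight but is not considered consumed yet. */
    T receive() {
        T msg = pending.pollFirst();
        if (msg != null) inFlight.addLast(msg);
        return msg;
    }

    /** Phase two: the job executor confirms success. */
    void acknowledge(T msg) { inFlight.remove(msg); }

    /** Simulates broker redelivery after a consumer failure. */
    void recover() {
        while (!inFlight.isEmpty()) pending.addFirst(inFlight.pollLast());
    }

    int depth() { return pending.size() + inFlight.size(); }
}
```

A consumer that crashes before calling acknowledge() leaves its message in flight; after recover() the same message is delivered again, which is why job executions behind such a queue must tolerate at-least-once semantics.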
Re: [RT] A Batch jobs API to complement the existing Jobs API
Hi,

On Mon, Apr 18, 2016 at 4:20 PM, Timothée Maret wrote:
> ...One aspect that may differentiate asset processing from general job
> use cases is the guarantees offered for job delivery. Some use cases
> would be fine with "at-least-once" or "maybe" delivery whereas some use
> cases will need "exactly-once" delivery

It's not only about delivering the job submission message; in some cases it's the whole job execution that needs to be exactly-once, for example.

Designing (or configuring, if using the SLING-5646 stuff) a batch job engine with relatively relaxed requirements about message delivery (like at-least-once for the job submission message) might make it much simpler to implement in a scalable way.

Of course there are some cases where exactly-once *execution* of batch jobs is required, but then the whole execution chain should be considered, maybe using a distinct distributed consensus service to coordinate it.

Ian mentioned to me that some queuing systems can play this role by having a two-phase "take message from queue" mechanism IIUC, where a job executor has to confirm success before a message is considered consumed.

My point is that in some cases the whole job execution chain has to be considered for exactly-once or at-least-once semantics.

-Bertrand
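Bertrand's point that relaxed delivery plus idempotent jobs can approximate exactly-once *effects* can be made concrete with a small sketch (all names made up, nothing here is Sling API): a processor that de-duplicates on job id, so a redelivered submission message is a harmless no-op.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: with at-least-once delivery of the submission
// message, exactly-once *effects* can be approximated by making the
// job processor idempotent, here by de-duplicating on job id.
class IdempotentProcessor {
    private final Set<String> completed = new HashSet<>();
    private int executions = 0;

    /** Returns true if the job body actually ran this time. */
    boolean process(String jobId) {
        if (!completed.add(jobId)) {
            return false; // duplicate delivery, skip silently
        }
        executions++;     // the real job work would happen here
        return true;
    }

    int executionCount() { return executions; }
}
```

In a distributed deployment the `completed` set itself would need shared, durable storage, which is exactly where Bertrand's remark about a distributed consensus service to coordinate the whole execution chain comes in.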
Re: [RT] A Batch jobs API to complement the existing Jobs API
Hi,

On Mon, Apr 18, 2016 at 4:04 PM, Ian Boston wrote:
> ...I am not certain how a Callable will work with a distributed
> implementation,...

Ok, what shape a batch job takes is not very important at this stage, I get your point.

...

>> void registerBatchEventListener(BatchEventsListener bleh, JobId
>> ... restrictToSpecificJobIds);
>> }
>
> IIUC this is a bit of an anti-pattern with OSGi. AFAIK the Whiteboard
> pattern is preferred...

Makes sense. And it makes the API even simpler.

> ...Is there a reason that you think the implementation under SLING-5646
> won't support the batch use case ?...

I suppose it would work, like this:

1. To submit a job, send it to a Queue that's appropriately configured
2. Subscribe to a Topic to receive events about the job's execution
3. All good if a DONE message is received on that topic
4. If an ERROR message is received, or on timeout, act accordingly

That certainly works but requires some non-obvious conventions, as opposed to an API like the one I suggested, which is narrower and more self-explanatory. But maybe we can leave those kinds of details to users of a job execution system based on SLING-5646.

-Bertrand
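The four-step convention Bertrand describes can be sketched like this (all names hypothetical, and the "topic" is reduced to a map holding the last event seen per job, nothing here is SLING-5646 API): the client submits to a queue, watches a status topic, and treats anything other than DONE at the deadline as a reason to resubmit.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the submit-then-watch-topic convention;
// the status "topic" is just the last known event per job id.
class BatchClient {
    enum Status { PENDING, DONE, ERROR }

    private final Map<String, Status> lastEvent = new ConcurrentHashMap<>();

    /** Step 1: submit the job to an appropriately configured queue. */
    void submit(String jobId) { lastEvent.put(jobId, Status.PENDING); }

    /** Steps 2-3: the job executor publishes status events on the topic. */
    void onTopicEvent(String jobId, Status status) {
        lastEvent.put(jobId, status);
    }

    /** Step 4: at the timeout, anything but DONE means the caller
        should act, typically by resubmitting an idempotent job. */
    boolean needsResubmit(String jobId) {
        return lastEvent.get(jobId) != Status.DONE;
    }
}
```

The sketch also shows why Bertrand calls the conventions non-obvious: the DONE/ERROR vocabulary and the timeout policy live in the client, not in any API signature.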
Re: [RT] A Batch jobs API to complement the existing Jobs API
Hi,

On 18 April 2016 at 15:20, Timothée Maret wrote:
> Hi,
>
>> Is there a reason that you think the implementation under SLING-5646
>> won't support the batch use case ?
>
> One aspect that may differentiate asset processing from general job use
> cases is the guarantees offered for job delivery. Some use cases would
> be fine with "at-least-once" or "maybe" delivery whereas some use cases
> will need "exactly-once" delivery.
>
> Ideally, each use case could select the level of guarantee it needs.
>
> @Ian the MoM API at SLING-5646 and its existing ActiveMQ impl do not
> seem to define the delivery clearly. Does that mean the jobs are meant
> to be delivered with "maybe" guarantees (and have the application layer
> handle the ack+re-send) ?

The MoM API leaves that aspect to configuration of the MoM provider on a Queue by Queue basis. For the Queues used by the SLING-5646 Jobs API, the style of delivery is still abstract. If you wanted to submit a Job that was guaranteed to be processed once and once only, then you would send it to a queue configured to deliver once and once only. If you wanted to submit a job with "maybe" or "at-least-once" semantics, then you would configure the queue appropriately. There is nothing in the SLING-5646 Jobs implementation that dictates which style of queue has to be used, although there is an expectation that for standard Jobs processing a once-and-once-only queue would be used.

The MoM API could be used by things other than Jobs, which is why it became an API in the PoC.

The default implementation of the MoM API in the PoC is based on ActiveMQ. IIUC it is possible to configure queues via configuration, although I have not verified that. If it is not, some code in the PoC MoM API impl bundle might need to look up what type of queue is required, and select a different code path.

Best Regards
Ian

> Regards,
>
> Timothee
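Ian's point that delivery guarantees are a per-queue configuration concern, invisible to the Jobs code itself, could look roughly like this (all names hypothetical, this is not the MoM API): a registry mapping queue names to delivery styles, with once-and-once-only as the default expectation for standard Jobs processing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: delivery guarantees chosen per queue via
// configuration, with the Jobs code unaware of which style it gets.
class QueueConfigs {
    enum Delivery { MAYBE, AT_LEAST_ONCE, ONCE_AND_ONCE_ONLY }

    private final Map<String, Delivery> byQueue = new HashMap<>();

    void configure(String queueName, Delivery delivery) {
        byQueue.put(queueName, delivery);
    }

    /** Default reflects the expectation for standard Jobs processing. */
    Delivery deliveryFor(String queueName) {
        return byQueue.getOrDefault(queueName, Delivery.ONCE_AND_ONCE_ONLY);
    }
}
```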
Re: [RT] A Batch jobs API to complement the existing Jobs API
Hi,

> Is there a reason that you think the implementation under SLING-5646
> won't support the batch use case ?

One aspect that may differentiate asset processing from general job use cases is the guarantees offered for job delivery. Some use cases would be fine with "at-least-once" or "maybe" delivery whereas some use cases will need "exactly-once" delivery.

Ideally, each use case could select the level of guarantee it needs.

@Ian the MoM API at SLING-5646 and its existing ActiveMQ impl do not seem to define the delivery clearly. Does that mean the jobs are meant to be delivered with "maybe" guarantees (and have the application layer handle the ack+re-send) ?

Regards,
Timothee
Re: [RT] A Batch jobs API to complement the existing Jobs API
Hi,

On 18 April 2016 at 14:51, Bertrand Delacretaz wrote:
> Hi,
>
> I chatted with Ian last week about making our jobs engine more
> scalable (based on his SLING-5646 work) and I think a dead simple API
> for batch jobs might be useful, alongside our existing
> org.apache.sling.event Jobs API.
>
> Our jobs API provides fine-grained synchronous control over the jobs
> that it executes, like stopping jobs, querying the engine for job
> states etc.
>
> That's useful for small jobs and page approval workflows, but it's
> hard to implement in a scalable and distributed way.
>
> Heavy jobs like digital asset processing, for example, do not need
> such fine-grained control.
>
> To execute such jobs in a scalable way, a "fire and almost forget"
> scenario should work fine: submit a job to process an asset, subscribe
> to an events stream about its status, and check the latest status
> received; if after a time T it's not done, consider it failed and
> submit it again. Make the jobs themselves idempotent for robustness
> if needed.
>
> I think this would be useful alongside our existing jobs engine, with
> an API that can be as simple as this:
>
> interface BatchEngine {
>     /* Options can be relative priority, preferences for which node
>        executes the job, etc. */
>     JobId submit(Callable job, Map options);
> }

I am not certain how a Callable will work with a distributed implementation, as the Callable implies that an Object implementing Callable is provided in the submit, essentially binding the Job to the JVM where it was submitted.

> interface BatchEventsSource {
>     /** If restrictToSpecificJobIds is not null the last known state
>         of these jobs is resent, if available */
>     void registerBatchEventListener(BatchEventsListener bleh,
>         JobId... restrictToSpecificJobIds);
> }

IIUC this is a bit of an anti-pattern with OSGi. AFAIK the Whiteboard pattern is preferred. I recently refactored the new Jobs implementation (SLING-5646) to eliminate listener registration in favour of whiteboard-style registration.

> interface BatchEventsListener {
>     void onEvent(BatchEvent beh);
> }
>
> class BatchEvent {
>     JobId getJobId();
>     JobStatus getStatus();
>     String getInfo();
> }
>
> WDYT?

Is there a reason that you think the implementation under SLING-5646 won't support the batch use case ?

Best Regards
Ian

> We might also use an existing API if there's a good one, but I think
> we don't need more than the above.
>
> -Bertrand
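The Whiteboard alternative Ian mentions inverts the registration: instead of listeners calling registerBatchEventListener on the engine, listeners are published as services and the engine discovers them. Outside a real OSGi framework (where the service registry and a ServiceTracker would do this) that can be sketched as follows; all names here are hypothetical:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical non-OSGi sketch of the Whiteboard pattern: listeners
// are registered with a service registry, not with the engine, and
// the engine simply consumes whatever listeners are present.
interface BatchEventsListener {
    void onEvent(String jobId, String status);
}

class ListenerRegistry {
    // In OSGi this would be the service registry plus a ServiceTracker.
    final List<BatchEventsListener> services = new CopyOnWriteArrayList<>();
    void register(BatchEventsListener l) { services.add(l); }
}

class BatchEngineEvents {
    private final ListenerRegistry registry;
    BatchEngineEvents(ListenerRegistry registry) { this.registry = registry; }

    void dispatch(String jobId, String status) {
        for (BatchEventsListener l : registry.services) {
            l.onEvent(jobId, status); // engine never saw a register() call
        }
    }
}
```

This is also why the pattern "makes the API even simpler", as Bertrand notes: the BatchEventsSource interface disappears entirely from the public API.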
[RT] A Batch jobs API to complement the existing Jobs API
Hi,

I chatted with Ian last week about making our jobs engine more scalable (based on his SLING-5646 work) and I think a dead simple API for batch jobs might be useful, alongside our existing org.apache.sling.event Jobs API.

Our jobs API provides fine-grained synchronous control over the jobs that it executes, like stopping jobs, querying the engine for job states etc.

That's useful for small jobs and page approval workflows, but it's hard to implement in a scalable and distributed way.

Heavy jobs like digital asset processing, for example, do not need such fine-grained control.

To execute such jobs in a scalable way, a "fire and almost forget" scenario should work fine: submit a job to process an asset, subscribe to an events stream about its status, and check the latest status received; if after a time T the job is not done, consider it failed and submit it again. Make the jobs themselves idempotent for robustness if needed.

I think this would be useful alongside our existing jobs engine, with an API that can be as simple as this:

  interface BatchEngine {
      /* Options can be relative priority, preferences for which node
         executes the job, etc. */
      JobId submit(Callable job, Map options);
  }

  interface BatchEventsSource {
      /** If restrictToSpecificJobIds is not null the last known state
          of these jobs is resent, if available */
      void registerBatchEventListener(BatchEventsListener bleh,
          JobId... restrictToSpecificJobIds);
  }

  interface BatchEventsListener {
      void onEvent(BatchEvent beh);
  }

  class BatchEvent {
      JobId getJobId();
      JobStatus getStatus();
      String getInfo();
  }

WDYT?

We might also use an existing API if there's a good one, but I think we don't need more than the above.

-Bertrand
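To make the shapes in the proposal concrete, here is a toy single-JVM sketch of the BatchEngine side (JobId and JobStatus reduced to String, jobs run inline rather than queued). It is deliberately naive: holding a bare Callable like this is exactly what binds the job to the submitting JVM, the concern Ian raises elsewhere in this thread about distributed implementations.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

// Toy single-JVM sketch of the proposed API, just to make the shapes
// concrete. JobId/JobStatus are reduced to String; a real distributed
// engine could not hold a bare Callable like this.
class ToyBatchEngine {
    interface BatchEventsListener {
        void onEvent(String jobId, String status, String info);
    }

    private final AtomicLong ids = new AtomicLong();
    private final BatchEventsListener listener;

    ToyBatchEngine(BatchEventsListener listener) { this.listener = listener; }

    /** Options could carry relative priority, node preferences, etc. */
    String submit(Callable<String> job, Map<String, Object> options) {
        String jobId = "job-" + ids.incrementAndGet();
        try {
            listener.onEvent(jobId, "DONE", job.call()); // run inline for the sketch
        } catch (Exception e) {
            listener.onEvent(jobId, "ERROR", e.getMessage());
        }
        return jobId;
    }
}
```

A real implementation would of course serialize a job description to a queue and run it asynchronously; the point of the sketch is only the "fire and almost forget" contract of submit plus the event callback.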