[
https://issues.apache.org/jira/browse/JAMES-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851815#comment-17851815
]
Karsten Otto commented on JAMES-4042:
-------------------------------------
Yes, it should behave exactly like the regular Task regarding the Task API,
except it runs on a different scheduler and continues the parent flux
immediately instead of "blocking" it. IIRC I chose the type-based solution for
its minimal impact on the overall code base, but you could make this a
different task constructor or whatever, you just need to distinguish the two
when you hand them to executeTask(). I think I did not want an extra
executeAsyncTask() because it was more difficult to integrate that with the
generic task queue, or something along that line of thought.
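
To illustrate the type-based approach described above, here is a minimal sketch, not the actual James code. The `Task` and `AsyncSafeTask` interfaces and the `executeTask()` method below are simplified, hypothetical stand-ins. `CompletableFuture` from the JDK is used in place of the reactive flux. The point is only that a single entry point can tell the two kinds of task apart by type, so no separate executeAsyncTask() is needed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TaskDispatchSketch {
    /** Hypothetical stand-in for the Task API; not the real James interface. */
    interface Task {
        String run();
    }

    /** Marker sub-type: same API as Task, but declared safe to run in parallel. */
    interface AsyncSafeTask extends Task {
    }

    // Separate scheduler used only for tasks marked as async-safe (assumption).
    private final ExecutorService asyncScheduler = Executors.newCachedThreadPool();

    /**
     * Single entry point for both kinds of task. A regular task runs to
     * completion before this method returns; an AsyncSafeTask is handed to the
     * other scheduler and the caller continues immediately.
     */
    CompletableFuture<String> executeTask(Task task) {
        if (task instanceof AsyncSafeTask) {
            return CompletableFuture.supplyAsync(task::run, asyncScheduler);
        }
        return CompletableFuture.completedFuture(task.run());
    }

    void shutdown() {
        asyncScheduler.shutdown();
    }

    public static void main(String[] args) {
        TaskDispatchSketch worker = new TaskDispatchSketch();
        Task regular = () -> "regular task done";
        AsyncSafeTask async = () -> "async-safe task done";

        System.out.println(worker.executeTask(regular).join());
        System.out.println(worker.executeTask(async).join());
        worker.shutdown();
    }
}
```

A different constructor or a flag on the task would work just as well; the marker type simply keeps the queue and the rest of the Task API unchanged.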
The main issue with an AsyncSafeTask is that it needs to be reasonably "safe"
(hence the name) to execute in parallel with other tasks, or even with multiple
instances of itself. That is, if you want to access some shared resource with a
non-idempotent operation, you must be prepared for concurrent effects
_everywhere_ that resource is used. I.e. you might need to introduce
error-prone locking all over the code base, which I would like to avoid. This
was not needed before since the task queue effectively serialized task
execution. AsyncSafeTask works for something like the ExpireMailboxTask since,
once a mail is deleted, it is gone, and deleting it again has no further
effect, so it is "safe". I guess it might be difficult to determine "safe"-ness
for other more complicated tasks, and running them asynchronously might
introduce subtle and hard to diagnose concurrency bugs.
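
The "safe"-ness argument can be sketched roughly as follows. This is an illustrative assumption, not James code: the in-memory `mailbox` set, the `quotaUsed` counter and both expire methods are hypothetical. Running the idempotent delete twice in parallel leaves the same state as running it once, while the variant that also decrements a shared counter does not.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotenceSketch {
    // Hypothetical shared resources: a mailbox of mail ids and a usage counter.
    final Set<String> mailbox = ConcurrentHashMap.newKeySet();
    final AtomicInteger quotaUsed = new AtomicInteger();

    IdempotenceSketch(List<String> mailIds) {
        mailbox.addAll(mailIds);
        quotaUsed.set(mailIds.size());
    }

    /** Idempotent: removing an id that is already gone changes nothing. */
    void expire(List<String> expiredIds) {
        expiredIds.forEach(mailbox::remove);
    }

    /** Not idempotent: each call decrements the counter again. */
    void expireAndAdjustQuota(List<String> expiredIds) {
        expiredIds.forEach(id -> {
            mailbox.remove(id);
            quotaUsed.decrementAndGet();
        });
    }

    /** Simulates two instances of the same task running concurrently. */
    static void runTwiceInParallel(Runnable task) {
        CompletableFuture.allOf(
                CompletableFuture.runAsync(task),
                CompletableFuture.runAsync(task)).join();
    }

    public static void main(String[] args) {
        IdempotenceSketch safe = new IdempotenceSketch(List.of("m1", "m2", "m3"));
        runTwiceInParallel(() -> safe.expire(List.of("m1", "m2")));
        System.out.println("mailbox after safe expire: " + safe.mailbox);

        IdempotenceSketch unsafe = new IdempotenceSketch(List.of("m1"));
        runTwiceInParallel(() -> unsafe.expireAndAdjustQuota(List.of("m1")));
        System.out.println("quota after unsafe expire: " + unsafe.quotaUsed.get());
    }
}
```

In the second case the counter ends up below zero even though only one mail existed. Fixing that would require guarding every place the counter is touched, which is the kind of locking I would like to avoid.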
> Task manager: issues with long tasks
> ------------------------------------
>
> Key: JAMES-4042
> URL: https://issues.apache.org/jira/browse/JAMES-4042
> Project: James Server
> Issue Type: Improvement
> Components: rabbitmq, task
> Affects Versions: master, 3.8.2
> Reporter: Benoit Tellier
> Priority: Major
>
> See JAMES-3955
> Today, tasks by default obey a 1 day consumer timeout (configurable!)
> Yet, we might have some tasks like deduplication that might take longer than
> a day to complete...
> As of today this means such a task would complete but very likely crash the
> consumer.
> Likely, getting such long jobs in the first place might not be the best idea.
> Things like deduplication could be packaged in separate binaries that one
> could run independently of James thus avoiding the issue altogether... We
> could allow this as an alternative way to run such long tasks.
> Moreover, I think if needed we could write a Proof Of Concept of running the
> GC on top of something like Apache Spark to leverage parallel computation.