[jira] [Commented] (JAMES-4042) Task manager: issues with long tasks

Karsten Otto (Jira) Mon, 03 Jun 2024 14:14:04 -0700


    [ 
https://issues.apache.org/jira/browse/JAMES-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851815#comment-17851815
 ]


Karsten Otto commented on JAMES-4042:
-------------------------------------

Yes, it should behave exactly like the regular Task regarding the Task API, 
except it runs on a different scheduler and continues the parent flux 
immediately instead of "blocking" it. IIRC I chose the type-based solution for 
its minimal impact on the overall code base, but you could make this a 
different task constructor or whatever, you just need to distinguish the two 
when you hand them to executeTask(). I think I did not want an extra 
executeAsyncTask() because it was more difficult to integrate that with the 
generic task queue, or something along that line of thought.
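
A minimal sketch of the type-based dispatch described above, in plain Java. 
It is not the James implementation: the Task and AsyncSafeTask interfaces 
here are hypothetical stand-ins, and two ExecutorService instances are assumed 
in place of the reactive schedulers. It only shows a single executeTask() 
entry point picking the execution mode from the task's type.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TaskDispatchSketch {
    /** Hypothetical stand-in for a task; not the James Task interface. */
    interface Task {
        void run() throws Exception;
    }

    /** Marker subtype: the task declares itself safe to run concurrently. */
    interface AsyncSafeTask extends Task {
    }

    // Single worker thread: regular tasks are effectively serialized.
    private final ExecutorService serialWorker = Executors.newSingleThreadExecutor();
    // Separate scheduler (an assumption) for tasks that declared themselves async-safe.
    private final ExecutorService asyncWorker = Executors.newCachedThreadPool();

    /** One entry point; the task's type decides how it is run. */
    void executeTask(Task task) throws Exception {
        if (task instanceof AsyncSafeTask) {
            // Hand off and return immediately: the caller is not blocked.
            asyncWorker.submit(() -> { task.run(); return null; });
        } else {
            // Wait for completion, so the next task cannot start early.
            serialWorker.submit(() -> { task.run(); return null; }).get();
        }
    }

    void shutdown() {
        serialWorker.shutdown();
        asyncWorker.shutdown();
    }

    /** Submits a long async-safe task, then a regular one, and records the order of events. */
    static List<String> demo() throws Exception {
        TaskDispatchSketch manager = new TaskDispatchSketch();
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch asyncDone = new CountDownLatch(1);

        AsyncSafeTask longTask = () -> {
            release.await(); // simulates long-running work
            events.add("async finished");
            asyncDone.countDown();
        };
        Task regularTask = () -> events.add("regular finished");

        manager.executeTask(longTask);    // returns while longTask is still waiting
        manager.executeTask(regularTask); // blocks until regularTask is done
        events.add("both submitted");
        release.countDown();              // let the long task finish
        asyncDone.await(5, TimeUnit.SECONDS);
        manager.shutdown();
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

In this sketch the regular task completes while the async-safe one is still 
running, which is the "continue immediately instead of blocking" behaviour. A 
separate executeAsyncTask() would work just as well here; the type check just 
keeps a single call site in front of the queue.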

The main issue with an AsyncSafeTask is that it needs to be reasonably "safe" 
(hence the name) to execute in parallel with other tasks, or even with multiple 
instances of itself. That is, if you want to access some shared resource with a 
non-idempotent operation, you must be prepared for concurrent effects 
_everywhere_ that resource is used. I.e. you might need to introduce 
error-prone locking all over the code base, which I would like to avoid. This 
was not needed before since the task queue effectively serialized task 
execution. AsyncSafeTask works for something like the ExpireMailboxTask since, 
once a mail is deleted, it is gone, and deleting it again has the same effect 
(none), so it is "safe". I guess it might be difficult to determine "safe"-ness 
for other more complicated tasks, and running them asynchronously might 
introduce subtle and hard to diagnose concurrency bugs.
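
To illustrate the "safe"-ness argument, here is a small sketch under stated 
assumptions. The mailbox set, the usedBytes quota counter, and both methods 
are hypothetical and not taken from James. It runs the same operation twice in 
parallel, like two overlapping instances of one task: the plain delete ends in 
the same state as a single run, while the delete that also releases quota 
releases it twice.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class SafenessSketch {
    // Hypothetical shared state: message ids and a quota counter in bytes.
    final Set<String> mailbox = ConcurrentHashMap.newKeySet();
    final AtomicLong usedBytes = new AtomicLong();

    /** Idempotent: removing an id that is already gone changes nothing. */
    void deleteMail(String id) {
        mailbox.remove(id);
    }

    /** Not idempotent: every call lowers the counter, even if the mail is already gone. */
    void deleteMailAndReleaseQuota(String id, long size) {
        mailbox.remove(id);
        usedBytes.addAndGet(-size);
    }

    /** Runs the same action on two threads, like two overlapping task instances. */
    static void runTwiceInParallel(Runnable action) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(action);
        pool.submit(action);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }

    /** Returns { mails left after the double delete, bytes counted after the double release }. */
    static long[] demo() throws InterruptedException {
        SafenessSketch safe = new SafenessSketch();
        safe.mailbox.addAll(List.of("m1", "m2"));
        runTwiceInParallel(() -> safe.deleteMail("m1"));

        SafenessSketch unsafe = new SafenessSketch();
        unsafe.mailbox.addAll(List.of("m1", "m2"));
        unsafe.usedBytes.set(200); // two mails of 100 bytes each
        runTwiceInParallel(() -> unsafe.deleteMailAndReleaseQuota("m1", 100));

        return new long[] { safe.mailbox.size(), unsafe.usedBytes.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] result = demo();
        System.out.println("mails left: " + result[0] + ", bytes counted: " + result[1]);
    }
}
```

In the second case the counter ends at 0 although m2 still occupies 100 
bytes. Guarding the release with the boolean returned by Set.remove() would 
fix this particular sketch, but it is exactly the kind of per-resource care 
that every place touching shared state would then need.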

> Task manager: issues with long tasks
> ------------------------------------
>
>                 Key: JAMES-4042
>                 URL: https://issues.apache.org/jira/browse/JAMES-4042
>             Project: James Server
>          Issue Type: Improvement
>          Components: rabbitmq, task
>    Affects Versions: master, 3.8.2
>            Reporter: Benoit Tellier
>            Priority: Major
>
> See JAMES-3955
> Today, tasks by default obey a 1-day consumer timeout (configurable!)
> Yet, we might have some tasks like deduplication that might take longer than 
> a day to complete...
> As of today this means such a task would complete but very likely crash the 
> consumer.
> Likely, getting such long jobs in the first place might not be the best idea.
> Things like deduplication could be packaged in separate binaries that one 
> could run independently of James thus avoiding the issue altogether... We 
> could allow this as an alternative way to run such long tasks.
> Moreover, I think if needed we could write a Proof Of Concept of running the 
> GC on top of something like Apache Spark to leverage parallel computation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org
