gianm opened a new pull request, #12901:
URL: https://github.com/apache/druid/pull/12901

   It was possible for `manageInternal` to relaunch a task while it was
   being cleaned up, due to a race that happens when `notifyStatus` is
   called to clean up a successful task:
   
   1) In a critical section, `notifyStatus` removes the task from `tasks`.
   2) Outside a critical section, `notifyStatus` calls `taskRunner.shutdown`
      to let the task runner know it can clear out its data structures.
   3) In a critical section, `syncFromStorage` adds the task back to `tasks`,
      because it is still present in metadata storage.
   4) In a critical section, `manageInternalCritical` notices that the task
      is in `tasks` and is not running in the `taskRunner`, so it launches
      it again.
   5) In a (different) critical section, `notifyStatus` updates the metadata
      store to set the task status to SUCCESS.
   6) The task continues running even though it should not be.
   
   The possibility for this race was introduced in #12099, which shrunk
   the critical section in `notifyStatus`. Prior to that patch, a single
   critical section encompassed (1), (2), and (5), so the ordering above
   was not possible.
   
   This patch does the following:
   
   1) Fixes the race by adding a `recentlyCompletedTasks` set that prevents
      the main management loop from doing anything with tasks that are
      currently being cleaned up.
   2) Switches the order of the critical sections in `notifyStatus`, so
      metadata store updates happen first. This is useful in case of
      server failures: it ensures that if the Overlord fails in the midst
      of `notifyStatus`, then completed-task statuses are still available in
      ZK or on MMs for the next Overlord. (Those are cleaned up by
      `taskRunner.shutdown`, which formerly ran first.) This isn't related
      to the race described above, but is fixed opportunistically as part
      of the same patch.
   3) Changes the `tasks` list to a map. Many operations require retrieval
      or removal of individual tasks; those are now O(1) instead of O(N)
      in the number of running tasks.
   4) Changes various log messages to use task ID instead of full task
      payload, to make the logs more readable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org

Reply via email to