Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/11012#issuecomment-178793683

Quoting from the JIRA ticket:

> In our particular case, this behavior manifests since the currentTaskAttemptId() method is returning -1 for each Spark receiver task. This in and of itself could be a bug and is something I'm going to look into.

I think this is definitely a bug. I believe the intent was that we'd return a dummy `taskAttemptId` on the driver, but that any code running in a task would have a valid `TaskContext` thread-local and thus a valid task attempt id. `TaskContext` isn't an inheritable thread-local, though, so to address this we'll have to explicitly propagate it from the top-level task thread to the receiver threads.

Even if we did fix the `TaskContext` propagation issue, the fix in this patch would still be necessary, because we'd still have to be properly thread-safe in case a multi-threaded receiver is storing blocks concurrently.

Intuitively, the idea of adding extra synchronization here seems right to me, although I'd like to take a closer look at the changes to see whether they introduce performance problems: my guess is that the under-synchronization was motivated by a desire to avoid holding monitors/locks during expensive operations.

@zsxwing, do you know why the streaming longevity / memory leak tests didn't catch this leak?
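To illustrate the propagation point above: a plain `ThreadLocal` (which is what the comment says `TaskContext` uses, as opposed to an `InheritableThreadLocal`) is not visible from threads spawned by the task thread, so the value must be captured in the parent and re-set in the child. This is a minimal generic sketch of that pattern, not Spark's actual `TaskContext` API; the `taskAttemptId` holder here is a hypothetical stand-in.

```java
public class ThreadLocalPropagation {
    // Hypothetical stand-in for a task-scoped thread-local like TaskContext.
    // A plain ThreadLocal is NOT inherited by child threads.
    static final ThreadLocal<Long> taskAttemptId = new ThreadLocal<>();

    public static void main(String[] args) throws InterruptedException {
        taskAttemptId.set(42L); // set on the "task" (parent) thread

        // Without propagation, a child thread sees no value at all.
        final Long[] seenWithout = new Long[1];
        Thread unpropagated = new Thread(() -> seenWithout[0] = taskAttemptId.get());
        unpropagated.start();
        unpropagated.join();
        System.out.println("child without propagation: " + seenWithout[0]); // null

        // Explicit propagation: capture the value in the parent thread,
        // then re-set it at the top of the child thread's run() body.
        final Long captured = taskAttemptId.get();
        final Long[] seenWith = new Long[1];
        Thread propagated = new Thread(() -> {
            taskAttemptId.set(captured);
            seenWith[0] = taskAttemptId.get();
        });
        propagated.start();
        propagated.join();
        System.out.println("child with propagation: " + seenWith[0]); // 42
    }
}
```

The same capture-and-reset step would need to happen wherever a receiver spawns worker threads, so that code running on those threads sees a valid task attempt id instead of the driver-style dummy value.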