This is timely, since I just ran into this issue myself while trying to write a test to reproduce a bug related to speculative execution (I wanted to configure a job so that the first attempt to compute a partition would run slowly, so that a second, faster speculative copy would be launched).
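For anyone attempting something similar, the speculation knobs look roughly like this (a minimal sketch; the threshold values are arbitrary, and since speculation is skipped under the plain "local" master I'm assuming a local-cluster master here):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch of a config that triggers speculation aggressively in a test.
    // Thresholds are illustrative, not recommended production values.
    val conf = new SparkConf()
      .setMaster("local-cluster[2,1,512]")  // 2 workers, 1 core, 512 MB each
      .setAppName("speculation-repro")
      .set("spark.speculation", "true")
      .set("spark.speculation.interval", "100")    // check for stragglers every 100 ms
      .set("spark.speculation.multiplier", "1.5")  // tasks 1.5x slower than the median
      .set("spark.speculation.quantile", "0.5")    // once half the tasks have finished
    val sc = new SparkContext(conf)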
I've opened a PR with a proposed fix: https://github.com/apache/spark/pull/3849

On Tue, Dec 30, 2014 at 12:24 PM, Cody Koeninger <c...@koeninger.org> wrote:

> It looks like taskContext.attemptId doesn't mean what one thinks it might
> mean, based on
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/Get-attempt-number-in-a-closure-td8853.html
>
> and the unresolved
>
> https://issues.apache.org/jira/browse/SPARK-4014
>
> Is there any alternative way to tell if compute is being called from a
> retry? Barring that, does anyone have any tips on how it might be possible
> to get the attempt count propagated to executors?
>
> It would be extremely useful for the kafka rdd preferred location
> awareness.
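With the attemptNumber field proposed in that PR (0 on the first attempt, incremented for retries and speculative copies), detecting a retry inside compute could look like this. This is a minimal sketch, not the final API; RetryAwareRDD is a made-up wrapper for illustration:

    import scala.reflect.ClassTag

    import org.apache.spark.{Partition, TaskContext}
    import org.apache.spark.rdd.RDD

    // Illustrative wrapper that checks whether compute() is running as a
    // retry, assuming the attemptNumber field added by the PR above.
    class RetryAwareRDD[T: ClassTag](prev: RDD[T]) extends RDD[T](prev) {

      override protected def getPartitions: Array[Partition] =
        firstParent[T].partitions

      override def compute(split: Partition, context: TaskContext): Iterator[T] = {
        if (context.attemptNumber() > 0) {
          // This is a retry (or a speculative duplicate), so e.g. a Kafka
          // RDD could fall back to a non-preferred broker here.
          logInfo(s"Recomputing partition ${split.index}, " +
            s"attempt ${context.attemptNumber()}")
        }
        firstParent[T].iterator(split, context)
      }
    }

Something like new RetryAwareRDD(kafkaRdd) would then let the Kafka RDD distinguish first attempts from recomputations without any extra plumbing to the executors.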