Hi all, I'm trying to understand the lifecycle of a map function in a Spark/YARN context. My understanding is that the function is instantiated on the driver and then shipped to each executor (serialized/deserialized).
What I'd like to confirm is whether the function is initialized/loaded/deserialized once per executor (a JVM, in YARN terms) and lives as long as the executor does, or once per task (the logical unit of work). Could you please explain, or better, point me to the relevant source code or documentation? I've tried looking at Task.scala and ResultTask.scala, but I'm not familiar with Scala and couldn't find where exactly the function's lifecycle is managed. Thanks in advance, Vadim.
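To make the question concrete, here is a minimal JVM-only sketch (in Java, since I'm not fluent in Scala) of the serialize/deserialize round trip I have in mind. It does not use Spark at all; the class and variable names are my own invention. It just shows what standard Java serialization does with a function object: the driver-side instance is serialized once, and every deserialization of those bytes produces a fresh instance. My question is essentially at which point in Spark this deserialization happens, once per executor or once per task.

```java
import java.io.*;
import java.util.function.Function;

public class ClosureLifecycleSketch {
    // A map function with observable identity. Spark requires closures
    // shipped to executors to be Serializable, so this one is too.
    static class MyMapFn implements Function<Integer, Integer>, Serializable {
        public Integer apply(Integer x) { return x + 1; }
    }

    // Serialize an object to bytes, as would happen on the driver.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    // Deserialize bytes back into an object, as would happen executor-side.
    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        Function<Integer, Integer> fn = new MyMapFn(); // created "driver"-side
        byte[] payload = serialize(fn);                // bytes shipped over the wire

        // Deserialize the same payload twice, simulating two uses of the bytes.
        Function<Integer, Integer> a = (Function<Integer, Integer>) deserialize(payload);
        Function<Integer, Integer> b = (Function<Integer, Integer>) deserialize(payload);

        // Each deserialization yields a distinct instance:
        System.out.println(a == b);       // false
        System.out.println(a.apply(41));  // 42
    }
}
```

So if Spark deserializes the task payload for every task, each task would get its own fresh function instance as above; if it caches the deserialized function per executor, the instance would be shared. That is the distinction I'm trying to pin down in the source.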