Currently Spark uses the current time in milliseconds as the app ID. Is there a
way to pass an app ID to a Spark job, so that it uses the provided ID
instead of generating one from the timestamp?

Let's take the following scenario: I have a system application which
schedules Spark jobs and records the metadata for each job (say job
params, cores, etc.). In this system application, I want to link every job
with its corresponding UI (history server). The only way I can do this is
if I have the app ID of that job stored in the system application. And the
only way one can get the app ID is via the
SparkContext.getApplicationId() function - which has to be run from
inside the job. So, this makes it difficult to convey this piece of
information from Spark to a system outside Spark.

Thanks,
Amit Shanker
