Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/3486#issuecomment-66395938
  
    It seems there are two separate issues that we are discussing here:
    
    (1) How to propagate the link from the AM to the driver
    (2) How to propagate the link from the driver to the UI
    
    For (1), your current design sets an environment variable on the executor, 
and then the executor passes the link on when it registers with the driver. 
While this approach certainly works, I am not in favor of it because we are 
passing the link to the executor unnecessarily. It involves adding 
highly specific fields to the `RegisterExecutor` message, which currently only 
conveys crucial information for scheduling tasks on the executor (i.e. the ID, 
address, and number of cores). The other thing is that Spark in general has 
been trying to move away from passing information through environment variables 
because we need to worry about potential namespace collisions there. Also, 
users may start setting internal Spark variables themselves even though they're 
not documented. This happens surprisingly often.
    
    For (2), I agree that adding a notification for when an executor is 
registered may be useful, but we already have a callback 
(`onBlockManagerAdded`) that essentially serves that purpose. My concern is 
that if we decide to expose executor info differently in the future, we can't 
change these events easily without breaking binary compatibility. This has 
bitten us a few times in the past; in general the maintenance burden of these 
`SparkListener` classes is pretty high since we need to worry about JSON 
compatibility in event logs as well. My high-level point is that if there is 
an alternative that lets us avoid changing what is exposed to the user, then 
we should implement that alternative, since we can always revise it in the 
future. The converse is not true, however: once something is exposed, we 
cannot easily take it back.
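    
    To illustrate the existing callback mentioned above, here is a minimal 
sketch of a listener that overrides `onBlockManagerAdded`. The class name 
`ExecutorSeenListener` is hypothetical, and the sketch assumes Spark is on the 
classpath; it is not what the PR implements.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerBlockManagerAdded}

// Hypothetical listener name, used only for illustration.
class ExecutorSeenListener extends SparkListener {
  override def onBlockManagerAdded(blockManagerAdded: SparkListenerBlockManagerAdded): Unit = {
    val id = blockManagerAdded.blockManagerId
    // Assumption: the driver's own block manager may also trigger this event,
    // so a real listener would likely need to filter on the executor ID.
    println(s"Block manager added: executor=${id.executorId} host=${id.host}")
  }
}
```

    Such a listener could be registered with 
`sc.addSparkListener(new ExecutorSeenListener)`, where `sc` is an existing 
`SparkContext`.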
    
    Does that make sense?


