vicennial commented on PR #50334:
URL: https://github.com/apache/spark/pull/50334#issuecomment-2750903763

   Thanks for identifying this issue, @wbo4958! While your PR resolves the 
executor-side problem, I believe there's an opportunity to refine the approach 
so that it covers both executor-side operations (e.g., typical UDFs) and 
driver-side operations (e.g., custom data sources) in one unified solution.
   
   The high-level proposal: In the 
[ArtifactManager](https://github.com/apache/spark/blob/5db31aec33c53aaa7c814f33ec84e6ba66fc193b/sql/core/src/main/scala/org/apache/spark/sql/artifact/ArtifactManager.scala#L56),
 add an initialisation step that would **copy JARs** from the underlying 
`session.sparkContext.addedJars(DEFAULT_SESSION_ID)` into 
`session.sparkContext.addedJars(session.sessionUUID)`.
   Advantages:
   - Enhanced session isolation
      - Global JARs are copied once during initialisation, so any subsequent 
changes to the default session's JARs do not affect the session-specific context.
      - This isolation is particularly beneficial in standalone clusters where 
Spark Connect sessions coexist with traditional sessions (i.e., those 
interacting directly with SparkContext). 
   - Since the copied global JARs behave as session-scoped JARs, no extra 
modifications to the executor’s code or classloader are required.
   
   
   The downside is that duplicating the global JARs for each new Spark Connect 
session will naturally consume more resources. We could mitigate this by adding 
a Spark configuration option that toggles whether global JARs are inherited by 
a Spark Connect session.
   
   WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

