[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304825#comment-15304825
 ] 

Ravi Prakash commented on HADOOP-13070:
---------------------------------------

Hi Sangjin! Thanks for taking this up. I look forward to all your improvements.

{{ApplicationClassLoader}} seems to be used only by MR. I grepped the Tez and 
Spark sources and didn't find any uses. Even if we were to do this only for 
MR, it would be incredibly valuable. I feel it would also set a precedent / 
pattern that other frameworks could then leverage.

If we were to focus on MR, do you know which dependencies most commonly 
conflict? One alternative approach would be to start two JVMs for each MR 
task: an MR-framework JVM and an MR-task JVM. We would do all 
MR-framework-specific work in the MR-framework JVM, send raw MapReduce input 
key-value pairs to the MR-task JVM over a socket, and read output key-value 
pairs back from it over a socket. The MR-specific code running in the MR-task 
JVM would then be minimal: it only needs to read from the socket and call the 
user code. I know protobuf (required for serialization / deserialization) is 
often the conflicting library, so this approach would not help with protobuf 
conflicts. (We could still shade this minimal set of libraries, although I 
personally dislike shading a lot.)
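
A minimal sketch of the two-JVM idea above, under loud assumptions: every name 
here (TwoJvmSketch, serveTask, runFramework, demo) is hypothetical and none of 
it is existing Hadoop API. Two threads in one process stand in for the two 
JVMs, keys and values are plain strings, the user code emits exactly one output 
pair per input pair, and the wire format uses only java.io so that the task 
side needs no third-party serialization library.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

class TwoJvmSketch {

    /** MR-task side: read pairs from the socket, call user code, write results. */
    static void serveTask(Socket socket,
                          BiFunction<String, String, String[]> userMap)
            throws IOException {
        try (DataInputStream in = new DataInputStream(socket.getInputStream());
             DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
            // A leading boolean says whether another record follows.
            while (in.readBoolean()) {
                String key = in.readUTF();
                String value = in.readUTF();
                String[] result = userMap.apply(key, value); // the only user-code call
                out.writeUTF(result[0]);
                out.writeUTF(result[1]);
                out.flush();
            }
        }
    }

    /** MR-framework side: send each input pair, read back one output pair. */
    static List<String[]> runFramework(int port, List<String[]> input)
            throws IOException {
        List<String[]> output = new ArrayList<>();
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             DataOutputStream out = new DataOutputStream(socket.getOutputStream());
             DataInputStream in = new DataInputStream(socket.getInputStream())) {
            for (String[] record : input) {
                out.writeBoolean(true);
                out.writeUTF(record[0]);
                out.writeUTF(record[1]);
                out.flush();
                output.add(new String[] {in.readUTF(), in.readUTF()});
            }
            out.writeBoolean(false); // end of input
            out.flush();
        }
        return output;
    }

    /** Wires both sides together; a thread stands in for the MR-task JVM. */
    static List<String[]> demo(List<String[]> input) throws Exception {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread task = new Thread(() -> {
                try (Socket s = server.accept()) {
                    // Stand-in user code: swap key and value.
                    serveTask(s, (k, v) -> new String[] {v, k});
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            task.start();
            List<String[]> out = runFramework(server.getLocalPort(), input);
            task.join();
            return out;
        }
    }

    public static void main(String[] args) throws Exception {
        List<String[]> input =
            Arrays.asList(new String[] {"a", "1"}, new String[] {"b", "2"});
        for (String[] pair : demo(input)) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

In a real split, the task side would be launched as a separate process with 
only the user's classpath plus this small reader, and a map call could emit 
zero or many pairs, so the lock-step request/response above would need an 
explicit end-of-output marker per record.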

I sure do wish Java 9 had something that would make this easier, but I didn't 
see anything.

> classloading isolation improvements for cleaner and stricter dependencies
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-13070
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13070
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>            Priority: Critical
>         Attachments: classloading-improvements-ideas-v.3.pdf, 
> classloading-improvements-ideas.pdf, classloading-improvements-ideas.v.2.pdf
>
>
> Related to HADOOP-11656, we would like to make a number of improvements in 
> terms of classloading isolation so that user-code can run safely without 
> worrying about dependency collisions with the Hadoop dependencies.
> By the same token, it should raise the quality of the user code and its 
> specified classpath so that users get clear signals if they specify incorrect 
> classpaths.
> This will contain a proposal that will include several improvements, some of 
> which may not be backward compatible. As such, it should be targeted to the 
> next major revision of Hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

