[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-07-21 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388072#comment-15388072
 ] 

Sangjin Lee commented on HADOOP-13070:
--

One other aspect that needs to be addressed (and hasn't been spelled out yet) is 
resource loading. The POC here doesn't cover it.

The call patterns for resource loading are a bit more varied, as there are 3 
distinct entry points:
- {{ClassLoader.getResource()}}
- {{ClassLoader.getResourceAsStream()}}
- {{ClassLoader.getResources()}}

I also find that the existing {{ApplicationClassLoader}} implementation doesn't 
cover {{ClassLoader.getResources()}}. :)
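
To make the three entry points concrete, here is a minimal sketch of a child-first loader that also handles resources. It is not the POC or the {{ApplicationClassLoader}} code; the class name {{ChildFirstResourceLoader}} is made up for illustration. It assumes the standard JDK behavior where {{getResourceAsStream()}} goes through {{getResource()}}, so overriding {{getResource()}} and {{getResources()}} should cover all three calls.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical illustration only: looks in this loader's own URLs first,
// then falls back to the parent, for both single and multi-resource lookups.
class ChildFirstResourceLoader extends URLClassLoader {
    ChildFirstResourceLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    public URL getResource(String name) {
        // Own URLs first.
        URL url = findResource(name);
        if (url == null && getParent() != null) {
            url = getParent().getResource(name);
        }
        return url;
    }

    @Override
    public Enumeration<URL> getResources(String name) throws IOException {
        // Own matches come first, followed by whatever the parent can see.
        List<URL> result = new ArrayList<>(Collections.list(findResources(name)));
        if (getParent() != null) {
            result.addAll(Collections.list(getParent().getResources(name)));
        }
        return Collections.enumeration(result);
    }
}
```

A stricter isolation mode would presumably filter the parent's results instead of appending all of them; that policy is left out of this sketch.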

> classloading isolation improvements for cleaner and stricter dependencies
> -
>
> Key: HADOOP-13070
> URL: https://issues.apache.org/jira/browse/HADOOP-13070
> Project: Hadoop Common
> Issue Type: Improvement
> Components: util
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Priority: Critical
> Attachments: HADOOP-13070.poc.01.patch, Test.java, TestDriver.java, 
> classloading-improvements-ideas-v.3.pdf, classloading-improvements-ideas.pdf, 
> classloading-improvements-ideas.v.2.pdf, lib.jar
>
>
> Related to HADOOP-11656, we would like to make a number of improvements in 
> terms of classloading isolation so that user-code can run safely without 
> worrying about dependency collisions with the Hadoop dependencies.
> By the same token, it should raise the quality of the user code and its 
> specified classpath so that users get clear signals if they specify incorrect 
> classpaths.
> This will contain a proposal that will include several improvements, some of 
> which may not be backward compatible. As such, it should be targeted to the 
> next major revision of Hadoop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-07-11 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371456#comment-15371456
 ] 

Sean Busbey commented on HADOOP-13070:
--

that sounds like a decent first pass sanity check.




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-07-08 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368807#comment-15368807
 ] 

Sangjin Lee commented on HADOOP-13070:
--

Until JDK gives us an API for obtaining the caller class, this might be the 
best hope. One thing we could do might be to check whether the frames we're 
skipping are {{java.lang.Class}} or {{java.lang.ClassLoader}}. However, I'm not 
sure if that's any more reliable than the number of frames we're skipping...
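
As a rough illustration of that check, the sketch below verifies that every frame being skipped belongs to {{java.lang.Class}} or {{java.lang.ClassLoader}} before trusting the frame behind them. This is not the POC code; the class name {{SkippedFrameCheck}} and the {{start}}/{{skip}} parameters are hypothetical.

```java
// Hypothetical illustration only: validates the frames we are about to skip.
final class SkippedFrameCheck {
    private SkippedFrameCheck() {}

    /**
     * @param stack a stack trace, innermost frame first
     * @param start index of the first JDK frame we expect to skip
     * @param skip  number of JDK frames we expect to skip
     * @return the class name of the frame right after the skipped ones, or
     *         null if any skipped frame is not java.lang.Class or
     *         java.lang.ClassLoader (that is, the shape is not what we expect)
     */
    static String callerIfShapeMatches(StackTraceElement[] stack, int start, int skip) {
        if (start < 0 || skip < 0) {
            return null;
        }
        int callerIndex = start + skip;
        if (callerIndex >= stack.length) {
            return null;
        }
        for (int i = start; i < callerIndex; i++) {
            String className = stack[i].getClassName();
            if (!className.equals("java.lang.Class")
                    && !className.equals("java.lang.ClassLoader")) {
                return null;
            }
        }
        return stack[callerIndex].getClassName();
    }
}
```

A stack that passes through a {{ClassLoader}} subclass would show up under the subclass's own name and fail this check, which may be one reason it is not obviously more reliable than counting frames.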




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-07-08 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368600#comment-15368600
 ] 

Sean Busbey commented on HADOOP-13070:
--

I played with the POC for a while today. +1 on moving forward. I'm curious about 
what, if anything, we can do for runtime sanity checks on the call stack shape.
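
One possible shape for such a check, as a sketch only: run a probe with a known caller at startup and confirm that the fixed frame depth lands on that caller. The class {{StackShapeCheck}} and the depth constant are hypothetical and are not taken from the POC.

```java
// Hypothetical illustration only: a startup self-test for a fixed frame depth.
final class StackShapeCheck {
    // In new Throwable().getStackTrace(), index 0 is expected to be probe()
    // itself and index 1 whoever called probe().
    private static final int CALLER_DEPTH = 1;

    private StackShapeCheck() {}

    // Returns the class name found at the assumed caller depth, or null
    // if the stack trace is shorter than expected.
    static String probe() {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        return stack.length > CALLER_DEPTH ? stack[CALLER_DEPTH].getClassName() : null;
    }

    // The caller of probe() here is known to be this class, so any other
    // answer means the assumed depth does not hold on this JVM.
    static boolean shapeLooksSane() {
        return StackShapeCheck.class.getName().equals(probe());
    }
}
```

If {{shapeLooksSane()}} returned false, the loader could log a warning and fall back to a "no isolation" mode rather than guess at the caller.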




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-06-13 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328733#comment-15328733
 ] 

Sangjin Lee commented on HADOOP-13070:
--

No worries. Thanks!




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-06-13 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328224#comment-15328224
 ] 

Ravi Prakash commented on HADOOP-13070:
---

I just wanted to clarify that I am still excited by this idea and looking 
forward to the implementation. We can always build in feature flags that enable 
/ disable this and try it to see what works. Thanks for putting in the effort 
Sangjin!




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-06-09 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323708#comment-15323708
 ] 

Sangjin Lee commented on HADOOP-13070:
--

Thanks for the comments [~busbey] and [~ste...@apache.org].

{quote}
Ugh. this gives me all kinds of bad feels, though I think I might agree. I know 
Steve Loughran has strong feelings on the maintenance burden of this kind of 
custom classloader work, so let's ping him early.
{quote}
I completely agree that this is a bit of a bitter pill to swallow. But it is also 
rather clear to me that something like that is needed to pull off the stricter 
isolation (not letting user code see parent classes). I worked out a quick 
working prototype of the idea recently, and I'll share it soon.

{quote}
If we go down this path, how concerned are we going to be with maintaining 
cross-JVM compatibility (vs falling back to some kind of "no isolation" 
approach)?
{quote}
That is definitely a concern. This is basically latching onto what 
{{ClassLoader.getCallerClassLoader()}} does. It relies on the shape of the call 
stack to determine what the (non-JDK) "calling" class is. Because the JDK itself 
depends on that shape, JDK implementations are obliged to keep it uniform across 
different classloading patterns. Still, if the JDK ever changes that frame 
depth, we would likely have to update ours to match it.
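
For comparison, Java 9 and later provide {{java.lang.StackWalker}}, which can find a calling class without a hard-coded frame depth. The sketch below is an assumption about how that could look, not the approach discussed in this thread; the class {{CallerLookup}} and its {{skip}} parameter are hypothetical.

```java
import java.util.Optional;

// Hypothetical illustration only (requires Java 9+): finds the first frame
// that is not part of the classloading machinery instead of counting frames.
final class CallerLookup {
    private static final StackWalker WALKER =
        StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

    private CallerLookup() {}

    /**
     * Returns the first class on the current stack that is neither this
     * helper, the given class to skip (for example the custom loader),
     * java.lang.Class, nor java.lang.ClassLoader.
     */
    static Optional<Class<?>> callingClass(Class<?> skip) {
        return WALKER.walk(frames -> frames
            .map(StackWalker.StackFrame::getDeclaringClass)
            .filter(c -> c != CallerLookup.class
                      && c != skip
                      && c != Class.class
                      && c != ClassLoader.class)
            .findFirst());
    }
}
```

The calling class's loader is then available via {{getClassLoader()}} on the result, which is the piece of information the isolation rule needs.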

{quote}
custom classloaders tend to cause problems downstream, both maintenance and 
use. That doesn't mean they don't solve some problems: it's just they are dogs 
to work with.
{quote}
Having dabbled a fair amount in classloaders, I am fully aware of the pain that 
can happen for implementors and users of custom classloaders. Again, if JDK's 
classloading wasn't so leaky to begin with (via things like the TCCL), it 
could have been simpler. The problem with custom classloaders is that when one 
breaks, it breaks in a pretty painful way and it is hard to work around.

That said, the current application classloader in hadoop works quite well in 
large part with relatively few issues (for disclosure our company pretty much 
enables it by default), and whenever an issue arose I was able to fix it in a 
fairly straightforward manner. So I would consider the current "working" level 
reasonably high.




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-06-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323198#comment-15323198
 ] 

Steve Loughran commented on HADOOP-13070:
-

correct. I'm not vetoing it. I'm just warning that it's really hard to get right




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-06-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323058#comment-15323058
 ] 

Sean Busbey commented on HADOOP-13070:
--

that sounds like a -0 rather than a -1?




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-06-09 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323042#comment-15323042
 ] 

Steve Loughran commented on HADOOP-13070:
-

custom classloaders tend to cause problems downstream, both maintenance and 
use. That doesn't mean they don't solve some problems: it's just they are dogs 
to work with. Little dogs that seem like cute little puppies, which eventually 
become big ugly beasts that you have to take for 3 h walks every day, that eat 
everything in the kitchen, use your bed as a toilet and, due to their habit of 
biting small children, are something you are scared of yourself. 




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-06-09 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15322828#comment-15322828
 ] 

Sean Busbey commented on HADOOP-13070:
--

+1 on removing Configuration's dedicated classloader. That simplification helps 
limit our pain to the ones java folks expect to have in TCCL.
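
For readers less familiar with the TCCL (thread context classloader), the usual idiom is to swap it around a block of code and restore it afterwards. The helper below is a generic sketch of that idiom, not Hadoop code; the class and method names are made up.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration only: run a task with a given thread context
// classloader and always restore the previous one.
final class ContextClassLoaders {
    private ContextClassLoaders() {}

    static <T> T callWith(ClassLoader loader, Callable<T> body) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return body.call();
        } finally {
            // Restore even if the body throws, so the TCCL does not leak.
            current.setContextClassLoader(previous);
        }
    }
}
```

Libraries that consult the TCCL will then resolve classes and resources against {{loader}} for the duration of the call only.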

{quote}
We need to explore an option that can let you determine the calling class and 
only block a user calling class from loading a parent class (rule #4). We might 
be able to accomplish this by trying to determine the calling class and its 
classloader from the stack trace. This is something that the JDK’s ClassLoader 
does (via a non‐public JDK‐internal method), and we may be able to implement 
something similar.
{quote}

Ugh. this gives me all kinds of bad feels, though I think I might agree. I know 
[~ste...@apache.org] has strong feelings on the maintenance burden of this kind 
of custom classloader work, so let's ping him early.

If we go down this path, how concerned are we going to be with maintaining 
cross-JVM compatibility (vs falling back to some kind of "no isolation" 
approach)?

If we're at this point, is just shading every 3rd party dependency we use 
easier (barring the usual non-relocatable bits)? That would also prevent 
downstream folks from relying on them without a very clear at-your-own-risk 
step.




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-05-27 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305081#comment-15305081
 ] 

Sangjin Lee commented on HADOOP-13070:
--

bq. I sure do wish Java 9 had something that would make it easier but I didn't 
see anything.

There is jigsaw (in java 9), but then there is always jigsaw. :)




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-05-27 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305075#comment-15305075
 ] 

Sangjin Lee commented on HADOOP-13070:
--

Thanks for the comments [~raviprak]! To answer your questions...

{quote}
ApplicationClassLoader seems like it's only being used by MR. I grepped in Tez 
and Spark source, and didn't find any instances. Even if we were to do this 
only for MR, it would be incredibly valuable. I feel it would also set a 
precedent / pattern that other frameworks can then leverage.
{quote}
If you meant that the only usage is hadoop itself, I believe that's correct. 
Within hadoop, there are 3 usages today: MR task class isolation, hadoop run 
jar class isolation, and more recently the NM aux service class isolation. 
Since {{ApplicationClassLoader}} is part of the public API, other frameworks 
can use it.
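
For readers who haven't looked at it, the general idea behind this kind of class isolation can be sketched as a child-first classloader with a list of "system" prefixes that are always delegated to the parent. The code below is a simplified stand-in and not Hadoop's {{ApplicationClassLoader}}; the class name {{ChildFirstLoader}} and the prefix list are hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

// Hypothetical illustration only: loads from its own URLs first, except for
// class names matching a "system" prefix, which always go to the parent.
final class ChildFirstLoader extends URLClassLoader {
    private final List<String> systemPrefixes;

    ChildFirstLoader(URL[] urls, ClassLoader parent, List<String> systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = systemPrefixes;
    }

    private boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    // Child first: look in this loader's own URLs.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in our URLs; fall through to the parent.
                }
            }
            if (c == null) {
                // Standard parent-first delegation; throws if nobody has it.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With a prefix list such as {{java.}}, user jars on the loader's URLs would win for everything else, which is how a user's own version of a library can coexist with the one on the parent classpath.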

bq. If we were to focus on MR, do you know what are the common problematic 
conflicting dependencies?
Unfortunately there are many to choose from, and quite a few of the well-known 
ones fall into the problem category. Some of the more famous ones include guava 
and jackson to name a couple.

But isolating class spaces has more benefits than simply preventing collisions. 
Since we're afraid of breaking users, hadoop has been very slow/conservative in 
upgrading any libraries it uses. As a result, we're stuck in the stone age for 
many of the libraries we use. Isolation would give hadoop more freedom to 
upgrade its dependencies without worrying about impacting users. That is of 
course provided that the isolation mode becomes the default, which may still be 
some time away.

{quote}
One alternative approach would be to start 2 JVMs for each MR-task: an 
MR-framework JVM and an MR-task JVM. We would do all MR-framework specific work 
in the MR-framework JVM and send raw Map-Reduce input key-value pairs over a 
socket and read output key value pairs over a socket from the MR-task JVM. The 
MR specific code running in the MR-task JVM would then be minimal and only 
needs to read over the socket and call the user code.
{quote}
That is an interesting idea to solve this problem. I still worry about the 
performance implication it has. Also, it still would not eliminate the problem 
entirely. As you pointed out, even in that separate process you still need a 
minimal amount of hadoop code which then pulls in the needed dependencies.





[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-05-27 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305006#comment-15305006
 ] 

Sean Busbey commented on HADOOP-13070:
--

thanks for all the work so far [~sjlee0]! I'm planning to catch up on this work 
over the weekend.




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-05-27 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304825#comment-15304825
 ] 

Ravi Prakash commented on HADOOP-13070:
---

Hi Sangjin! Thanks for taking this up. I look forward to all your improvements.

{{ApplicationClassLoader}} seems like it's only being used by MR. I grepped in 
Tez and Spark source, and didn't find any instances. Even if we were to do this 
only for MR, it would be incredibly valuable. I feel it would also set a 
precedent / pattern that other frameworks can then leverage.

If we were to focus on MR, do you know what are the common problematic 
conflicting dependencies? One alternative approach would be to start 2 JVMs for 
each MR-task: an MR-framework JVM and an MR-task JVM. We would do all 
MR-framework specific work in the MR-framework JVM and send raw Map-Reduce 
input key-value pairs over a socket and read output key value pairs over a 
socket from the MR-task JVM. The MR specific code running in the MR-task JVM 
would then be minimal and only needs to read over the socket and call the user 
code. I know protobuf (required for serialization / deserialization) is often 
the conflicting library, so it would be no help in that case. (We could still 
shade this minimal set of libraries, although I personally dislike shading 
a lot.) 
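
To give a feel for how small the task-side code could be, here is a sketch of a length-prefixed framing for raw key-value pairs using only {{java.io}}, so it would not need protobuf or any other third-party library. The wire format and the {{KeyValueFraming}} class are invented for illustration; nothing like this exists in MR today as far as this thread says.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration only: each pair is written as
// [int key length][key bytes][int value length][value bytes].
// No size limits or validation are applied in this sketch.
final class KeyValueFraming {
    private KeyValueFraming() {}

    static void writePair(DataOutputStream out, byte[] key, byte[] value)
            throws IOException {
        out.writeInt(key.length);
        out.write(key);
        out.writeInt(value.length);
        out.write(value);
    }

    // Returns a two-element array: index 0 is the key, index 1 is the value.
    static byte[][] readPair(DataInputStream in) throws IOException {
        byte[] key = new byte[in.readInt()];
        in.readFully(key);
        byte[] value = new byte[in.readInt()];
        in.readFully(value);
        return new byte[][] { key, value };
    }
}
```

The streams would come from {{Socket.getOutputStream()}} and {{Socket.getInputStream()}} on either side; the per-record copy and syscall overhead is where the performance concern would show up.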

I sure do wish Java 9 had something that would make it easier but I didn't see 
anything.

> classloading isolation improvements for cleaner and stricter dependencies
> -
>
> Key: HADOOP-13070
> URL: https://issues.apache.org/jira/browse/HADOOP-13070
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: util
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Critical
> Attachments: classloading-improvements-ideas-v.3.pdf, 
> classloading-improvements-ideas.pdf, classloading-improvements-ideas.v.2.pdf
>
>
> Related to HADOOP-11656, we would like to make a number of improvements in 
> terms of classloading isolation so that user-code can run safely without 
> worrying about dependency collisions with the Hadoop dependencies.
> By the same token, it should raise the quality of the user code and its 
> specified classpath so that users get clear signals if they specify incorrect 
> classpaths.
> This will contain a proposal that will include several improvements some of 
> which may not be backward compatible. As such, it should be targeted to the 
> next major revision of Hadoop.






[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-05-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303494#comment-15303494
 ] 

Sangjin Lee commented on HADOOP-13070:
--

I would greatly appreciate feedback on the proposal or thoughts and suggestions 
in general. Thanks!




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-05-10 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279034#comment-15279034
 ] 

Sangjin Lee commented on HADOOP-13070:
--

Got you. Thanks. I am comfortable with 3.0 being defined as primarily the Java 
8 release. Once we have something ready (along with HADOOP-11656), it would be 
good to get this in afterwards.




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-05-10 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278858#comment-15278858
 ] 

Steve Loughran commented on HADOOP-13070:
-


When I said "flipping the maven switch" I meant "switching the build to be 
Java 8+ only".




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-04-29 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264496#comment-15264496
 ] 

Sangjin Lee commented on HADOOP-13070:
--

Will do. Thanks!




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-04-29 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264464#comment-15264464
 ] 

Sean Busbey commented on HADOOP-13070:
--

[~andrew.wang] and I were discussing just earlier this week getting an 
approach like this in place as a first step, to see if it would suffice for 
HADOOP-11656. Super glad to see more folks interested. On the launcher side, 
I'm planning to pick HADOOP-11804 back up in the next few weeks.

Sangjin, do let me know if there's anything I can help with on this, whether 
it's reviews or early work testing out approaches.




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-04-29 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264412#comment-15264412
 ] 

Sangjin Lee commented on HADOOP-13070:
--

Hi [~ste...@apache.org], I'm not quite sure what you meant by flipping the 
maven switch. Could you kindly elaborate?

This is a companion to HADOOP-11656, but this addresses more of a 
container-like situation (e.g. isolating user code in MR tasks, etc.). There 
are some long-standing improvements we can make that will make everyone's life 
easier but which may break some backward compatibility. I'm thinking of 
changing some APIs and replacing configs, etc.

I'm not sure if this should target Hadoop 3 (I'm not entirely clear where we 
stand with Hadoop 3 at the moment). My hope is it would be in the first release 
where breaking backward compatibility is allowed.




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-04-29 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263774#comment-15263774
 ] 

Steve Loughran commented on HADOOP-13070:
-

Sangjin: is this targeting Hadoop 3? If so, we should just flip the maven 
switch, say Java 8+, and focus on that.




[jira] [Commented] (HADOOP-13070) classloading isolation improvements for cleaner and stricter dependencies

2016-04-28 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263269#comment-15263269
 ] 

Sangjin Lee commented on HADOOP-13070:
--

I had an offline meeting with Sean and Andrew a while back. Filing an issue 
belatedly to make some progress.

The basic premise is that we have the {{ApplicationClassLoader}} that gets us 
fairly close to where we want to be without introducing a whole lot of 
complication with little additional benefit. We should strengthen it and make 
it stricter to close the gap. And as part of the process, we can correct and 
revisit some of the pain points in terms of classpath isolation by making some 
backward-incompatible changes.

I'll post a proposal some time soon. Here are the key ideas that I have at this 
point (in no priority order):
- "fix/deprecate/remove" {{Configuration.setClassLoader()}}: it encourages an 
anti-pattern that allows unsafe sharing and overwriting of classloaders
- make {{ApplicationClassLoader}} stricter: completely separate user classpath 
from the system classpath so it doesn't fall back to parent if the user class 
is not found in the user classpath
- update {{ApplicationClassLoader}} to be current with the Java 8 
{{ClassLoader}} implementation (e.g. the classloading lock)
- improve the system class override mechanism in {{ApplicationClassLoader}}
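
As a rough illustration of the second and third bullets, the sketch below shows a loader that never falls back to its parent for non-system classes and that uses the per-class-name classloading lock. It is not the current {{ApplicationClassLoader}} code; the class name {{StrictClassLoaderSketch}}, the constructor arguments and the prefix-based system class check are assumptions made up for this example.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

// Hypothetical sketch, not Hadoop's ApplicationClassLoader.
class StrictClassLoaderSketch extends URLClassLoader {
  static {
    // Lets getClassLoadingLock() hand out per-class-name locks (Java 7+).
    ClassLoader.registerAsParallelCapable();
  }

  private final ClassLoader systemLoader;    // loads framework/JDK classes
  private final List<String> systemPrefixes; // e.g. "java." (assumed format)

  StrictClassLoaderSketch(URL[] userClasspath, ClassLoader systemLoader,
      List<String> systemPrefixes) {
    super(userClasspath, null); // no parent: nothing is inherited implicitly
    this.systemLoader = systemLoader;
    this.systemPrefixes = systemPrefixes;
  }

  private boolean isSystemClass(String name) {
    for (String prefix : systemPrefixes) {
      if (name.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve)
      throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      Class<?> c = findLoadedClass(name);
      if (c == null) {
        if (isSystemClass(name)) {
          // System classes always come from the system loader.
          c = systemLoader.loadClass(name);
        } else {
          // User classes come only from the user classpath. If the class is
          // missing, findClass throws ClassNotFoundException: no fallback.
          c = findClass(name);
        }
      }
      if (resolve) {
        resolveClass(c);
      }
      return c;
    }
  }
}
```

Note that the system prefix list has to include {{java.}} in this sketch, since even user classes need {{java.lang.Object}} resolved through this loader.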

There may be more...
