[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2016-12-12 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated MAPREDUCE-6128:

Status: Open  (was: Patch Available)

Canceling patch because it does not apply to trunk.

> Automatic addition of bundled jars to distributed cache 
> 
>
> Key: MAPREDUCE-6128
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6128
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: client
> Affects Versions: 2.5.1
> Reporter: Gera Shegalov
> Assignee: Gera Shegalov
> Attachments: MAPREDUCE-6128.v01.patch, MAPREDUCE-6128.v02.patch,
> MAPREDUCE-6128.v03.patch, MAPREDUCE-6128.v04.patch, MAPREDUCE-6128.v05.patch,
> MAPREDUCE-6128.v06.patch, MAPREDUCE-6128.v07.patch, MAPREDUCE-6128.v08.patch
>
>
> On the client side, the JDK adds the Class-Path elements from the job jar
> manifest to the classpath. In theory there could be many bundled jars in many
> directories, such that adding them manually to task classpaths via libjars or
> similar means is cumbersome. If this property is enabled, the same jars are
> added to the task classpaths automatically.
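
As a rough illustration of the description above, the following sketch reads the Class-Path attribute of a jar manifest with the standard `java.util.jar` API and keeps only the entries that name jar files. The class name `ManifestClassPathSketch` and its methods are hypothetical and are not taken from the attached patches; how the patch actually resolves and uploads these entries is not shown in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper for illustration only; not code from the MAPREDUCE-6128 patches.
class ManifestClassPathSketch {

  // Returns the Class-Path entries of the manifest that end in ".jar".
  // Directories and other plain files are skipped in this sketch.
  static List<String> bundledJars(Manifest manifest) {
    List<String> jars = new ArrayList<>();
    String classPath =
        manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
    if (classPath == null) {
      return jars; // the manifest declares no Class-Path
    }
    // Class-Path entries are separated by whitespace.
    for (String entry : classPath.trim().split("\\s+")) {
      if (entry.endsWith(".jar")) {
        jars.add(entry);
      }
    }
    return jars;
  }

  // Parses manifest text; in a real job the manifest would come from the job jar.
  static Manifest parse(String text) {
    try {
      return new Manifest(
          new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

For a manifest whose Class-Path is `lib/a.jar lib/b.jar conf/`, `bundledJars` would return the two jar entries and skip the directory.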



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2016-12-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6128:
--
Labels:   (was: BB2015-05-TBR)




[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2015-05-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated MAPREDUCE-6128:

Labels: BB2015-05-TBR  (was: )



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-11-20 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v08.patch

Thanks for the review, [~jlowe]!

Good point regarding DO_TESTMRJOBS_HACK. I am leaving it this way for now, but
the CLASSPATH setup in TestMRJobs is worth tackling.

Regarding jars vs. non-jars, my main point is that jars are our most common
case, and I definitely wanted to exclude directories to avoid accidentally
picking up a large directory. But sure, we can consider other plain files.

bq. If the manifest asks for two different jars with the same basename then I 
think it will silently skip the latter entry. Intentional? 

Good observation. It was not truly intentional. I think the last-entry-wins 
policy is more intuitive. Thus changing it and documenting it. Added a test.

bq. Theoretically distributed cache archives could also conflict with the 
manifest, so I'm thinking the manifest should be processed after archives and 
the conflict check should also check the archive list.

Good point, I agree. Added a test for that.

bq. We may want an info (debug?) log message when manifest entries are 
overridden by other distributed cache entries.

Done as well.
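
The conflict rules discussed in this comment can be sketched roughly as follows: within the manifest the last entry with a given basename wins, and a manifest entry is dropped (with a log message) when an existing distributed cache entry, including an archive, already uses that basename. The class `ManifestConflictSketch`, its method names, and the use of `java.util.logging` are assumptions made for illustration; this is not the code in v08.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical sketch of the conflict policy; not code from the v08 patch.
class ManifestConflictSketch {
  private static final Logger LOG =
      Logger.getLogger(ManifestConflictSketch.class.getName());

  // Returns the last path component, e.g. "lib/a.jar" -> "a.jar".
  static String baseName(String path) {
    return path.substring(path.lastIndexOf('/') + 1);
  }

  // manifestJars: Class-Path jar entries in manifest order.
  // cachedNames: basenames already claimed by other distributed cache
  //              entries (files, libjars, archives).
  // Returns basename -> manifest path for the entries that survive.
  static Map<String, String> resolve(List<String> manifestJars,
                                     Set<String> cachedNames) {
    Map<String, String> byName = new LinkedHashMap<>();
    for (String jar : manifestJars) {
      String name = baseName(jar);
      if (cachedNames.contains(name)) {
        // Other distributed cache entries override manifest entries.
        LOG.info("Manifest entry " + jar
            + " is overridden by a distributed cache entry named " + name);
        continue;
      }
      // put() replaces any earlier entry with the same basename:
      // last entry wins.
      byName.put(name, jar);
    }
    return byName;
  }
}
```

With manifest entries `lib1/a.jar`, `lib2/a.jar`, `lib1/b.jar` and a cache that already contains `b.jar`, only `a.jar` mapped to `lib2/a.jar` would remain.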





[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-11-18 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v07.patch

Hi [~jlowe]. Since you committed HADOOP-11309, I removed the workaround of 
enumerating nested classes in v07. 



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-11-15 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v06.patch

Thanks for your suggestions, [~jlowe]. 

I will file another JIRA for nested class handling. I got around the issue in 
this v06 by filtering the contents of the test jar.

The code is refactored to take care of both libjars and preconfigured cache 
jars at the same time. I added a test for these overrides as well.



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-11-11 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v05.patch

v04 accidentally dropped 
{code}
  myConf.setInt(MRJobConfig.MAP_MAX_ATTEMPTS, 2); // reduce the number of attempts
{code}

from {{runFailingMapperJob}}. Restoring in v05.



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-11-10 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v04.patch

v04 to get rid of deprecation warnings.



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-11-09 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v03.patch

v03 where only TestMRJobs skips adding test classes to child JVM.



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-11-08 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Status: Patch Available  (was: Open)



[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-11-08 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v02.patch

Jason, thank you for the review.

Added a test, and it helped uncover classloading defects that were hidden by 
the fact that the jobclient-tests jar is always added to the task classpath 
when running on a minicluster. Along the way I fixed the classpath setup so 
that even a job.jar that is not uploaded via copyJar, and hence can have a 
different name, is correctly added to the classpath.

For easier testing, I introduced another job classloader pattern, Klass$, that 
matches Klass and all nested classes of Klass.
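
A minimal sketch of the matching rule described above, assuming a pattern that ends in `$` is meant to match the outer class and any class nested inside it. The class `NestedClassPatternSketch` and its `matches` method are hypothetical; the actual job classloader code in the patch may be structured differently.

```java
// Hypothetical sketch of the Klass$ pattern semantics; not the patch code.
class NestedClassPatternSketch {

  // A pattern ending in '$' matches the named class and all of its nested
  // classes; any other pattern is treated here as an exact class name.
  static boolean matches(String pattern, String className) {
    if (pattern.endsWith("$")) {
      String outer = pattern.substring(0, pattern.length() - 1);
      // Nested classes are compiled to names of the form Outer$Inner.
      return className.equals(outer) || className.startsWith(pattern);
    }
    return className.equals(pattern);
  }
}
```

Under these assumptions, the pattern `org.example.Klass$` matches `org.example.Klass` and `org.example.Klass$Inner`, but not `org.example.KlassOther`.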





[jira] [Updated] (MAPREDUCE-6128) Automatic addition of bundled jars to distributed cache

2014-10-15 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated MAPREDUCE-6128:
-
Attachment: MAPREDUCE-6128.v01.patch

v01 to illustrate the idea.
