[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747092#action_12747092 ] Philip Zeyliger commented on MAPREDUCE-476:

Failing test is "org.apache.hadoop.mapred.TestRecoveryManager.testRestartCount". I think that's failing all over, not just here.

-- Philip

> extend DistributedCache to work locally (LocalJobRunner)
>
> Key: MAPREDUCE-476
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-476
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: sam rash
> Assignee: Philip Zeyliger
> Priority: Minor
> Attachments: HADOOP-2914-v1-full.patch,
> HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch,
> MAPREDUCE-476-20090814.1.txt, MAPREDUCE-476-20090818.txt,
> MAPREDUCE-476-v2-vs-v3.patch, MAPREDUCE-476-v2-vs-v3.try2.patch,
> MAPREDUCE-476-v2-vs-v4.txt, MAPREDUCE-476-v2.patch, MAPREDUCE-476-v3.patch,
> MAPREDUCE-476-v3.try2.patch, MAPREDUCE-476-v4-requires-MR711.patch,
> MAPREDUCE-476-v5-requires-MR711.patch, MAPREDUCE-476-v7.patch,
> MAPREDUCE-476-v8.patch, MAPREDUCE-476-v9.patch, MAPREDUCE-476.patch,
> v6-to-v7.patch
>
> The DistributedCache does not work locally when using the outlined recipe at
> http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html
>
> Ideally, LocalJobRunner would take care of populating the JobConf and copying
> remote files to the local file system (http, assume hdfs = default fs = local
> fs when doing local development.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747089#action_12747089 ] Hadoop QA commented on MAPREDUCE-476:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417497/MAPREDUCE-476-v9.patch
against trunk revision 807165.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 14 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/console

This message is automatically generated.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746824#action_12746824 ] Tom White commented on MAPREDUCE-476: - Sorry Philip, but I've just noticed that the testFileSystemOtherThanDefault() test from TestDistributedCache (introduced in HADOOP-5635) got missed during the move to TestTrackerDistributedCacheManager.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746761#action_12746761 ] Hadoop QA commented on MAPREDUCE-476:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417438/MAPREDUCE-476-v8.patch
against trunk revision 807064.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 14 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/console

This message is automatically generated.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746416#action_12746416 ] Hadoop QA commented on MAPREDUCE-476:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417333/MAPREDUCE-476-v7.patch
against trunk revision 806764.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 14 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 3 new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/console

This message is automatically generated.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744516#action_12744516 ] Hadoop QA commented on MAPREDUCE-476:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12416836/MAPREDUCE-476-20090818.txt
against trunk revision 805324.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 14 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 2 new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
-1 contrib tests. The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/console

This message is automatically generated.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743257#action_12743257 ] Philip Zeyliger commented on MAPREDUCE-476: --- Vinod, Thanks for updating the patch! Do you have an update to MAPREDUCE-711 that has the package move? I am trying to apply MAPREDUCE-711-20090709-mapreduce.1.txt and MAPREDUCE-476-20090814.1.txt to trunk, and I think there's a mismatch in the new filecache package name. -- Philip
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739043#action_12739043 ] Philip Zeyliger commented on MAPREDUCE-476: --- Didn't see unused imports in JobClient.java. There are some deprecated imports, but not related to this patch. I've addressed your other two comments (other unused imports and misspelling of private variable). Will upload new patch after a bit more compiling and testing.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739033#action_12739033 ] Philip Zeyliger commented on MAPREDUCE-476:

bq. > I agree. A few of them are used to manage the Configuration object. (In my mind, we're serializing and de-serializing a set of requirements for the distributed cache into the text configuration, and doing so a bit haphazardly.) I was very tempted to remove all the ones that are only meant to be internal, but Tom advised me that I need to keep them deprecated for a version. Again, I think moving those methods into a more private place is a good task to do along with changing how JobClient calls into this stuff.

bq. +1. So are you planning to do in the next version or in this patch itself?

Next version.
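The remark above about serializing distributed-cache requirements into the text configuration can be illustrated with a small sketch. It is only an illustration under stated assumptions: java.util.Properties stands in for Hadoop's Configuration, and the class name CacheConfSketch, the key "sketch.cache.files", and both helper methods are hypothetical, not names from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class CacheConfSketch {
    // Hypothetical key; not the property name Hadoop actually uses.
    static final String FILES_KEY = "sketch.cache.files";

    // Serialize the list of cache URIs into one comma-separated text value.
    static void setCacheFiles(Properties conf, List<String> uris) {
        conf.setProperty(FILES_KEY, String.join(",", uris));
    }

    // De-serialize the text value back into a list of URIs.
    static List<String> getCacheFiles(Properties conf) {
        String raw = conf.getProperty(FILES_KEY, "");
        if (raw.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(raw.split(",")));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        setCacheFiles(conf, Arrays.asList("hdfs://nn/lib/a.jar", "hdfs://nn/data/b.zip"));
        System.out.println(getCacheFiles(conf));
    }
}
```

A plain comma-joined string is fragile (a URI containing a comma would not survive the round trip), which is one way such text encoding can end up feeling haphazard.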
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738907#action_12738907 ] Vinod K V commented on MAPREDUCE-476:

The changes look good overall, barring:
- a few unused imports in JobClient.java, TaskRunner.java and TaskDistributedCacheManager (which may or may not have been introduced by your patch)
- the misspelt taskDstributedCacheManager in LocalJobRunner
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738813#action_12738813 ] Vinod K V commented on MAPREDUCE-476: - Looking at your latest patch now.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738811#action_12738811 ] Vinod K V commented on MAPREDUCE-476:

Replies in-line.

bq. Is there anything blocking MAPREDUCE-711 that prevents it from being committed?

No, I've asked Owen to commit it.

bq. I had a very clever bug in there (caused by not thinking enough while resolving a merge conflict) that deleted the current working directory, recursively, in one of the tests. (TaskRunner is hard-coded to delete current working directory, which is ok, since it's typically a child process; not ok for LocalJobRunner.)

Nice catch! I too burnt myself once trying to refactor the code in Child.java out into a thread, and ended up deleting my whole workspace because of the same issue :(

bq. The tricky bit is always fixing only the lines I've changed, and not all the lines in a given file, to preserve history and keep reviewing sane.

That is correct.

bq. This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the corresponding method in DistributedCacheManager class.
bq. Don't think I agree here. We can deprecate the getLocalCache methods in DistributedCache right away. They delegate to each other, and one of them delegates to TrackerDistributedCacheManager. Ideally, I'd remove these altogether --- Hadoop internally does not use these methods with this patch, and there's no sensible reason why someone else would, but since it's public, it's getting deprecated. But it's not being deprecated with a pointer to use something else; it's getting deprecated so that you don't use it at all.

Okay, I thought those methods were used internally at least. But if, as you've said, they are not used at all internally, +1 for deprecating them altogether.

bq. Using .equals method at +150 if (cacheFile.type == CacheFile.FileType.ARCHIVE). If you feel strongly about this, happy to change it, but I think == is more consistent.

I am fine, no problem with this.

bq. TaskTracker.initialize() A new DistributedCacheManager is created every time, so old files will not be deleted by the subsequent purgeCache. Either we should have only one cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should be static. The same problem exists with the deprecated purgeCache() method in DistributedCache.java
bq. If my understanding is correct, in the course of normal operations, TaskTracker.initialize() is only called once, by TaskTracker.run(). At startup time (and whenever the TaskTracker decides to lose all of its state and reset itself, which is essentially the same), the TaskTracker initializes the TrackerDistributedCacheManager object. This is also the only time it's allowed to "clear" the cache, since there are for certain no tasks running at initialization time that depend on those.

Ah, I was under the impression that distributed cache files are purged EXPLICITLY when a TT reinitializes, but it looks like we do a full delete on TaskTracker's local files during re-init. So, the code in your patch will work for now, but it won't in the future, when we might need an explicit purge of distributed cache files once they are owned by users themselves (HADOOP-4493). We will change it then; +1 for the current changes.

bq. JobClient.java What happens to the code in here? Should this be refactored too, or are we going to do something about it?
bq. Good question. I do think that JobClient should be using some methods in the filecache package, instead of hard-coding a lot of logic. That said, I chose to stop somewhere to avoid making this patch even harder to review. I think it's ripe for a future JIRA.

Okay.

bq. I think most of the getter/setter methods are deprecable and moveable to DistributedCacheManager. Or at least we should give some thought about it. For most of them, I do see from the javadoc that they are only for internal use anyways.
bq. I agree. A few of them are used to manage the Configuration object. (In my mind, we're serializing and de-serializing a set of requirements for the distributed cache into the text configuration, and doing so a bit haphazardly.) I was very tempted to remove all the ones that are only meant to be internal, but Tom advised me that I need to keep them deprecated for a version. Again, I think moving those methods into a more private place is a good task to do along with changing how JobClient calls into this stuff.

+1. So are you planning to do in the next version or in this patch itself?

bq. I don't use Pipes typically, and it doesn't seem to compile on my Mac. I'll try it on a Linux machine, but if it's easy/handy for you, it'd be great to verify that bug.

Will do so.

bq. TestDistributedCacheManager * Code related to
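The TaskTracker.initialize() concern discussed above (a freshly created manager cannot purge what an earlier instance localized) can be sketched as follows. This is a simplified illustration, not Hadoop code: InstanceScopedManager, its cachedArchives map, and the localize/purgeCache methods are hypothetical stand-ins that only track entries in memory.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheManagerLifecycleSketch {
    /** Hypothetical manager whose bookkeeping lives in an instance field. */
    static class InstanceScopedManager {
        private final Map<String, String> cachedArchives = new HashMap<>();

        // Record that a remote URI has been localized (no real I/O in this sketch).
        void localize(String uri) {
            cachedArchives.put(uri, "/tmp/cache/" + cachedArchives.size());
        }

        // Forget every tracked entry and report how many were purged.
        int purgeCache() {
            int purged = cachedArchives.size();
            cachedArchives.clear();
            return purged;
        }
    }

    // A new manager after a simulated re-initialization knows nothing
    // about the entries tracked by the previous instance.
    static int purgeAfterReinit() {
        InstanceScopedManager first = new InstanceScopedManager();
        first.localize("hdfs://nn/lib/a.jar");
        first.localize("hdfs://nn/data/b.zip");
        InstanceScopedManager second = new InstanceScopedManager();
        return second.purgeCache();
    }

    // Reusing the same manager purges everything it tracked.
    static int purgeSameManager() {
        InstanceScopedManager only = new InstanceScopedManager();
        only.localize("hdfs://nn/lib/a.jar");
        only.localize("hdfs://nn/data/b.zip");
        return only.purgeCache();
    }

    public static void main(String[] args) {
        System.out.println("purged after re-init: " + purgeAfterReinit());
        System.out.println("purged by same manager: " + purgeSameManager());
    }
}
```

In the thread this gap is tolerated because re-initialization deletes the TaskTracker's local files wholesale; the sketch only shows why an explicit purge would need a single long-lived manager or shared (for example static) bookkeeping.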
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737621#action_12737621 ] Philip Zeyliger commented on MAPREDUCE-476: ---

Hi Vinod,

Thanks for the ping; I got distracted by other things. And thanks again for the detailed review. My responses are below. I've generated a patch that shows the differences between v2 and v4, and also the patch, in a state where it still depends on MAPREDUCE-711. Is there anything blocking MAPREDUCE-711 that prevents it from being committed?

Also, sorry about the multiple uploads here. I had a very clever bug in there (caused by not thinking enough while resolving a merge conflict) that deleted the current working directory, recursively, in one of the tests. (TaskRunner is hard-coded to delete the current working directory, which is ok, since it's typically a child process; not ok for LocalJobRunner.)

I've run the "relevant" tests; the full tests take a while, so I'm running those in the background.

{quote}
$for i in TestMRWithDistributedCache TestMiniMRLocalFS TestMiniMRDFSCaching TestTrackerDistributedCacheManager; do ant test -Dtestcase=$i > test-out-$i && echo "$i good" || echo "$i bad"; done
TestMRWithDistributedCache good
TestMiniMRLocalFS good
TestMiniMRDFSCaching good
TestTrackerDistributedCacheManager good
{quote}

bq. There is quite a bit of refactoring in this patch, though I find it really useful.

Yep. Having DistributedCache work locally is easy if you refactor the code a bit, so that's how I went at it.

bq. Please make sure that in the newly added code, lines aren't longer than 80 characters. For e.g, see DistributedCacheManager.newTaskHandle() method.

A few rounds of 'git diff foo..bar | egrep "^\+\+\+|^\+ .{80}"' have done the trick, I think. The tricky bit is always fixing only the lines I've changed, and not all the lines in a given file, to preserve history and keep reviewing sane.

bq. Just a thought, can the classes be better renamed to reflect their usage, something like TrackerDistributedCacheManager and TaskDistributeCacheManager?

I like those names better; thanks. Changed.

bq. DistributedCacheManager and DistributedCacheHandle: Explicitly state in javadoc that it is not a public interface

Done.

bq. This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the corresponding method in DistributedCacheManager class.

I don't think I agree here. We can deprecate the getLocalCache methods in DistributedCache right away. They delegate to each other, and one of them delegates to TrackerDistributedCacheManager. Ideally, I'd remove these altogether: Hadoop internally does not use these methods with this patch, and there's no sensible reason why someone else would, but since they're public, they're getting deprecated. They're not being deprecated with a pointer to use something else, though; they're being deprecated so that you don't use them at all.

bq. DistributedCacheHandle CacheFile.makeCacheFiles()
bq. isClassPath can be renamed to shouldBePutInClasspath

Renamed to shouldBeAddedToClasspath.

bq. paths can be renamed to pathsToPutInClasspath.

Renamed to pathsToBeAddedToClasspath.

bq. Use .equals method at +150 if (cacheFile.type == CacheFile.FileType.ARCHIVE)

I believe that technically it doesn't matter. The JDK implementation of equals() on java.lang.Enum is final, and hardcoded to "this==other". This is the only thing that makes sense, since there's only ever one instance of a given Enum constant. I took an inaccurate look at the code base, and == is the more common option.

{quote}
# Inaccurate! Not a static analysis! Not even close! ;)
[1]doorstop:hadoop-mapreduce(140142)$ack "\([a-zA-Z]*\.[a-zA-Z]*\.equals" src | wc -l
11
[0]doorstop:hadoop-mapreduce(140143)$ack "\([a-zA-Z]*\.[a-zA-Z]* ==" src | wc -l
127
{quote}

If you feel strongly about this, I'm happy to change it, but I think == is more consistent.

bq. makeCacheFiles: boolean isArchive -> FileType fileType

Done.

bq. I think it would be cleaner to return target instead of passing it as an argument.

Done.

bq. makeCacheFiles() method should be documented

Done.

bq. setup() This method is really useful, avoids a lot of code duplication!

Ok.

bq. Leave localTaskFile writing business back in TaskRunner itself. I think it is the task's responsibility, not the DistributedCacheHandle's

Good call; done.

bq. cacheSubdir can better be an argument to setup() method instead of passing it to the constructor.

Good idea; done.

bq. getClassPaths() : Document that it has to be called and useful only when is already invoked.

Done. I've made it throw an exception if it's called erroneously, since I could see that causing trouble for developers.

bq. TaskTracker.initialize() A new DistributedCacheManager
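For the == versus .equals() question on enum constants discussed in this comment, a minimal sketch may help. It assumes a stand-in enum named FileType (hypothetical here; it only mirrors the CacheFile.FileType mentioned in the review) and shows that both comparisons agree for every constant. The one practical difference is null handling on the left-hand side, which the sketch sidesteps by calling equals() on the constant.

```java
public class EnumEqualitySketch {
    // Hypothetical stand-in for CacheFile.FileType from the patch under review.
    enum FileType { REGULAR, ARCHIVE }

    // Identity comparison: safe for enums because each constant is a singleton.
    static boolean isArchiveByIdentity(FileType type) {
        return type == FileType.ARCHIVE;
    }

    // java.lang.Enum.equals() is final and compares by identity, so this is
    // equivalent; calling it on the constant avoids a NullPointerException
    // when the argument is null.
    static boolean isArchiveByEquals(FileType type) {
        return FileType.ARCHIVE.equals(type);
    }

    public static void main(String[] args) {
        for (FileType type : FileType.values()) {
            System.out.println(type + ": == gives " + isArchiveByIdentity(type)
                + ", equals gives " + isArchiveByEquals(type));
        }
        System.out.println("null: == gives " + isArchiveByIdentity(null)
            + ", equals gives " + isArchiveByEquals(null));
    }
}
```

Either form works; which one to use is a consistency choice, as the comment says.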
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737391#action_12737391 ] Philip Zeyliger commented on MAPREDUCE-476: --- Never mind, trying to rush before leaving the office, and the tests fail here. Back tomorrow.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737385#action_12737385 ] Philip Zeyliger commented on MAPREDUCE-476: --- Vinod, Yes. I've been hacking away at it today. Please ignore those last two updated diffs: while getting rid of some 80+ character lines, I fumbled some git stuff and produced bad patches. I'll be producing good ones after some more sanity checking either late today or tomorrow morning. -- Philip
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737042#action_12737042 ] Vinod K V commented on MAPREDUCE-476: - Philip, any update on this issue?
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732547#action_12732547 ] Philip Zeyliger commented on MAPREDUCE-476: --- Vinod, Thanks for your comments and thorough review. I'll take a closer look over the next couple of days and post a new patch.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732482#action_12732482 ] Vinod K V commented on MAPREDUCE-476: - Another point. We should consolidate TestMRWithDistributedCache, TestMiniMRLocalFS and TestMiniMRDFSCaching. That's a lot of code in three different files testing mostly common code. This could go in another JIRA issue if you prefer.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732474#action_12732474 ] Vinod K V commented on MAPREDUCE-476: -

Looked at the remaining parts of the patch. Some more comments.

DistributedCacheHandle.makeClassLoader
- Use File.toURI().toURL() instead of File.toURL() directly. File.toURL() is deprecated.

LocalJobRunner.java
- TaskRunner.setupWorkDir needs to be fixed to have a workDir arg to help the code in LocalJobRunner. Compilation (ant binary) breaks now.
- TODO: Test pipes with LocalJobRunner and make sure that it works, i.e. verify HADOOP-5174.

TestDistributedCacheManager
- Code related to setFileStamps in JobClient.java (+636 to +656) and testManagerFlow() (+71 to +74) can be refactored into an internal-only method in DistributedCacheManager.
- Minor: the assertNonNull (+90) check can be moved after +91 and use localCacheFiles for the null check.

TestMRWithDistributedCache
- Just documenting a TODO: we will need to fix this class's javadoc once MAPREDUCE-711 is in.
- Fix the comment at +84. It is actually two archives.
- Please put a comment for the class saying that it is NOT A FAST test, keeping in view the recent efforts to have a fast test target.
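The File.toURI().toURL() suggestion above can be sketched as follows. This is only an illustration, not the code from the patch: the class and method names (ClassLoaderUrlSketch, toUrls, makeClassLoader) are hypothetical, and the real makeClassLoader may take different arguments. The point is that going through toURI() escapes characters such as spaces, which the deprecated File.toURL() does not do.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

public class ClassLoaderUrlSketch {
    // Converts local files to URLs via toURI(), which escapes characters
    // that are illegal in URLs (for example, spaces become %20).
    static URL[] toUrls(List<File> files) {
        URL[] urls = new URL[files.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = files.get(i).toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad path: " + files.get(i), e);
            }
        }
        return urls;
    }

    // Hypothetical helper: builds a class loader over the given files,
    // delegating to the supplied parent first.
    static ClassLoader makeClassLoader(List<File> files, ClassLoader parent) {
        return new URLClassLoader(toUrls(files), parent);
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(new File("dir with space/lib.jar"));
        for (URL url : toUrls(files)) {
            System.out.println(url);
        }
        ClassLoader loader =
            makeClassLoader(files, ClassLoaderUrlSketch.class.getClassLoader());
        System.out.println(loader.getClass().getName());
    }
}
```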
> extend DistributedCache to work locally (LocalJobRunner) > > > Key: MAPREDUCE-476 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-476 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: sam rash >Assignee: Philip Zeyliger >Priority: Minor > Attachments: HADOOP-2914-v1-full.patch, > HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, > MAPREDUCE-476-v2.patch, MAPREDUCE-476.patch > > > The DistributedCache does not work locally when using the outlined recipe at > http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html > > Ideally, LocalJobRunner would take care of populating the JobConf and copying > remote files to the local file sytem (http, assume hdfs = default fs = local > fs when doing local development. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731902#action_12731902 ] Vinod K V commented on MAPREDUCE-476: - I am still to look at the changes in LocalJobRunner (the real part of the patch :) ) and the test-cases.
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731901#action_12731901 ] Vinod K V commented on MAPREDUCE-476: -

Some more comments:

TaskTracker.initialize()
- A new DistributedCacheManager is created every time, so old files will not be deleted by the subsequent purgeCache. Either we should have only one cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should be static.
- The same problem exists with the deprecated purgeCache() method in DistributedCache.java.

JobClient.java
- What happens to the code in here? Should this be refactored too, or are we going to do something about it?

DistributedCache.java
- getLocalCache() (+190) should be indirected to DistributedCacheManager.getLocalCache(). Otherwise there is a stack overflow error: the method calls itself as of now.
- I think most of the getter/setter methods are deprecable and moveable to DistributedCacheManager. Or at least we should give it some thought. For most of them, I do see from the javadoc that they are only for internal use anyways.
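The purgeCache concern raised in this comment can be illustrated with a small sketch. Everything here is hypothetical (the Manager class, its localize method, and the string paths are illustrative stand-ins, not the patch's DistributedCacheManager): it only shows why a freshly created manager whose bookkeeping map is an instance field has nothing to purge, while a single long-lived manager does.

```java
import java.util.HashMap;
import java.util.Map;

public class CachePurgeSketch {
    // Hypothetical manager that records localized entries in an instance field,
    // loosely modeled on the cachedArchives map mentioned in the review.
    static class Manager {
        private final Map<String, String> cachedArchives = new HashMap<String, String>();

        void localize(String uri, String localPath) {
            cachedArchives.put(uri, localPath);
        }

        // Forgets every tracked entry and reports how many were purged.
        // (A real implementation would also delete the local files.)
        int purgeCache() {
            int purged = cachedArchives.size();
            cachedArchives.clear();
            return purged;
        }
    }

    // Mimics re-initialization that creates a new manager: the new instance
    // never learns about entries tracked by the old one.
    static int purgeWithFreshManager() {
        Manager first = new Manager();
        first.localize("hdfs://nn/a.jar", "/local/a.jar");
        Manager second = new Manager();
        return second.purgeCache();
    }

    // Mimics keeping one manager for the lifetime of the tracker.
    static int purgeWithSharedManager() {
        Manager shared = new Manager();
        shared.localize("hdfs://nn/a.jar", "/local/a.jar");
        return shared.purgeCache();
    }

    public static void main(String[] args) {
        System.out.println("fresh manager purged: " + purgeWithFreshManager());
        System.out.println("shared manager purged: " + purgeWithSharedManager());
    }
}
```

Making the map static would have the same effect as sharing one instance; which option fits better depends on how the tracker is tested and restarted.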
[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731899#action_12731899 ] Vinod K V commented on MAPREDUCE-476: -

I've started looking at your latest patch in combination with an intermediate patch for MAPREDUCE-711. There is quite a bit of refactoring in this patch, though I find it really useful. Just to be sure, I will call this out explicitly to Devaraj/Owen offline. I have some comments in relation to the patch.

General:
- Please make sure that in the newly added code, lines aren't longer than 80 characters. For e.g, see DistributedCacheManager.newTaskHandle() method.
- Just a thought, can the classes be better renamed to reflect their usage, something like TrackerDistributedCacheManager and TaskDistributeCacheManager?

DistributedCacheManager
- Explicitly state in javadoc that it is not a public interface :)
- This class should also have the variable number argument getLocalCache() methods so that the corresponding methods in DistributedCache can be deprecated. Also, each method in DistributedCache should call the corresponding method in DistributedCacheManager class.

DistributedCacheHandle
- Explicitly state in javadoc that it is not a public interface
- CacheFile.makeCacheFiles()
-- isClassPath can be renamed to shouldBePutInClasspath
-- paths can be renamed to pathsToPutInClasspath.
-- Use .equals method at +150 {code}if (cacheFile.type == CacheFile.FileType.ARCHIVE){code}
-- {code}static void makeCacheFiles(URI[] uris, String[] timestamps, Path[] paths, boolean isArchive, List target){code} should instead be {code}static void makeCacheFiles(URI[] uris, String[] timestamps, Path[] paths, FileType fileType, List target){code}
-- I think it would be cleaner to return target instead of passing it as an argument.
-- makeCacheFiles() method should be documented
- setup()
-- This method is really useful, avoids a lot of code duplication!
-- Leave localTaskFile writing business back in TaskRunner itself. I think it is the task's responsibility, not the DistributedCacheHandle's
-- cacheSubdir can better be an argument to setup() method instead of passing it to the constructor.
- getClassPaths() : Document that it has to be called and useful only when is already invoked.
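The two makeCacheFiles() suggestions above (take a FileType instead of a boolean, and return the list rather than filling a caller-supplied one) might look roughly like this. It is a simplified sketch under stated assumptions: the CacheFile fields are guesses, the Path[] classpath argument from the quoted signature is left out to avoid a Hadoop dependency, and the class name MakeCacheFilesSketch is hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class MakeCacheFilesSketch {
    // Hypothetical stand-ins for the types named in the review.
    enum FileType { REGULAR, ARCHIVE }

    static final class CacheFile {
        final URI uri;
        final long timestamp;
        final FileType type;

        CacheFile(URI uri, long timestamp, FileType type) {
            this.uri = uri;
            this.timestamp = timestamp;
            this.type = type;
        }
    }

    // Builds and returns the list instead of mutating a "target" argument.
    // A null uris array is treated as "nothing configured".
    static List<CacheFile> makeCacheFiles(URI[] uris, String[] timestamps,
                                          FileType fileType) {
        List<CacheFile> result = new ArrayList<CacheFile>();
        if (uris == null) {
            return result;
        }
        if (timestamps == null || timestamps.length != uris.length) {
            throw new IllegalArgumentException("Need one timestamp per URI");
        }
        for (int i = 0; i < uris.length; i++) {
            result.add(new CacheFile(uris[i], Long.parseLong(timestamps[i]), fileType));
        }
        return result;
    }

    public static void main(String[] args) {
        // The caller combines the two kinds explicitly.
        List<CacheFile> all = new ArrayList<CacheFile>();
        all.addAll(makeCacheFiles(new URI[] { URI.create("hdfs://nn/a.txt") },
            new String[] { "1" }, FileType.REGULAR));
        all.addAll(makeCacheFiles(new URI[] { URI.create("hdfs://nn/b.zip") },
            new String[] { "2" }, FileType.ARCHIVE));
        for (CacheFile file : all) {
            System.out.println(file.type + " " + file.uri + " @" + file.timestamp);
        }
    }
}
```

Returning the list keeps the method free of side effects, at the cost of one extra addAll per file kind at the call site.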