[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-24 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747092#action_12747092
 ] 

Philip Zeyliger commented on MAPREDUCE-476:
---

The failing test is 
"org.apache.hadoop.mapred.TestRecoveryManager.testRestartCount".  I think 
that's failing all over, not just here.

-- Philip

> extend DistributedCache to work locally (LocalJobRunner)
> 
>
> Key: MAPREDUCE-476
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-476
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: sam rash
> Assignee: Philip Zeyliger
> Priority: Minor
> Attachments: HADOOP-2914-v1-full.patch, 
> HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, 
> MAPREDUCE-476-20090814.1.txt, MAPREDUCE-476-20090818.txt, 
> MAPREDUCE-476-v2-vs-v3.patch, MAPREDUCE-476-v2-vs-v3.try2.patch, 
> MAPREDUCE-476-v2-vs-v4.txt, MAPREDUCE-476-v2.patch, MAPREDUCE-476-v3.patch, 
> MAPREDUCE-476-v3.try2.patch, MAPREDUCE-476-v4-requires-MR711.patch, 
> MAPREDUCE-476-v5-requires-MR711.patch, MAPREDUCE-476-v7.patch, 
> MAPREDUCE-476-v8.patch, MAPREDUCE-476-v9.patch, MAPREDUCE-476.patch, 
> v6-to-v7.patch
>
>
> The DistributedCache does not work locally when using the outlined recipe at 
> http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html
>  
> Ideally, LocalJobRunner would take care of populating the JobConf and copying 
> remote files to the local file system (http, assume hdfs = default fs = local 
> fs when doing local development).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747089#action_12747089
 ] 

Hadoop QA commented on MAPREDUCE-476:
-

-1 overall.  Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417497/MAPREDUCE-476-v9.patch
against trunk revision 807165.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 14 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/509/console

This message is automatically generated.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-24 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746824#action_12746824
 ] 

Tom White commented on MAPREDUCE-476:
-

Sorry Philip, but I've just noticed that the testFileSystemOtherThanDefault() 
test from TestDistributedCache (introduced in HADOOP-5635) got missed during 
the move to TestTrackerDistributedCacheManager.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746761#action_12746761
 ] 

Hadoop QA commented on MAPREDUCE-476:
-

-1 overall.  Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417438/MAPREDUCE-476-v8.patch
against trunk revision 807064.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 14 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/507/console




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746416#action_12746416
 ] 

Hadoop QA commented on MAPREDUCE-476:
-

-1 overall.  Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417333/MAPREDUCE-476-v7.patch
against trunk revision 806764.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 14 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 3 new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/504/console




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744516#action_12744516
 ] 

Hadoop QA commented on MAPREDUCE-476:
-

-1 overall.  Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12416836/MAPREDUCE-476-20090818.txt
against trunk revision 805324.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 14 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/489/console




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-14 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743257#action_12743257
 ] 

Philip Zeyliger commented on MAPREDUCE-476:
---

Vinod,

Thanks for updating the patch!  Do you have an update to MAPREDUCE-711 that has 
the package move?  I am trying to apply MAPREDUCE-711-20090709-mapreduce.1.txt 
and MAPREDUCE-476-20090814.1.txt to trunk, and I think there's a mismatch in 
the new filecache package name.

-- Philip




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-04 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739043#action_12739043
 ] 

Philip Zeyliger commented on MAPREDUCE-476:
---

Didn't see unused imports in JobClient.java.  There are some deprecated 
imports, but not related to this patch.  I've addressed your other two comments 
(other unused imports and misspelling of private variable).

Will upload new patch after a bit more compiling and testing.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-04 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739033#action_12739033
 ] 

Philip Zeyliger commented on MAPREDUCE-476:
---

bq.   > I agree. A few of them are used to manage the Configuration object. (In 
my mind, we're serializing and de-serializing a set of requirements for the 
distributed cache into the text configuration, and doing so a bit haphazardly.) 
I was very tempted to remove all the ones that are only meant to be internal, 
but Tom advised me that I need to keep them deprecated for a version. Again, I 
think moving those methods into a more private place is a good task to do along 
with changing how JobClient calls into this stuff.
bq. +1. So are you planning to do in the next version or in this patch itself?

Next version.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-04 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738907#action_12738907
 ] 

Vinod K V commented on MAPREDUCE-476:
-

The changes look good overall, barring
 - a few unused imports in JobClient.java, TaskRunner.java and 
TaskDistributedCacheManager (may or may not have been introduced by your patch)
 - misspelt taskDstributedCacheManager in LocalJobRunner




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-03 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738813#action_12738813
 ] 

Vinod K V commented on MAPREDUCE-476:
-

Looking at your latest patch now.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-08-03 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738811#action_12738811
 ] 

Vinod K V commented on MAPREDUCE-476:
-

Replies in-line.

bq. Is there anything blocking MAPREDUCE-711 that prevents it from being 
committed?
No, I've asked Owen to commit it.

bq. I had a very clever bug in there (caused by not thinking enough while 
resolving a merge conflict) that deleted the current working directory, 
recursively, in one of the tests. (TaskRunner is hard-coded to delete current 
working directory, which is ok, since it's typically a child process; not ok 
for LocalJobRunner.)
Nice catch! I too burnt myself once while trying to refactor the code in 
Child.java out into a thread, and ended up deleting my whole workspace because 
of the same issue :(

bq. The tricky bit is always fixing only the lines I've changed, and not all 
the lines in a given file, to preserve history and keep reviewing sane.
That is correct.

bq. This class should also have the variable number argument getLocalCache() 
methods so that the corresponding methods in DistributedCache can be 
deprecated. Also, each method in DistributedCache should call the corresponding 
method in DistributedCacheManager class.
bq. Don't think I agree here. We can deprecate the getLocalCache methods in 
DistributedCache right away. They delegate to each other, and one of them 
delegates to TrackerDistributedCacheManager. Ideally, I'd remove these 
altogether — Hadoop internally does not use these methods with this patch, and 
there's no sensible reason why someone else would, but since it's public, it's 
getting deprecated. But it's not being deprecated with a pointer to use 
something else; it's getting deprecated so that you don't use it at all.
Okay, I thought those methods were used internally at least. But if, as you've 
said, they are not used at all internally, +1 for deprecating them all together.

bq. Using  .equals method at +150 if (cacheFile.type == 
CacheFile.FileType.ARCHIVE). If you feel strongly about this, happy to change 
it, but I think == is more consistent.
I am fine, no problem with this.

bq. TaskTracker.initialize() A new DistributedCacheManager is created every 
time, so old files will not be deleted by the subsequent purgeCache. Either we 
should have only one cacheManager for a TaskTracker or 
DistributedCacheManager.cachedArchives should be static. The same problem 
exists with the deprecated purgeCache() method in DistributedCache.java
bq. If my understanding is correct, in the course of normal operations, 
TaskTracker.initialize() is only called once, by TaskTracker.run(). At startup 
time (and whenever the TaskTracker decides to lose all of its state and reset 
itself, which is essentially the same), the TaskTracker initializes the 
TrackerDistributedCacheManager object. This is also the only time it's allowed 
to "clear" the cache, since there are for certain no tasks running at 
initialization time that depend on those.
Ah, I was under the impression that distributed cache files are purged 
EXPLICITLY when a TT reinitializes, but it looks like we do a full delete of 
the TaskTracker's local files during re-init. So, the code in your patch will 
work for now, but it won't in the future, when we might need an explicit purge 
of distributed cache files when they are owned by users themselves 
(HADOOP-4493). We will change it then; +1 for the current changes.

bq. JobClient.java What happens to the code in here? Should this be refactored 
too/ are we going to do something about it?
bq. Good question. I do think that JobClient should be using some methods in the 
filecache package, instead of hard-coding a lot of logic. That said, I chose to 
stop somewhere to avoid making this patch even harder to review. I think it's 
ripe for a future JIRA.
Okay.

bq. I think most of the getter/setter methods are deprecable and moveable to 
DistributedCacheManager. Or at least we should give some thought about it. For 
most of them, I do see from the javadoc that they are only for internal use 
anyways.
bq. I agree. A few of them are used to manage the Configuration object. (In my 
mind, we're serializing and de-serializing a set of requirements for the 
distributed cache into the text configuration, and doing so a bit haphazardly.) 
I was very tempted to remove all the ones that are only meant to be internal, 
but Tom advised me that I need to keep them deprecated for a version. Again, I 
think moving those methods into a more private place is a good task to do along 
with changing how JobClient calls into this stuff.
+1. So are you planning to do in the next version or in this patch itself?

bq. I don't use Pipes typically, and it doesn't seem to compile on my Mac. I'll 
try it on a Linux machine, but if it's easy/handy for you, it'd be great to 
verify that bug.
Will do so.

bq. TestDistributedCacheManager * Code related to

[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-31 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737621#action_12737621
 ] 

Philip Zeyliger commented on MAPREDUCE-476:
---

Hi Vinod,

Thanks for the ping; I got distracted by other things.  And thanks again for the
detailed review.  My responses are below.  I've generated a patch that shows the
differences between v2 and v4, and also the patch, in a state where it still
depends on MAPREDUCE-711.  Is there anything blocking MAPREDUCE-711 that prevents
it from being committed?

Also, sorry about the multiple uploads here.  I had a very clever bug in there
(caused by not thinking enough while resolving a merge conflict) that deleted
the current working directory, recursively, in one of the tests.  (TaskRunner
is hard-coded to delete current working directory, which is ok, since it's
typically a child process; not ok for LocalJobRunner.)
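
As an illustration only (this is not code from the patch), one way to guard
against this class of bug is to refuse any recursive delete whose target is the
current working directory or one of its ancestors.  The class and method names
below are hypothetical, and the sketch uses java.nio.file rather than Hadoop's
FileSystem API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop.
public class SafeCleanup {

  /** True if deleting target recursively would also remove the working directory. */
  static boolean wouldDeleteWorkingDir(Path target) {
    Path cwd = Paths.get("").toAbsolutePath().normalize();
    Path normalized = target.toAbsolutePath().normalize();
    // The working directory is lost if target is the cwd itself or an ancestor.
    return cwd.startsWith(normalized);
  }

  /** Recursively deletes target unless that would remove the working directory. */
  static void deleteRecursively(Path target) throws IOException {
    if (wouldDeleteWorkingDir(target)) {
      throw new IOException("Refusing to delete " + target
          + ": it contains the current working directory");
    }
    if (!Files.exists(target)) {
      return;
    }
    try (Stream<Path> walk = Files.walk(target)) {
      // Reverse order visits children before their parent directories.
      walk.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

A child process can reasonably skip such a check, because it runs in a
directory created for it; code running in-process, as LocalJobRunner does,
shares its working directory with the caller.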

I've run the "relevant" tests; the full tests take a while, so I'm running those
in the background.

{quote}
$ for i in TestMRWithDistributedCache TestMiniMRLocalFS TestMiniMRDFSCaching \
    TestTrackerDistributedCacheManager; do ant test -Dtestcase=$i > test-out-$i && \
    echo "$i good" || echo "$i bad"; done
TestMRWithDistributedCache good
TestMiniMRLocalFS good
TestMiniMRDFSCaching good
TestTrackerDistributedCacheManager good
{quote}

bq. There is quite a bit of refactoring in this patch, though I find it really 
useful.

Yep.  Having DistributedCache work locally is easy if you refactor the code
a bit, so that's how I went at it.


bq. Please make sure that in the newly added code, lines aren't longer than 80 
characters. For e.g, see DistributedCacheManager.newTaskHandle() method.

A handful of runs of git diff foo..bar | egrep "^\+\+\+|^\+ .{80}" has done the trick,
I think.  The tricky bit is always fixing only the lines I've changed,
and not all the lines in a given file, to preserve history and keep
reviewing sane.
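
For readers who would rather not rely on egrep, a similar check can be sketched
in Java.  This is an approximation of the pipeline above, not a copy of it: it
keeps the "+++" file headers for context and reports any added line whose
content (after the leading "+") is longer than 80 characters.  The class name
and the MAX_WIDTH constant are made up for the example.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hadoop or git.
public class LongAddedLines {
  static final int MAX_WIDTH = 80;

  /** Returns the "+++" file headers plus every added line wider than MAX_WIDTH. */
  static List<String> findLongAddedLines(List<String> diffLines) {
    List<String> hits = new ArrayList<>();
    for (String line : diffLines) {
      if (line.startsWith("+++")) {
        hits.add(line); // file header, kept so each hit can be traced to a file
      } else if (line.startsWith("+") && line.length() - 1 > MAX_WIDTH) {
        hits.add(line); // added line whose content exceeds the limit
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // Reads a unified diff from standard input.
    List<String> diff = new BufferedReader(new InputStreamReader(System.in))
        .lines().collect(Collectors.toList());
    findLongAddedLines(diff).forEach(System.out::println);
  }
}
```

Because only lines starting with "+" are inspected, lines the patch did not
touch are never reported, which is what keeps the cleanup limited to changed
lines.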

bq. Just a thought, can the classes be better renamed to reflect their usage, 
something like TrackerDistributedCacheManager and TaskDistributedCacheManager?

I like those names better; thanks.  Changed.

bq. DistributedCacheManager and DistributedCacheHandle: Explicitly state in 
javadoc that it is not a public interface 

Done.

bq. This class should also have the variable number argument getLocalCache() 
methods so that the corresponding methods in DistributedCache can be 
deprecated. Also, each method in DistributedCache should call the corresponding 
method in DistributedCacheManager class.

Don't think I agree here.  We can deprecate the getLocalCache methods 
in DistributedCache right away.  They delegate to each other, and one of them
delegates to TrackerDistributedCacheManager.  Ideally, I'd remove these
altogether --- Hadoop internally does not use these methods with
this patch, and there's no sensible reason why someone else would,
but since it's public, it's getting deprecated.  But it's not
being deprecated with a pointer to use something else; it's getting
deprecated so that you don't use it at all.
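
A minimal sketch of that shape, for illustration only.  The class names, the
String-based signatures, and the path layout are all hypothetical and much
simpler than the real getLocalCache() overloads; the point is only that the
public overloads are deprecated without naming a replacement, delegate to each
other, and end in a single call to an internal manager.

```java
class DeprecatedCacheFacade {
    /** Hypothetical stand-in for the internal, tracker-side manager. */
    static final class TrackerManager {
        String getLocalCache(String uri, String baseDir, boolean isArchive) {
            String subdir = isArchive ? "archives" : "files";
            // Flatten the URI into a file name; purely illustrative.
            return baseDir + "/" + subdir + "/" + uri.replaceAll("[^A-Za-z0-9.]", "_");
        }
    }

    private static final TrackerManager MANAGER = new TrackerManager();

    /**
     * @deprecated Internal use only. There is no replacement; do not call this.
     */
    @Deprecated
    public static String getLocalCache(String uri, String baseDir) {
        // The shorter overload fills in a default and calls the longer one,
        // never itself.
        return getLocalCache(uri, baseDir, false);
    }

    /**
     * @deprecated Internal use only. There is no replacement; do not call this.
     */
    @Deprecated
    public static String getLocalCache(String uri, String baseDir, boolean isArchive) {
        // Exactly one overload reaches the internal manager.
        return MANAGER.getLocalCache(uri, baseDir, isArchive);
    }
}
```

Keeping a single overload that touches the manager also makes it harder for an
overload to end up calling itself by accident.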


bq. DistributedCacheHandle CacheFile.makeCacheFiles()
bq. isClassPath can be renamed to shouldBePutInClasspath
Renamed to shouldBeAddedToClasspath.

bq. paths can be renamed to pathsToPutInClasspath.
Renamed to pathsToBeAddedToClasspath

bq. Use .equals method at +150 if (cacheFile.type == CacheFile.FileType.ARCHIVE)

I believe that technically it doesn't matter.
The JDK implementation of equals() on java.lang.Enum is final, 
and hardcoded to "this==other".  This is the only thing that makes
sense, since there's only ever one instance of a given enum constant.
I took a rough look at the code base, and == is the more
common option.
{quote}
  # Inaccurate!  Not a static analysis!  Not even close! ;)
  [1]doorstop:hadoop-mapreduce(140142)$ack "\([a-zA-Z]*\.[a-zA-Z]*\.equals" src | wc -l
  11
  [0]doorstop:hadoop-mapreduce(140143)$ack "\([a-zA-Z]*\.[a-zA-Z]* ==" src | wc -l
  127
{quote}
If you feel strongly about this, happy to change it, but I think == is
more consistent.
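
To make the comparison concrete, here is a small self-contained sketch.  The
FileType enum below is a hypothetical stand-in for CacheFile.FileType, and the
helper names are made up for the example.  Both forms give the same answer for
every constant; the practical differences are that == is checked for type
compatibility at compile time and needs no care about which side may be null.

```java
class EnumEqualityDemo {
    /** Hypothetical stand-in for CacheFile.FileType. */
    enum FileType { REGULAR, ARCHIVE }

    static boolean isArchiveByIdentity(FileType type) {
        // Null-safe: a null reference simply compares unequal.
        return type == FileType.ARCHIVE;
    }

    static boolean isArchiveByEquals(FileType type) {
        // Putting the constant on the left keeps this null-safe as well;
        // type.equals(FileType.ARCHIVE) would throw on a null type.
        return FileType.ARCHIVE.equals(type);
    }

    public static void main(String[] args) {
        for (FileType type : FileType.values()) {
            System.out.println(type + ": identity=" + isArchiveByIdentity(type)
                + ", equals=" + isArchiveByEquals(type));
        }
    }
}
```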

bq. makeCacheFiles: boolean isArchive -> FileType fileType

done.

bq. I think it would be cleaner to return target instead of passing it as an 
argument.

Done.

bq. makeCacheFiles() method should be documented

Done.
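
For readers without the patch at hand, the three changes above (an enum in
place of the boolean, a returned list in place of an output argument, and
javadoc) can be sketched roughly as follows.  This is not the code in the
patch: the CacheFile fields, the use of String paths instead of Hadoop's Path,
and the rule that a file goes on the classpath when its URI path appears in
pathsToBeAddedToClasspath are assumptions made to keep the example
self-contained.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CacheFileSketch {
    enum FileType { REGULAR, ARCHIVE }

    /** Hypothetical value class describing one cache entry. */
    static final class CacheFile {
        final URI uri;
        final FileType type;
        final long timestamp;
        final boolean shouldBeAddedToClasspath;

        CacheFile(URI uri, FileType type, long timestamp,
                  boolean shouldBeAddedToClasspath) {
            this.uri = uri;
            this.type = type;
            this.timestamp = timestamp;
            this.shouldBeAddedToClasspath = shouldBeAddedToClasspath;
        }
    }

    /**
     * Builds one CacheFile per URI.
     *
     * @param uris       files or archives to cache; null means "none"
     * @param timestamps modification times, parallel to uris
     * @param pathsToBeAddedToClasspath URI paths that belong on the classpath
     * @param fileType   whether the entries are plain files or archives
     * @return a new list; the caller owns it
     */
    static List<CacheFile> makeCacheFiles(URI[] uris, String[] timestamps,
            String[] pathsToBeAddedToClasspath, FileType fileType) {
        List<CacheFile> result = new ArrayList<>();
        if (uris == null) {
            return result;
        }
        if (timestamps == null || timestamps.length != uris.length) {
            throw new IllegalArgumentException("need one timestamp per URI");
        }
        Set<String> classpath = new HashSet<>();
        if (pathsToBeAddedToClasspath != null) {
            classpath.addAll(Arrays.asList(pathsToBeAddedToClasspath));
        }
        for (int i = 0; i < uris.length; i++) {
            boolean onClasspath = classpath.contains(uris[i].getPath());
            result.add(new CacheFile(uris[i], fileType,
                Long.parseLong(timestamps[i]), onClasspath));
        }
        return result;
    }
}
```

Returning the list lets a caller combine the results for files and archives
without first allocating a target to pass in.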

bq. setup() This method is really useful, avoids a lot of code duplication!

Ok.

bq. Leave localTaskFile writing business back in TaskRunner itself. I think it 
is the task's responsibility, not the DistributedCacheHandle's

Good call; done.

bq. cacheSubdir can better be an argument to setup() method instead of passing 
it to the constructor.

Good idea; done.

bq. getClassPaths() : Document that it has to be called and useful only when is 
already invoked.

Done.  I've made it throw an exception if it's called erroneously, since I could
see that causing trouble for developers.
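
A minimal sketch of that kind of guard.  The review comment does not name the
method that must run first, so this example assumes it is setup(); the class
name, the fields, and the choice of IllegalStateException are likewise
assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SetupGuardSketch {
    private final List<String> classPaths = new ArrayList<>();
    private boolean setupCalled = false;

    /** Records the localized jars; stands in for the real localization work. */
    public void setup(List<String> localizedJars) {
        classPaths.addAll(localizedJars);
        setupCalled = true;
    }

    /** Fails loudly instead of quietly returning an empty list. */
    public List<String> getClassPaths() {
        if (!setupCalled) {
            throw new IllegalStateException(
                "getClassPaths() must be called after setup()");
        }
        return Collections.unmodifiableList(classPaths);
    }
}
```

Failing fast turns an ordering mistake into an immediate, named error rather
than a class that mysteriously cannot be found later.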

bq. TaskTracker.initialize() A new DistributedCacheManager 

[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-30 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737391#action_12737391
 ] 

Philip Zeyliger commented on MAPREDUCE-476:
---

Never mind: I was trying to rush before leaving the office, and the tests fail 
here.  Back tomorrow.

> extend DistributedCache to work locally (LocalJobRunner)
> 
>
> Key: MAPREDUCE-476
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-476
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: sam rash
>Assignee: Philip Zeyliger
>Priority: Minor
> Attachments: HADOOP-2914-v1-full.patch, 
> HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, 
> MAPREDUCE-476-v2-vs-v3.patch, MAPREDUCE-476-v2-vs-v3.try2.patch, 
> MAPREDUCE-476-v2.patch, MAPREDUCE-476-v3.patch, MAPREDUCE-476-v3.try2.patch, 
> MAPREDUCE-476.patch
>
>
> The DistributedCache does not work locally when using the outlined recipe at 
> http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html
>  
> Ideally, LocalJobRunner would take care of populating the JobConf and copying 
> remote files to the local file system (http, assume hdfs = default fs = local 
> fs when doing local development.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-30 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737385#action_12737385
 ] 

Philip Zeyliger commented on MAPREDUCE-476:
---

Vinod,

Yes.  I've been hacking away at it today.  Please ignore those last two updated 
diffs: while getting rid of some 80+ character lines, I fumbled some git stuff 
and produced bad patches.  I'll be producing good ones after some more sanity 
checking either late today or tomorrow morning.

-- Philip

> extend DistributedCache to work locally (LocalJobRunner)
> 
>
> Key: MAPREDUCE-476
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-476
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: sam rash
>Assignee: Philip Zeyliger
>Priority: Minor
> Attachments: HADOOP-2914-v1-full.patch, 
> HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, 
> MAPREDUCE-476-v2-vs-v3.patch, MAPREDUCE-476-v2.patch, MAPREDUCE-476-v3.patch, 
> MAPREDUCE-476.patch
>
>
> The DistributedCache does not work locally when using the outlined recipe at 
> http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html
>  
> Ideally, LocalJobRunner would take care of populating the JobConf and copying 
> remote files to the local file system (http, assume hdfs = default fs = local 
> fs when doing local development.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-30 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737042#action_12737042
 ] 

Vinod K V commented on MAPREDUCE-476:
-

Philip, any update on this issue?

> extend DistributedCache to work locally (LocalJobRunner)
> 
>
> Key: MAPREDUCE-476
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-476
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: sam rash
>Assignee: Philip Zeyliger
>Priority: Minor
> Attachments: HADOOP-2914-v1-full.patch, 
> HADOOP-2914-v1-since-4041.patch, HADOOP-2914-v2.patch, HADOOP-2914-v3.patch, 
> MAPREDUCE-476-v2.patch, MAPREDUCE-476.patch
>
>
> The DistributedCache does not work locally when using the outlined recipe at 
> http://hadoop.apache.org/core/docs/r0.16.0/api/org/apache/hadoop/filecache/DistributedCache.html
>  
> Ideally, LocalJobRunner would take care of populating the JobConf and copying 
> remote files to the local file system (http, assume hdfs = default fs = local 
> fs when doing local development.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-17 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732547#action_12732547
 ] 

Philip Zeyliger commented on MAPREDUCE-476:
---

Vinod,

Thanks for your comments and thorough review.  I'll take a closer look over the 
next couple of days and post a new patch.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-17 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732482#action_12732482
 ] 

Vinod K V commented on MAPREDUCE-476:
-

Another point. We should consolidate TestMRWithDistributedCache, 
TestMiniMRLocalFS and TestMiniMRDFSCaching. That's a lot of code in three 
different files testing mostly common code. Maybe in another JIRA issue, if you 
prefer.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-17 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732474#action_12732474
 ] 

Vinod K V commented on MAPREDUCE-476:
-

Looked at the remaining parts of the patch. Some more comments.

DistributedCacheHandle.makeClassLoader
 - use File.toURI().toURL() instead of File.toURL() directly. File.toURL() is 
deprecated.
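
A minimal sketch of the suggested conversion, for illustration.  The class and
method names are hypothetical and this is not the body of makeClassLoader in
the patch.  The reason the JDK deprecates File.toURL() is that it does not
escape characters that are illegal in URLs, such as spaces, whereas going
through toURI() does.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

class ClassLoaderSketch {
    /** Converts local files to URLs via URI so that special characters are escaped. */
    static URL[] toUrls(List<File> files) throws MalformedURLException {
        URL[] urls = new URL[files.size()];
        for (int i = 0; i < urls.length; i++) {
            urls[i] = files.get(i).toURI().toURL();
        }
        return urls;
    }

    /** Builds a class loader over the given files, delegating to parent first. */
    static ClassLoader makeClassLoader(List<File> files, ClassLoader parent)
            throws MalformedURLException {
        return new URLClassLoader(toUrls(files), parent);
    }
}
```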

LocalJobRunner.java
 - TaskRunner.setupWorkDir needs to be fixed to have a workDir arg to help the 
code in LocalJobRunner. Compilation (ant binary) breaks now.
 - TODO: Test pipes with LocalJobRunner and make sure that it works, i.e. 
verify HADOOP-5174.
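
A rough sketch of what taking the directory as an argument buys, assuming a
hypothetical WorkDirSketch class rather than the real TaskRunner.  The real
method's signature and behavior may differ; the point is only that the caller
names the directory to reset, so an in-process runner can never wipe the JVM's
own current working directory by accident.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

class WorkDirSketch {
    /** Empties workDir (creating it if needed); never touches any other directory. */
    static void setupWorkDir(Path workDir) throws IOException {
        if (workDir == null) {
            throw new IllegalArgumentException("workDir must be given explicitly");
        }
        deleteRecursively(workDir);
        Files.createDirectories(workDir);
    }

    /** Deletes p and, if it is a real directory, everything beneath it. */
    private static void deleteRecursively(Path p) throws IOException {
        // Do not follow symlinks: a link is removed, its target is left alone.
        if (Files.isDirectory(p, LinkOption.NOFOLLOW_LINKS)) {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(p)) {
                for (Path child : children) {
                    deleteRecursively(child);
                }
            }
        }
        Files.deleteIfExists(p);
    }
}
```

A child-process task would pass its own working directory, while
LocalJobRunner would pass a per-task directory it created itself.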

TestDistributedCacheManager
 - Code related to setFileStamps in JobClient.java (+636 to +656) and 
testManagerFlow() (+71 to +74) can be refactored into an internal-only method 
in DistributedCacheManager.
 - Minor: assertNotNull(+90) check can be moved after +91 and use 
localCacheFiles for the null check

TestMRWithDistributedCache
 - Just documenting a TODO: Will need to fix this class's javadoc once 
MAPREDUCE-711 is in.
 - Fix the comment at +84. It is actually two archives.
 - Please put a comment for the class saying that it is NOT A FAST test, 
keeping in view the recent efforts to have a fast test target.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-16 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731902#action_12731902
 ] 

Vinod K V commented on MAPREDUCE-476:
-

I have yet to look at the changes in LocalJobRunner (the real part of the patch 
:) ) and the test-cases.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-16 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731901#action_12731901
 ] 

Vinod K V commented on MAPREDUCE-476:
-

Some more comments:

TaskTracker.initialize()
 - A new DistributedCacheManager is created every time, so old files will not 
be deleted by the subsequent purgeCache. Either we should have only one 
cacheManager for a TaskTracker or DistributedCacheManager.cachedArchives should 
be static.
 - The same problem exists with the deprecated purgeCache() method in 
DistributedCache.java
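
The problem and the first of the two suggested fixes can be sketched as
follows.  The classes here are simplified, hypothetical stand-ins (a map from
URI to local path, and a purgeCache() that reports how many entries it
dropped); they are not the classes in the patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PurgeSketch {
    /** Simplified manager: remembers what it localized so it can purge it later. */
    static final class CacheManager {
        private final Map<String, String> cachedArchives = new ConcurrentHashMap<>();

        void localize(String uri) {
            cachedArchives.put(uri, "/local/cache/" + cachedArchives.size());
        }

        /** Forgets every entry and returns how many were purged. */
        int purgeCache() {
            int purged = cachedArchives.size();
            cachedArchives.clear();
            return purged;
        }
    }

    /** Tracker that keeps one manager for its whole lifetime. */
    static final class Tracker {
        private final CacheManager manager = new CacheManager();

        CacheManager manager() {
            return manager;
        }

        /** Re-initialization purges what the same manager localized earlier. */
        int initialize() {
            return manager.purgeCache();
        }

        /** The problematic variant: a fresh manager has nothing to purge. */
        int initializeWithFreshManager() {
            return new CacheManager().purgeCache();
        }
    }
}
```

With two files localized, initializeWithFreshManager() purges nothing, while
initialize() purges both, which is the behavior the comment above asks for.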

JobClient.java 
 - What happens to the code in here? Should this be refactored too, or are we 
going to do something else about it?

DistributedCache.java
 - getLocalCache() (+190) should be indirected to 
DistributedCacheManager.getLocalCache(). Otherwise there is a stack overflow 
error - the method calls itself as of now.
 - I think most of the getter/setter methods can be deprecated and moved to 
DistributedCacheManager. Or at least we should give it some thought. For 
most of them, I do see from the javadoc that they are only for internal use 
anyway.




[jira] Commented: (MAPREDUCE-476) extend DistributedCache to work locally (LocalJobRunner)

2009-07-16 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731899#action_12731899
 ] 

Vinod K V commented on MAPREDUCE-476:
-

I've started looking at your latest patch in combination with an intermediate 
patch for MAPREDUCE-711.  There is quite a bit of refactoring in this patch, 
though I find it really useful. Just to be sure, I will call this out 
explicitly to Devaraj/Owen offline.

I have some comments in relation to the patch.

General:
 - Please make sure that in the newly added code, lines aren't longer than 80 
characters. For example, see the DistributedCacheManager.newTaskHandle() method.
 - Just a thought, can the classes be better renamed to reflect their usage, 
something like TrackerDistributedCacheManager and TaskDistributedCacheManager?

DistributedCacheManager
 - Explicitly state in javadoc that it is not a public interface :)
 - This class should also have the variable number argument getLocalCache() 
methods so that the corresponding methods in DistributedCache can be 
deprecated. Also, each method in DistributedCache should call the corresponding 
method in DistributedCacheManager class.
 
DistributedCacheHandle
  - Explicitly state in javadoc that it is not a public interface

  - CacheFile.makeCacheFiles()
    -- isClassPath can be renamed to shouldBePutInClasspath
    -- paths can be renamed to pathsToPutInClasspath.
    -- Use .equals method at +150
       {code}if (cacheFile.type == CacheFile.FileType.ARCHIVE){code}
    -- {code}static void makeCacheFiles(URI[] uris, String[] timestamps, Path[] paths,
       boolean isArchive, List target){code} should instead be
       {code}static void makeCacheFiles(URI[] uris, String[] timestamps, Path[] paths,
       FileType fileType, List target){code}
    -- I think it would be cleaner to return target instead of passing it as 
       an argument.
    -- makeCacheFiles() method should be documented

  - setup()
    -- This method is really useful, avoids a lot of code duplication!
    -- Leave localTaskFile writing business back in TaskRunner itself. I think 
       it is the task's responsibility, not the DistributedCacheHandle's
    -- cacheSubdir can better be an argument to setup() method instead of 
       passing it to the constructor.

  - getClassPaths() : Document that it has to be called and useful only when is 
already invoked.
