[ https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017641#comment-14017641 ]

Remus Rusanu commented on MAPREDUCE-5196:
-----------------------------------------

Hi [~curino],

Can you shed some light on the rationale of this change:
{code}
@@ -1098,8 +1120,8 @@ private long calculateOutputSize() throws IOException {
     if (isMapTask() && conf.getNumReduceTasks() > 0) {
       try {
         Path mapOutput =  mapOutputFile.getOutputFile();
-        FileSystem localFS = FileSystem.getLocal(conf);
-        return localFS.getFileStatus(mapOutput).getLen();
+        FileSystem fs = mapOutput.getFileSystem(conf);
+        return fs.getFileStatus(mapOutput).getLen();
       } catch (IOException e) {
         LOG.warn ("Could not find output size " , e);
       }
{code}
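This breaks Windows deployments: the local map output file gets routed through HDFS, presumably because the path carries no scheme and {{mapOutput.getFileSystem(conf)}} therefore resolves it against the default file system: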
{code}
c:/Hadoop/Data/Hadoop/local/usercache/HadoopUser/appcache/application_1401693085139_0001/output/attempt_1401693085139_0001_m_000000_0/file.out is not a valid DFS filename.
       at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:187)
       at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:101)
       at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1024)
       at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1020)
       at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
       at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1020)
       at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:1124)
       at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:1102)
{code}
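
To illustrate the suspected cause, here is a simplified, standard-library-only model of the resolution rule. It is a sketch, not Hadoop code: the class {{SchemeSketch}}, the helper {{effectiveScheme}}, and the {{hdfs://namenode:8020}} default file system URI are all hypothetical names chosen for this example. It assumes that a path without a scheme falls back to the default file system, and that a leading Windows drive letter is not treated as a scheme.

```java
import java.net.URI;

public class SchemeSketch {
    // Hypothetical helper (not a Hadoop API): returns the scheme that would
    // select the file system for a path, falling back to the default file
    // system's scheme when the path itself carries none.
    static String effectiveScheme(String path, URI defaultFs) {
        // Assumption: a leading drive letter such as "c:" is a Windows drive,
        // not a URI scheme.
        boolean hasDrive = path.matches("^/?[a-zA-Z]:.*");
        int colon = path.indexOf(':');
        int slash = path.indexOf('/');
        // A scheme is whatever precedes the first ':' if that ':' comes
        // before the first '/'.
        boolean hasScheme = !hasDrive && colon > 0 && (slash == -1 || colon < slash);
        return hasScheme ? path.substring(0, colon) : defaultFs.getScheme();
    }

    public static void main(String[] args) {
        // Assumed default file system for the example.
        URI defaultFs = URI.create("hdfs://namenode:8020");
        // Scheme-less local path: falls back to the default file system.
        System.out.println(effectiveScheme("c:/Hadoop/Data/file.out", defaultFs));
        // Explicitly qualified path: keeps its own scheme.
        System.out.println(effectiveScheme("file:///c:/Hadoop/Data/file.out", defaultFs));
    }
}
```

Under these assumptions the unqualified path is handed to the HDFS client, which matches the stack trace above, while a path qualified with {{file:}} (or a file system obtained via {{FileSystem.getLocal(conf)}}, as in the removed lines) would stay local.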





> CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing 
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5196
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>             Fix For: 3.0.0
>
>         Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch, 
> MAPREDUCE-5196.3.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch
>
>
> This JIRA tracks a checkpoint-based AM preemption policy. The policy handles 
> propagation of the preemption requests received from the RM to the 
> appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the 
> task state is handled in upcoming JIRAs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)