Rohini Palaniswamy created PIG-5106:
---------------------------------------

             Summary: Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
                 Key: PIG-5106
                 URL: https://issues.apache.org/jira/browse/PIG-5106
             Project: Pig
          Issue Type: Bug
            Reporter: Rohini Palaniswamy
             Fix For: 0.17.0


Many of our classes extending InputFormat override listStatus as follows:

{code}
/*
 * This is to support multi-level/recursive directory listing until
 * MAPREDUCE-1577 is fixed.
 */
@Override
protected List<FileStatus> listStatus(JobContext job) throws IOException {
    return MapRedUtil.getAllFileRecursively(super.listStatus(job),
            job.getConfiguration());
}
{code}

Now that we have dropped Hadoop 1.x, the method body can be optimized to:

{code}
if (getInputDirRecursive(job)) {
    return super.listStatus(job);
} else {
    /*
     * mapreduce.input.fileinputformat.input.dir.recursive is not true
     * by default for backward compatibility reasons.
     */
    return MapRedUtil.getAllFileRecursively(super.listStatus(job),
            job.getConfiguration());
}
{code}

That would avoid one extra iteration over the file listing when users set
mapreduce.input.fileinputformat.input.dir.recursive to true.
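
To make the saved iteration concrete, here is a minimal, self-contained sketch in plain Java. It is a simplified model and not the Pig or Hadoop code: Node, baseListStatus, expandAll, sampleInputs and the entriesVisited counter are all hypothetical stand-ins (for FileStatus, super.listStatus, MapRedUtil.getAllFileRecursively, and so on), and it assumes the base listing already returns only files when the recursive flag is on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecursiveListingSketch {

    /** Hypothetical stand-in for FileStatus: a file, or a directory with children. */
    static final class Node {
        final String name;
        final List<Node> children; // null for a plain file

        Node(String name, List<Node> children) {
            this.name = name;
            this.children = children;
        }

        boolean isDirectory() {
            return children != null;
        }
    }

    /** Counts entries touched by the extra expansion pass. */
    static int entriesVisited = 0;

    /** Adds every file below dir, at any depth, to out. */
    static void collectFiles(Node dir, List<Node> out) {
        for (Node child : dir.children) {
            if (child.isDirectory()) {
                collectFiles(child, out);
            } else {
                out.add(child);
            }
        }
    }

    /** Models super.listStatus(job): descends into subdirectories only when recursive is set. */
    static List<Node> baseListStatus(List<Node> inputs, boolean recursive) {
        List<Node> result = new ArrayList<>();
        for (Node input : inputs) {
            if (!input.isDirectory()) {
                result.add(input);
                continue;
            }
            for (Node child : input.children) {
                if (child.isDirectory() && recursive) {
                    collectFiles(child, result);
                } else {
                    result.add(child); // subdirectories are returned unexpanded
                }
            }
        }
        return result;
    }

    /** Models the extra pass: walks every listed entry and expands directories. */
    static List<Node> expandAll(List<Node> entries) {
        List<Node> result = new ArrayList<>();
        for (Node entry : entries) {
            entriesVisited++;
            if (entry.isDirectory()) {
                collectFiles(entry, result);
            } else {
                result.add(entry);
            }
        }
        return result;
    }

    /** Models the proposed override: skip the extra pass when the base listing already recursed. */
    static List<Node> listStatus(List<Node> inputs, boolean recursive) {
        List<Node> listed = baseListStatus(inputs, recursive);
        return recursive ? listed : expandAll(listed);
    }

    /** Sample tree: root/{a.txt, sub/{b.txt, deep/{c.txt}}}. */
    static List<Node> sampleInputs() {
        Node deep = new Node("deep", Arrays.asList(new Node("c.txt", null)));
        Node sub = new Node("sub", Arrays.asList(new Node("b.txt", null), deep));
        Node root = new Node("root", Arrays.asList(new Node("a.txt", null), sub));
        return Arrays.asList(root);
    }

    public static void main(String[] args) {
        for (boolean recursive : new boolean[] {true, false}) {
            entriesVisited = 0;
            int files = listStatus(sampleInputs(), recursive).size();
            System.out.println("recursive=" + recursive + " files=" + files
                    + " extraPassEntries=" + entriesVisited);
        }
    }
}
```

In this model both settings yield the same set of files; the only difference is that the recursive case never enters expandAll, which is the pass the proposed change skips.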



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
