
Yes, this is an unfortunate edge case. However, it is fixed in the
trunk/2.x client rewrite and is now tracked as a test by

On Fri, Oct 5, 2012 at 10:28 PM, Bertrand Dechoux <> wrote:
> Hi,
> I am launching my job using the command line and I observed that when the
> provided input path does not match any files, the jar in the staging
> repository is not removed.
> It is removed on job termination (success or failure) but here the job isn't
> even really started so it may be an edge case.
> Has anyone seen the same behaviour? (I am using 1.0.3)
> Here is an extract of the stack trace with hadoop related classes.
>> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
>> does not exist: [removed]
>>         at
>> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(
>>         at
>> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(
>>         at
>> org.apache.hadoop.mapred.JobClient.writeNewSplits(
>>         at
>> org.apache.hadoop.mapred.JobClient.writeSplits(
>>         at
>> org.apache.hadoop.mapred.JobClient.access$500(
>>         at org.apache.hadoop.mapred.JobClient$
>>         at org.apache.hadoop.mapred.JobClient$
>>         at Method)
>>         at
>>         at
>>         at
>> org.apache.hadoop.mapred.JobClient.submitJobInternal(
>>         at org.apache.hadoop.mapreduce.Job.submit(
>>         at org.apache.hadoop.mapreduce.Job.waitForCompletion(
> The second question is somewhat related, because one of its consequences
> would nullify the impact of the above 'bug'.
> Is it possible to directly set the main job jar to a jar already inside
> From what I know, the configuration points to a local jar archive which is
> uploaded each time to the staging repository.
> The same question was asked in the jira but without clear resolution.
> My question might be related to
> which is resolved for next version. But it seems to be only about uberjar
> and I am using a standard jar.
> If it works with an HDFS location, what are the details? Won't it be cleaned
> up during job termination? Why not? Will it also be set up within the
> distributed cache?
> Regards
> Bertrand
> PS: I know there are other solutions to my problem. I will look at Oozie.
> And worst case, I can create a FileSystem instance myself to check whether
> the job should really be launched or not. Both could work but both seem
> overkill in my context.
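
For the manual check mentioned in the PS, a minimal sketch could look like the
following. It assumes hadoop-common is on the classpath and that the input is
passed as a single path or glob string. The class name InputPrecheck and the
method hasInput are made up for illustration, and the check only approximates
what FileInputFormat does (it does not apply the hidden-file filter).

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper, not part of Hadoop: checks for input before submission.
public class InputPrecheck {

    /** Returns true if the path or glob pattern matches at least one entry. */
    public static boolean hasInput(Configuration conf, String pattern) throws IOException {
        Path path = new Path(pattern);
        // Resolve the file system from the path's scheme (hdfs://, file://, ...).
        FileSystem fs = path.getFileSystem(conf);
        // globStatus returns null for a non-glob path that does not exist,
        // and an empty array for a glob that matches nothing.
        FileStatus[] matches = fs.globStatus(path);
        return matches != null && matches.length > 0;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (!hasInput(conf, args[0])) {
            System.err.println("No input matches " + args[0] + "; not submitting the job.");
            System.exit(1);
        }
        // Build and submit the Job here, for example via job.waitForCompletion(true).
    }
}
```

Since the check runs before submission, nothing is copied to the staging
directory when the input is missing, at the cost of one extra round trip to
the file system.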

Harsh J
