InvalidJobConfException

2012-06-08 Thread huanchen.zhang
Hi,

I'm developing a MapReduce web crawler that reads URL lists and writes the 
fetched HTML to MongoDB.
Each map task reads one URL list file, fetches the HTML, and inserts it into 
MongoDB. There is no reduce phase and the map emits no output. How should I set 
the output directory in this case? If I don't set one, I get the following 
exception:

Exception in thread "main" org.apache.hadoop.mapred.InvalidJobConfException: Output directory not set.
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:872)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at com.ipinyou.data.preprocess.mapreduce.ExtractFeatureFromURLJob.main(ExtractFeatureFromURLJob.java:56)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)


Thank you!

Best,
Huanchen


Re: InvalidJobConfException

2012-06-08 Thread Harsh J
Hi Huanchen,

Just set your output format class to NullOutputFormat
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/output/NullOutputFormat.html
if you don't need any direct outputs to HDFS/etc. from your M/R
classes.
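
As a rough illustration of the suggestion above, a map-only driver might look 
like the sketch below. The class names CrawlJobSketch and CrawlMapper, the job 
name, and the one-URL-per-line input layout are assumptions made for this 
example and are not taken from the original job. Job.getInstance is the 
factory available in newer Hadoop releases; on older releases the equivalent 
is the new Job(conf, name) constructor.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

// Hypothetical driver: a map-only job whose mappers write to an external
// store, so the job itself produces no output on HDFS.
public class CrawlJobSketch {

    // Hypothetical mapper: assumes one URL per input line.
    public static class CrawlMapper
            extends Mapper<LongWritable, Text, NullWritable, NullWritable> {
        @Override
        protected void map(LongWritable offset, Text url, Context context) {
            // Placeholder: fetch the page for url.toString() and insert it
            // into the external store here. Nothing is emitted to the context.
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "url-crawl"); // older releases: new Job(conf, "url-crawl")
        job.setJarByClass(CrawlJobSketch.class);
        job.setMapperClass(CrawlMapper.class);
        job.setNumReduceTasks(0); // map-only job

        // NullOutputFormat discards all records and performs no output-path
        // check, so no output directory has to be configured.
        job.setOutputFormatClass(NullOutputFormat.class);
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(NullWritable.class);

        // args[0] is assumed to be the directory holding the URL list files.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that FileOutputFormat.setOutputPath is never called in this sketch; with 
NullOutputFormat in place the check seen in the stack trace is not run.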




-- 
Harsh J


RE: InvalidJobConfException

2012-06-08 Thread Devaraj k
By default the job uses TextOutputFormat (a subclass of FileOutputFormat), 
which checks that an output path is set. 

You can use NullOutputFormat, or a custom output format of your own that 
doesn't do anything for your job.
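
For the custom-format route, a minimal sketch could look like the following. 
The class name DiscardingOutputFormat is made up for this example. It skips 
the output-path check, drops every record, and borrows the no-op committer 
from NullOutputFormat rather than implementing one. In practice this behaves 
much like NullOutputFormat itself, so it is mainly useful as a starting point 
if you later want the format to do something of its own.

```java
import java.io.IOException;

import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.OutputFormat;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

// Hypothetical custom output format that does nothing for the job.
public class DiscardingOutputFormat<K, V> extends OutputFormat<K, V> {

    @Override
    public RecordWriter<K, V> getRecordWriter(TaskAttemptContext context) {
        // A writer that silently drops every record.
        return new RecordWriter<K, V>() {
            @Override
            public void write(K key, V value) { }

            @Override
            public void close(TaskAttemptContext context) { }
        };
    }

    @Override
    public void checkOutputSpecs(JobContext context) {
        // Intentionally empty: unlike FileOutputFormat, no output
        // directory is required.
    }

    @Override
    public OutputCommitter getOutputCommitter(TaskAttemptContext context)
            throws IOException, InterruptedException {
        // Reuse the no-op committer that NullOutputFormat provides.
        return new NullOutputFormat<K, V>().getOutputCommitter(context);
    }
}
```

It would be registered on the job with 
job.setOutputFormatClass(DiscardingOutputFormat.class).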



Thanks
Devaraj

