I have defined my own class to use as the OutputCollector value type. It is
declared inside the job file as follows:
public static class Turple
{
    public int name;
    public int ID;
}
In the main method, I call conf.setOutputValueClass(Turple.class) to specify
the output value type for the collector.
Inside the map, combiner, and reduce classes, I simply store some information
in a new Turple and emit it:
Turple turp=new Turple();
turp.name=ddddd;
turp.ID=1111;
output.collect(key,turp);
The output value types of the map, combiner, and reduce classes are all set to
Turple already.
The code compiles successfully, but every run fails with the following
message:
-----------------------------------------------------------------------------------------
08/08/20 14:03:44 INFO mapred.JobClient: Task Id : task_200808161218_0082_m_000001_0, Status : FAILED
java.lang.NullPointerException
at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:373)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:185)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
----------------------------------------------------------------------------------------------------
It seems there is some problem with serialization. I suspect it is caused by
my using a plain Java class as the output value type.
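One possible cause, and this is my assumption, is that Turple does not
implement Hadoop's Writable interface, so no serializer can be found for it.
Below is a minimal sketch of what a serializable version might look like. The
local Writable interface is only a stand-in that mirrors the two methods of
org.apache.hadoop.io.Writable so that the sketch compiles without Hadoop on the
classpath; in the real job the class would implement
org.apache.hadoop.io.Writable instead. The names TurpleSketch and roundTrip are
hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class TurpleSketch {

    // Stand-in (assumption) with the same two methods as
    // org.apache.hadoop.io.Writable, so this compiles without Hadoop.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    public static class Turple implements Writable {
        public int name;
        public int ID;

        // Public no-argument constructor, so instances can be created
        // reflectively before readFields() is called.
        public Turple() {}

        // Write the fields in a fixed order.
        public void write(DataOutput out) throws IOException {
            out.writeInt(name);
            out.writeInt(ID);
        }

        // Read the fields back in the same order they were written.
        public void readFields(DataInput in) throws IOException {
            name = in.readInt();
            ID = in.readInt();
        }
    }

    // Hypothetical helper: serialize a Turple to bytes and read it back
    // into a fresh instance.
    public static Turple roundTrip(Turple original) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));
        Turple copy = new Turple();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        Turple turp = new Turple();
        turp.name = 42;
        turp.ID = 1111;
        Turple copy = roundTrip(turp);
        System.out.println(copy.name + " " + copy.ID); // prints "42 1111"
    }
}
```

With a class like this, the conf.setOutputValueClass(Turple.class) call in the
main method would stay as it is.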
I would appreciate any help. Thanks!
Kunsheng