The value you're specifying for io.serializations below is incorrect:
<property>
<name>io.serializations</name>
<value>org.apache.avro.mapred.AvroSerialization,
avro.serialization.key.reader.schema,
avro.serialization.value.reader.schema,
avro.serialization.key.writer.schema,avro.serialization.value.writer.schema
</value>
</property>
If the goal is to include org.apache.avro.mapred.AvroSerialization,
then it should look more like:
<property>
<name>io.serializations</name>
<value>org.apache.hadoop.io.serializer.WritableSerialization,org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization,org.apache.hadoop.io.serializer.avro.AvroReflectSerialization,org.apache.avro.mapred.AvroSerialization</value>
</property>
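That is, it must be an extension of the default values, and not a
replacement of them. The avro.serialization.*.schema entries in your
value also look like configuration property names rather than
serialization classes, so they should not appear in this list.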
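
To illustrate "extend, don't replace", here is a minimal plain-Java sketch of building the value. The class and method names (SerializationList, extend) are made up for this example, and the default list is simply copied from the corrected property above; check the defaults that ship with your Hadoop version before relying on it.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Hadoop or Avro.
final class SerializationList {

    // Assumed defaults, copied from the corrected io.serializations value above.
    static final String DEFAULTS =
        "org.apache.hadoop.io.serializer.WritableSerialization,"
        + "org.apache.hadoop.io.serializer.avro.AvroSpecificSerialization,"
        + "org.apache.hadoop.io.serializer.avro.AvroReflectSerialization";

    // Appends extraClass to a comma-separated list, keeping the existing
    // entries in order and skipping blanks and duplicates.
    static String extend(String current, String extraClass) {
        Set<String> entries = new LinkedHashSet<>();
        for (String entry : current.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        entries.add(extraClass.trim());
        return String.join(",", entries);
    }

    public static void main(String[] args) {
        // Prints the extended list to paste into the <value> element.
        System.out.println(extend(DEFAULTS, "org.apache.avro.mapred.AvroSerialization"));
    }
}
```

The point is only that the existing entries survive and the Avro class is added at the end; in an Oozie workflow.xml you would still write the resulting string into the property by hand.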
On Wed, Mar 13, 2013 at 4:05 AM, M, Paul pa...@iqt.org wrote:
Hello,
I am trying to run an M/R job with Avro serialization via Oozie. I've made
some progress in the workflow.xml, however I am still running into the
following error. Any thoughts? I believe it may have to do with the
io.serializations property below. FYI, I am using CDH 4.2.0 mr1.
2013-03-12 15:24:32,334 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_20130318_0080_m_00_3: java.lang.NullPointerException
    at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:356)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:389)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1407)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
<action name="mr-node">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/${wf:user()}/${outputDir}" />
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<property>
<name>mapreduce.reduce.class</name>
<value>org.apache.avro.mapred.HadoopReducer</value>
</property>
<property>
<name>mapreduce.map.class</name>
<value>org.apache.avro.mapred.HadoopMapper</value>
</property>
<property>
<name>avro.reducer</name>
<value>org.my.project.mapreduce.CombineAvroRecordsByHourReducer
</value>
</property>
<property>
<name>avro.mapper</name>
<value>org.my.project.mapreduce.ParseMetadataAsTextIntoAvroMapper
</value>
</property>
<property>
<name>mapreduce.inputformat.class</name>
<value>org.my.project.mapreduce.NonSplitableInputFormat</value>
</property>
<!-- Key Value Mapper -->
<property>
<name>avro.output.schema</name>
<value>{type:record,name:Pair,namespace:org.apache.avro.mapred,fields:...}]}
</value>
</property>
<property>
<name>mapred.mapoutput.key.class</name>
<value>org.apache.avro.mapred.AvroKey</value>
</property>
<property>
<name>mapred.mapoutput.value.class</name>
<value>org.apache.avro.mapred.AvroValue</value>
</property>
<property>
<name>avro.schema.output.key</name>
<value>{type:record,name:DataRecord,namespace:...]}]}
</value>
</property>
<property>
<name>mapreduce.outputformat.class</name>
<value>org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
</value>
</property>
<property>
<name>mapred.output.key.comparator.class</name>
<value>org.apache.avro.mapred.AvroKeyComparator</value>
</property>
<property>
<name>io.serializations</name>
<value>org.apache.avro.mapred.AvroSerialization,
avro.serialization.key.reader.schema,
avro.serialization.value.reader.schema,
avro.serialization.key.writer.schema,avro.serialization.value.writer.schema
</value>
</property>
<property>
<name>mapred.map.tasks</name>
<value>1</value>
</property>
<!-- Input/Output -->
<property>
<name>mapred.input.dir</name>
<value>/user/${wf:user()}/input/</value>
</property>
<property>
<name>mapred.output.dir</name>
<value>/user/${wf:user()}/${outputDir}</value>
</property>
</configuration>
</map-reduce>
--
Harsh J