[ https://issues.apache.org/jira/browse/AVRO-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13579545#comment-13579545 ]
Ashish Nagavaram commented on AVRO-1215:
----------------------------------------

Hi Doug,

I assume you are referring to write(k,v,schema,schema,baseoutputfile). It still has a call to create the context; I only moved that call to after the schema has been assigned to the AvroJob.

Thanks, Johannes, for testing the patch.

> AvroMultipleOutputs not working when specifying baseOutputPath
> --------------------------------------------------------------
>
>                 Key: AVRO-1215
>                 URL: https://issues.apache.org/jira/browse/AVRO-1215
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.2
>            Reporter: Matthew Hayes
>            Assignee: Ashish Nagavaram
>              Labels: avro, mapreduce
>         Attachments: AVRO-1215.patch, AVRO-1215.patch, AVRO-1215.patch,
>                      AVRO-1215-v3.patch
>
>
> I'm calling the write() method of AvroMultipleOutputs which takes the
> baseOutputPath. The reducer appears to hang once it tries writing to a
> baseOutputPath value it has not already encountered. It then fails with:
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to
> create file ... because current leaseholder is trying to recreate file.
> I think the problem has to do with this line in AvroMultipleOutputs:
> {code}
> // get the record writer from context output format
> //FileOutputFormat.setOutputName(taskContext, baseFileName);
> {code}
> This line is not commented out in the similar code from Hadoop, so I think
> the baseOutputPath is ignored. As a result, each record writer uses the
> same path when it is created, which leads to the exception.
> Uncommenting this line does not work because of the visibility of the
> method. However, what this method does is set "mapreduce.output.basename".
> But setting this doesn't work either.
> After digging through the Avro code I found that AvroOutputFormatBase is
> using "avro.mo.config.namedOutput" to create the path. If I replace the
> commented-out line with this, it seems to work:
> {code}
> taskContext.getConfiguration().set("avro.mo.config.namedOutput",
> baseFileName);
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira