AvroStorage fails to STORE when LOADing via PigStorage
------------------------------------------------------
Key: PIG-2195
URL: https://issues.apache.org/jira/browse/PIG-2195
Project: Pig
Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham

Reading data via {{PigStorage}} and writing it via {{AvroStorage}} fails with an exception like this:

{{java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord}}

The Pig script in this section of the documentation shows an example like this that fails:
http://linkedin.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data#AvroStorage-PigsupportforAvrodata-A.Howtostoredataindifferentways.

A workaround currently exists to produce Avro from TSVs like this:

{noformat}
avro = LOAD 'inputPath/' AS (foo);
STORE avro INTO 'outputPath/' USING oap.piggybank.storage.avro.AvroStorage(
'{"data":"data_file.avro", "same":"data_file.avro", "field0":"def:bar"}');
{noformat}

This is redundant, though, since {{data}} and {{same}} seem to indicate the same thing. This approach also requires that an Avro data file already exist.

This patch will make the following alternate constructor syntaxes work as well.

# Read schema from an existing data file:
{noformat}
'{"data":"data_file.avro", "field0":"def:bar"}');
{noformat}
# Read schema from an existing schema file:
{noformat}
'{"schema_file":"data_file.avsc", "field0":"def:bar"}');
{noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira