Hi, can someone shed some light on the AvroStorage class? I'm passing a JSON Avro schema as a parameter to AvroStorage, and the schema contains an additional attribute, *"extraAttribute"*. The problem is that the Avro file produced by Pig doesn't contain that attribute in its Avro schema.
When I pass the same schema to the Avro API method Schema.parse() <https://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/Schema.Parser.html>, the resulting Avro file does contain extraAttribute inside its Avro schema.

e.g.

    STORE A INTO 'testOutput'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage(
        'schema',
        '{"type":"record","name":"X","fields":[{"name":"b1"},{"name":"b2"},{"name":"b3","extraAttribute":"value"}]}');

Where is extraAttribute removed inside AvroStorage.java? I can't see any code that strips additional attributes. outputAvroSchema is built with Schema.parse(), and the schema looked up from the map goes through Schema.parse() as well.

AvroStorage.java:

    @SuppressWarnings("rawtypes")
    @Override
    public OutputFormat getOutputFormat() throws IOException {
        AvroStorageLog.funcCall("getOutputFormat");

        Properties property = getUDFProperties();
        String allSchemaStr = property.getProperty(AVRO_OUTPUT_SCHEMA_PROPERTY);
        Map<String, String> map = (allSchemaStr != null) ? parseSchemaMap(allSchemaStr) : null;

        String key = getSchemaKey();
        Schema schema = (map == null || !map.containsKey(key))
                ? outputAvroSchema
                : Schema.parse(map.get(key));

        if (schema == null)
            throw new IOException("Output schema is null!");
        AvroStorageLog.details("Output schema=" + schema);

        return new PigAvroOutputFormat(schema);
    }

Any idea? Thank you!
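
For reference, here is a minimal sketch of the check against the plain Avro API, assuming Avro 1.7.x is on the classpath. The class name ExtraAttributeCheck and the helper fieldProp are only illustrative, and the "type":"string" entries are an assumption added so the schema parses, since Avro rejects fields that have no type. It parses the schema and reads the custom attribute back from field b3:

```java
import org.apache.avro.Schema;

public class ExtraAttributeCheck {

    // Same record as in the STORE statement, with "type":"string" added to each
    // field (assumption: Avro requires every field to declare a type).
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"X\",\"fields\":["
        + "{\"name\":\"b1\",\"type\":\"string\"},"
        + "{\"name\":\"b2\",\"type\":\"string\"},"
        + "{\"name\":\"b3\",\"type\":\"string\",\"extraAttribute\":\"value\"}]}";

    // Hypothetical helper: returns the custom property stored on the named
    // field, or null if the field or the property is missing.
    static String fieldProp(String schemaJson, String fieldName, String propName) {
        Schema schema = new Schema.Parser().parse(schemaJson);
        Schema.Field field = schema.getField(fieldName);
        return field == null ? null : field.getProp(propName);
    }

    public static void main(String[] args) {
        // Print the attribute for the field that declares it and for one that does not.
        System.out.println("b3.extraAttribute = " + fieldProp(SCHEMA_JSON, "b3", "extraAttribute"));
        System.out.println("b1.extraAttribute = " + fieldProp(SCHEMA_JSON, "b1", "extraAttribute"));
    }
}
```

Running the same check on the schema read back from the file that Pig writes should show whether the attribute is lost before or after the schema reaches PigAvroOutputFormat.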
