----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6536/#review10529 -----------------------------------------------------------
Ship it! +1. Will run the tests again. - Santhosh Srinivasan On Aug. 20, 2012, 6:40 a.m., Cheolsoo Park wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6536/ > ----------------------------------------------------------- > > (Updated Aug. 20, 2012, 6:40 a.m.) > > > Review request for pig. > > > Description > ------- > > Allow recursive records to be loaded/stored by AvroStorage. > > The changes include: > > 1) Remove the recursive record check from AvroSchema2Pig. > 2) Modofy inconvert() in AvroSchema2Pig so that it can map recursive records > to bytearrays. > 3) Modify containsGenericUnion() in AvroStorageUtils so that it can handle > Avro schema that contains recursive records. > 4) Update the parameter parsing in AvroStorage so that 'no_schema_check' can > be passed to both LoadFunc and StoreFunc. > 5) Add the recursive record check to AvroSchemaManager. This is needed > because 'schema_file' and 'data' cannot refer to avro schema that contains > recursive records. > > AvroStorage works as follows: > > 1) PigSchema maps recursive records to bytearrays, so there is discrepancy > between Avro schema and Pig schema. > 2) Recursive records are loaded as tuples even though Pig schema defines them > as bytearrays and can be referred to by position (e.g. $0, $1.$0, etc). > 3) To store recursive records, Avro schema must be provided via the 'schema' > or 'same' parameter in StoreFunc. In addition, 'no_schema_check' must be > enabled because otherwise schema check will fail due to discrepancy between > Avro schema and Pig schema. > 4) Avro schema cannot be specified by the 'data' or 'schema_file' parameter. > This is because AvroSchemaManager cannot handle recursive records for now. > The recursive record check is added to AvroSchemaManager, so if Avro schema > that contains recursive records is specified by these parameters, an > exception is thrown. > > > This addresses bug PIG-2875. > https://issues.apache.org/jira/browse/PIG-2875 > > > Diffs > ----- > > > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroSchema2Pig.java > 6b1d2a1 > > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroSchemaManager.java > 1939d3e > > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java > c9f7d81 > > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java > e24b495 > > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java > 2fab3f7 > > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorageUtils.java > 040234f > > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_generic_union_schema.avro > 4e23e73 > > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_recursive_schema.avro > 12a36f8 > > Diff: https://reviews.apache.org/r/6536/diff/ > > > Testing > ------- > > New test cases are added as follows: > > 1) Load/store Avro files that contain recursive records in array, map, union, > and another record. > 2) Load Avro files that contains recursive records, generate new relations, > apply filters, and store them as non-recursive records. > 3) Tests for the StoreFunc parameters: no_schema_check, schema, same, > schema_file, and data. > > > Thanks, > > Cheolsoo Park > >