----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6536/#review10347 -----------------------------------------------------------
Have a few comments. contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroSchemaManager.java <https://reviews.apache.org/r/6536/#comment21961> Can you elaborate on this error message? What is the recommended course of action when the user sees this error message? Minor comment: Please use braces to future proof accidental mistakes in the future, i..e, if () { ... } contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java <https://reviews.apache.org/r/6536/#comment21967> The no_schema_check can now appear in any position, i.e., its no longer to required for it to be the first argument? contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java <https://reviews.apache.org/r/6536/#comment21966> I am a little concerned with this change. For loop counters, to be conservative, I would avoid making changes in the body of the loop. Its hard to detect these changes and harder to maintain. Is there a better way of implementing this? contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java <https://reviews.apache.org/r/6536/#comment21968> Can you use parenthesis to make this more readable? contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java <https://reviews.apache.org/r/6536/#comment22058> Can this be changed to return containsGenericUnion(fs, visitedRecords) to be consistent with the other parts of the code? contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java <https://reviews.apache.org/r/6536/#comment22059> Can this be changed to return containsGenericUnion(fs, visitedRecords) to be consistent with the other parts of the code? contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java <https://reviews.apache.org/r/6536/#comment22126> There should be three rows with the third row being NULL. contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java <https://reviews.apache.org/r/6536/#comment22127> Can you include a message here. Something like "Negative test to test an exception. Should not be succeeding!" contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java <https://reviews.apache.org/r/6536/#comment22128> Can you include a message here. Something like "Negative test to test an exception. Should not be succeeding!" contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java <https://reviews.apache.org/r/6536/#comment22129> Can you include a message here. Something like "Negative test to test an exception. Should not be succeeding!" contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java <https://reviews.apache.org/r/6536/#comment22130> Not sure if this holds true for all cases. This is an either or - i.e., if a sequence of jobs has one failure then the assertion will kick in for the first one which is actually a false alarm. - Santhosh Srinivasan On Aug. 14, 2012, 10:23 a.m., Cheolsoo Park wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6536/ > ----------------------------------------------------------- > > (Updated Aug. 14, 2012, 10:23 a.m.) > > > Review request for pig. > > > Description > ------- > > Allow recursive records to be loaded/stored by AvroStorage. > > The changes include: > > 1) Remove the recursive record check from AvroSchema2Pig. > 2) Modofy inconvert() in AvroSchema2Pig so that it can map recursive records > to bytearrays. > 3) Modify containsGenericUnion() in AvroStorageUtils so that it can handle > Avro schema that contains recursive records. > 4) Update the parameter parsing in AvroStorage so that 'no_schema_check' can > be passed to both LoadFunc and StoreFunc. > 5) Add the recursive record check to AvroSchemaManager. This is needed > because 'schema_file' and 'data' cannot refer to avro schema that contains > recursive records. > > AvroStorage works as follows: > > 1) PigSchema maps recursive records to bytearrays, so there is discrepancy > between Avro schema and Pig schema. > 2) Recursive records are loaded as tuples even though Pig schema defines them > as bytearrays and can be referred to by position (e.g. $0, $1.$0, etc). > 3) To store recursive records, Avro schema must be provided via the 'schema' > or 'same' parameter in StoreFunc. In addition, 'no_schema_check' must be > enabled because otherwise schema check will fail due to discrepancy between > Avro schema and Pig schema. > 4) Avro schema cannot be specified by the 'data' or 'schema_file' parameter. > This is because AvroSchemaManager cannot handle recursive records for now. > The recursive record check is added to AvroSchemaManager, so if Avro schema > that contains recursive records is specified by these parameters, an > exception is thrown. > > > This addresses bug PIG-2875. > https://issues.apache.org/jira/browse/PIG-2875 > > > Diffs > ----- > > > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroSchema2Pig.java > 6b1d2a1 > > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroSchemaManager.java > 1939d3e > > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java > c9f7d81 > > contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java > e24b495 > > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java > 2fab3f7 > > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorageUtils.java > 040234f > > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_generic_union_schema.avro > 4e23e73 > > contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_recursive_schema.avro > 12a36f8 > > Diff: https://reviews.apache.org/r/6536/diff/ > > > Testing > ------- > > New test cases are added as follows: > > 1) Load/store Avro files that contain recursive records in array, map, union, > and another record. > 2) Load Avro files that contains recursive records, generate new relations, > apply filters, and store them as non-recursive records. > 3) Tests for the StoreFunc parameters: no_schema_check, schema, same, > schema_file, and data. > > > Thanks, > > Cheolsoo Park > >