-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6536/#review10347
-----------------------------------------------------------


Have a few comments.


contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroSchemaManager.java
<https://reviews.apache.org/r/6536/#comment21961>

    Can you elaborate on this error message? What is the recommended course of 
action when the user sees this error message?
    
    Minor comment: Please use braces to future proof accidental mistakes in the 
future, i..e, 
    if () { ...
    }



contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
<https://reviews.apache.org/r/6536/#comment21967>

    The no_schema_check can now appear in any position, i.e., its no longer to 
required for it to be the first argument?



contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
<https://reviews.apache.org/r/6536/#comment21966>

    I am a little concerned with this change. For loop counters, to be 
conservative, I would avoid making changes in the body of the loop. Its hard to 
detect these changes and harder to maintain.
    
    Is there a better way of implementing this?



contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
<https://reviews.apache.org/r/6536/#comment21968>

    Can you use parenthesis to make this more readable?



contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
<https://reviews.apache.org/r/6536/#comment22058>

    Can this be changed to return containsGenericUnion(fs, visitedRecords) to 
be consistent with the other parts of the code?



contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
<https://reviews.apache.org/r/6536/#comment22059>

    Can this be changed to return containsGenericUnion(fs, visitedRecords) to 
be consistent with the other parts of the code?



contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
<https://reviews.apache.org/r/6536/#comment22126>

    There should be three rows with the third row being NULL.



contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
<https://reviews.apache.org/r/6536/#comment22127>

    Can you include a message here. Something like "Negative test to test an 
exception. Should not be succeeding!" 



contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
<https://reviews.apache.org/r/6536/#comment22128>

    Can you include a message here. Something like "Negative test to test an 
exception. Should not be succeeding!" 



contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
<https://reviews.apache.org/r/6536/#comment22129>

    Can you include a message here. Something like "Negative test to test an 
exception. Should not be succeeding!" 



contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
<https://reviews.apache.org/r/6536/#comment22130>

    Not sure if this holds true for all cases. This is an either or - i.e., if 
a sequence of jobs has one failure then the assertion will kick in for the 
first one which is actually a false alarm.


- Santhosh Srinivasan


On Aug. 14, 2012, 10:23 a.m., Cheolsoo Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6536/
> -----------------------------------------------------------
> 
> (Updated Aug. 14, 2012, 10:23 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Description
> -------
> 
> Allow recursive records to be loaded/stored by AvroStorage.
> 
> The changes include:
> 
> 1) Remove the recursive record check from AvroSchema2Pig.
> 2) Modofy inconvert() in AvroSchema2Pig so that it can map recursive records 
> to bytearrays.
> 3) Modify containsGenericUnion() in AvroStorageUtils so that it can handle 
> Avro schema that contains recursive records.
> 4) Update the parameter parsing in AvroStorage so that 'no_schema_check' can 
> be passed to both LoadFunc and StoreFunc.
> 5) Add the recursive record check to AvroSchemaManager. This is needed 
> because 'schema_file' and 'data' cannot refer to avro schema that contains 
> recursive records.
> 
> AvroStorage works as follows:
> 
> 1) PigSchema maps recursive records to bytearrays, so there is discrepancy 
> between Avro schema and Pig schema.
> 2) Recursive records are loaded as tuples even though Pig schema defines them 
> as bytearrays and can be referred to by position (e.g. $0, $1.$0, etc).
> 3) To store recursive records, Avro schema must be provided via the 'schema' 
> or 'same' parameter in StoreFunc. In addition, 'no_schema_check' must be 
> enabled because otherwise schema check will fail due to discrepancy between 
> Avro schema and Pig schema.
> 4) Avro schema cannot be specified by the 'data' or 'schema_file' parameter. 
> This is because AvroSchemaManager cannot handle recursive records for now. 
> The recursive record check is added to AvroSchemaManager, so if Avro schema 
> that contains recursive records is specified by these parameters, an 
> exception is thrown.
> 
> 
> This addresses bug PIG-2875.
>     https://issues.apache.org/jira/browse/PIG-2875
> 
> 
> Diffs
> -----
> 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroSchema2Pig.java
>  6b1d2a1 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroSchemaManager.java
>  1939d3e 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java
>  c9f7d81 
>   
> contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
>  e24b495 
>   
> contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java
>  2fab3f7 
>   
> contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorageUtils.java
>  040234f 
>   
> contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_generic_union_schema.avro
>  4e23e73 
>   
> contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_recursive_schema.avro
>  12a36f8 
> 
> Diff: https://reviews.apache.org/r/6536/diff/
> 
> 
> Testing
> -------
> 
> New test cases are added as follows:
> 
> 1) Load/store Avro files that contain recursive records in array, map, union, 
> and another record.
> 2) Load Avro files that contains recursive records, generate new relations, 
> apply filters, and store them as non-recursive records.
> 3) Tests for the StoreFunc parameters: no_schema_check, schema, same, 
> schema_file, and data.
> 
> 
> Thanks,
> 
> Cheolsoo Park
> 
>

Reply via email to