> On Dec. 3, 2012, 7:22 p.m., Cheolsoo Park wrote:
> > Overall looks great! I haven't gone through the test cases yet, but here 
> > are my comments so far.
> > 
> > 
> > 1) I noticed that I cannot load .avro files that are not record types. For 
> > example, I tried to load a .avro file whose schema is "int" as follows:
> > 
> > [cheolsoo@cheolsoo-mr1-0 pig-svn]$ java -jar avro-tools-1.5.4.jar getschema 
> > foo2/test_int.avro 
> > "int"
> > 
> > [cheolsoo@cheolsoo-mr1-0 pig-svn]$ java -jar avro-tools-1.5.4.jar tojson 
> > foo2/test_int.avro 
> > 1
> > 
> > in = LOAD 'foo2/test_int.avro' USING AvroStorage('int');
> > DUMP in;
> > 
> > This gives me the following error:
> > 
> > Caused by: java.io.IOException: avroSchemaToResourceSchema only processes 
> > records
> > 
> > Can only Avro record type be loaded? Or am I doing something wrong?
> > 
> > 
> > 2) TestAvroStorage needs to be more automated. To run it, I had to run the 
> > following commands:
> > 
> > ant clean compile-test
> > cd ./test/org/apache/pig/builtin/avro
> > python createests.py
> > cd -
> > ant clean test -Dtestcase=TestAvroStorage
> > 
> > Ideally, I should be able to run a single command: ant clean 
> > -Dtestcase=TestAvroStorage. Please let me know if you need help for this.
> > 
> > 
> > 3) python createests.py fails with the following errors. I suppose that 
> > some files are missing:
> > 
> > creating data/avro/uncompressed/testDirectoryCounts.avro
> > Exception in thread "main" java.io.FileNotFoundException: 
> > data/json/testDirectoryCounts.json (No such file or directory)
> > ...
> > creating evenFileNameTestDirectoryCounts.avro
> > Exception in thread "main" java.io.FileNotFoundException: 
> > data/json/evenFileNameTestDirectoryCounts.json (No such file or directory)
> > ...
> > 
> > 
> > 4) ant test -Dtestcase=TestAvroStorage fails with the following errors. I 
> > suppose that this is due to the missing files:
> > 
> > Testcase: testLoadDirectory took 0.005 sec
> >     FAILED
> > Testcase: testLoadGlob took 0.004 sec
> >     FAILED
> > Testcase: testPartialLoadGlob took 0.005 sec
> >     FAILED
> > 
> > 
> > 5) Typo in the name of createests.py. It should be createtests.py.
> > 
> > 
> > 6) Is createTests.bash needed at all? If not, can you remove it?
> > 
> > 
> > I have more comments inline:

Sounds like the python script isn't working completely correctly. I'll debug 
that script and make sure it generates all the required files.

Can I take you up on your offer to help automate that build process? I'm not 
exactly sure what to modify to automatically run the python script to create 
the test files.


> On Dec. 3, 2012, 7:22 p.m., Cheolsoo Park wrote:
> > src/org/apache/pig/builtin/AvroStorage.java, lines 296-305
> > <https://reviews.apache.org/r/8104/diff/1/?file=191564#file191564line296>
> >
> >     This won't work in the following case. Let's say p matches two dirs, 
> > and one dir is empty.
> >     
> >     p = foo*
> >     
> >     foo1
> >     foo2/bar.avro
> >     
> >     I would expect the schema of bar.avro is returned, but I get an 
> > IOException instead.

Added proper depth first search to find the first file. (I decided to sort by 
modification date, most recent first.)


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8104/#review13962
-----------------------------------------------------------


On Nov. 17, 2012, 5:28 a.m., Joseph Adler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8104/
> -----------------------------------------------------------
> 
> (Updated Nov. 17, 2012, 5:28 a.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Description
> -------
> 
> The current AvroStorage implementation has a lot of issues: it requires old 
> versions of Avro, it copies data much more than needed, and it's verbose and 
> complicated. (One pet peeve of mine is that old versions of Avro don't 
> support Snappy compression.)
> 
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
> new implementation is significantly faster, and the code is a lot simpler. 
> Rewriting AvroStorage also enabled me to implement support for Trevni.
> 
> This is the latest version of the patch, complete with test cases and 
> TrevniStorage. (Test cases for TrevniStorage are still missing).
> 
> 
> This addresses bug PIG-3015.
>     https://issues.apache.org/jira/browse/PIG-3015
> 
> 
> Diffs
> -----
> 
>   build.xml 7d468a0 
>   ivy.xml 70e8d50 
>   ivy/libraries.properties 317564f 
>   src/org/apache/pig/builtin/AvroStorage.java PRE-CREATION 
>   src/org/apache/pig/builtin/TrevniStorage.java PRE-CREATION 
>   src/org/apache/pig/impl/util/AvroBagWrapper.java PRE-CREATION 
>   src/org/apache/pig/impl/util/AvroMapWrapper.java PRE-CREATION 
>   src/org/apache/pig/impl/util/AvroRecordReader.java PRE-CREATION 
>   src/org/apache/pig/impl/util/AvroRecordWriter.java PRE-CREATION 
>   src/org/apache/pig/impl/util/AvroStorageDataConversionUtilities.java 
> PRE-CREATION 
>   src/org/apache/pig/impl/util/AvroStorageSchemaConversionUtilities.java 
> PRE-CREATION 
>   src/org/apache/pig/impl/util/AvroTupleWrapper.java PRE-CREATION 
>   test/commit-tests 5081fbc 
>   test/org/apache/pig/builtin/TestAvroStorage.java PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/directory_test.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity_ai1_ao2.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity_ao2.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity_codec.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity_just_ao2.pig 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/recursive_tests.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/createTests.bash PRE-CREATION 
>   test/org/apache/pig/builtin/avro/createests.py PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/avro/deflate/records.avro 
> PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/deflate/recordsAsOutputByPig.avro 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/avro/snappy/records.avro PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordWithRepeatedSubRecords.avro
>  PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/avro/uncompressed/records.avro 
> PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsAsOutputByPig.avro
>  PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsOfArrays.avro 
> PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsOfArraysOfRecords.avro
>  PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsSubSchema.avro 
> PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsSubSchemaNullable.avro
>  PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsWithEnums.avro 
> PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsWithFixed.avro 
> PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsWithMaps.avro 
> PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsWithMapsOfRecords.avro
>  PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recordsWithNullableUnions.avro
>  PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/avro/uncompressed/recursiveRecord.avro 
> PRE-CREATION 
>   
> test/org/apache/pig/builtin/avro/data/json/recordWithRepeatedSubRecords.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/records.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsAsOutputByPig.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsOfArrays.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsOfArraysOfRecords.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsSubSchema.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsSubSchemaNullable.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsWithEnums.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsWithFixed.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsWithMaps.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsWithMapsOfRecords.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsWithNullableUnions.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recursiveRecord.json 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordWithRepeatedSubRecords.avsc 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/records.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsAsOutputByPig.avsc 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsOfArrays.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsOfArraysOfRecords.avsc 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsSubSchema.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsSubSchemaNullable.avsc 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsWithEnums.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsWithFixed.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsWithMaps.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsWithMapsOfRecords.avsc 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsWithNullableUnions.avsc 
> PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recursiveRecord.avsc PRE-CREATION 
>   test/unit-tests 0f18a0e 
> 
> Diff: https://reviews.apache.org/r/8104/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Joseph Adler
> 
>

Reply via email to