[
https://issues.apache.org/jira/browse/PIG-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheolsoo Park updated PIG-2837:
-------------------------------
Status: Patch Available (was: Open)
It seems that AvroStorage does not support recursive records or generic unions:
{quote}
1. Limited support for "record": we do not support recursively defined record
because the number of fields in such records is data dependent.
2. Limited support for "union": we only accept nullable union like ["null",
"some-type"].
{quote}
https://cwiki.apache.org/PIG/avrostorage.html
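AvroStorage checks for both limitations and throws an exception when either is
violated; however, since #2 is checked before #1, we end up with a stack
overflow if the schema is recursive, because the generic-union check
(containsGenericUnion) walks the schema recursively and never terminates on a
recursive record. This can be avoided by changing the order of the checks so
that AvroStorage fails fast if the schema is recursive.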
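To illustrate why the order of the checks matters, here is a minimal, self-contained sketch. It is not the code from the patch and does not use the real Avro or AvroStorageUtils classes: Node, Kind, validate, isNullableUnion and the two schema builders are hypothetical stand-ins, and only the containsGenericUnion name is borrowed from the stack trace. The assumption is that the union check is a plain recursive walk with no cycle guard, so it is only safe to call once the schema is known to be acyclic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class SchemaCheckOrder {
    enum Kind { RECORD, UNION, NULL, PRIMITIVE }

    // Hypothetical stand-in for a schema node; NOT org.apache.avro.Schema.
    static final class Node {
        final Kind kind;
        final List<Node> children = new ArrayList<>();
        Node(Kind kind) { this.kind = kind; }
    }

    // Check #1: true if some node is reachable from itself.
    // Nodes on the current path are tracked by identity, so the walk terminates on cycles.
    static boolean containsRecursiveRecord(Node n, Set<Node> onPath) {
        if (!onPath.add(n)) return true; // already on the current path: a cycle
        for (Node c : n.children) {
            if (containsRecursiveRecord(c, onPath)) return true;
        }
        onPath.remove(n);
        return false;
    }

    // Assumed definition of a nullable union: exactly two branches, one of them null.
    static boolean isNullableUnion(Node u) {
        return u.children.size() == 2
            && (u.children.get(0).kind == Kind.NULL || u.children.get(1).kind == Kind.NULL);
    }

    // Check #2: plain recursive walk with no cycle guard.
    // On a recursive schema this would recurse until StackOverflowError.
    static boolean containsGenericUnion(Node n) {
        if (n.kind == Kind.UNION && !isNullableUnion(n)) return true;
        for (Node c : n.children) {
            if (containsGenericUnion(c)) return true;
        }
        return false;
    }

    // Fail fast on recursion first, so the unguarded walk only sees acyclic schemas.
    static String validate(Node root) {
        Set<Node> onPath = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        if (containsRecursiveRecord(root, onPath)) return "recursive record";
        if (containsGenericUnion(root)) return "generic union";
        return "ok";
    }

    // A linked-list-like record: { value, next: ["null", <the record itself>] }.
    static Node linkedList() {
        Node rec = new Node(Kind.RECORD);
        Node next = new Node(Kind.UNION);
        next.children.add(new Node(Kind.NULL));
        next.children.add(rec); // back-reference makes the schema recursive
        rec.children.add(new Node(Kind.PRIMITIVE));
        rec.children.add(next);
        return rec;
    }

    // A non-recursive record with a two-branch union that has no null branch.
    static Node genericUnionRecord() {
        Node rec = new Node(Kind.RECORD);
        Node u = new Node(Kind.UNION);
        u.children.add(new Node(Kind.PRIMITIVE));
        u.children.add(new Node(Kind.PRIMITIVE));
        rec.children.add(u);
        return rec;
    }

    public static void main(String[] args) {
        System.out.println(validate(linkedList()));         // prints "recursive record"
        System.out.println(validate(genericUnionRecord())); // prints "generic union"
    }
}
```

In this sketch, swapping the two calls in validate would send the linked-list schema into containsGenericUnion first, where the nullable union passes the check and the walk loops through the back-reference indefinitely.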
I uploaded a patch that changes the order of the checks and adds two test cases
to TestAvroStorage to verify that the proper exception is thrown in each case.
My test can be run with the following commands:
{code}
tar -xf avro_test_files.tar.gz
ant clean compile-test piggybank -Dhadoopversion=20
cd contrib/piggybank/java
ant test -Dtestcase=TestAvroStorage
{code}
> AvroStorage throws StackOverFlowError
> -------------------------------------
>
> Key: PIG-2837
> URL: https://issues.apache.org/jira/browse/PIG-2837
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.10.0
> Reporter: Mubarak Seyed
> Assignee: Cheolsoo Park
> Attachments: PIG-2837.patch, avro_test_files.tar.gz
>
>
> When I try to dump Avro data using
> {code}
> records = LOAD '/logs/records/07262012/01/1/Record.1343265732700.avro' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> dump records;
> {code}
> {code}
> Pig Stack Trace
> ---------------
> ERROR 2998: Unhandled internal error. null
> java.lang.StackOverflowError
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:258)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:262)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:271)
> at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.containsGenericUnion(AvroStorageUtils.java:284)
> {code}
> I verified the Avro schema using avro-tools and dumped the data in JSON
> format; the data looks good.