[ https://issues.apache.org/jira/browse/PIG-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510645#comment-13510645 ]
Joseph Adler commented on PIG-2684:
-----------------------------------

I'm addressing this right now in PIG-3015.

This isn't a bug; it's just a mismatch between the set of names that Avro allows and the names that Pig allows. (As a side note, there are good reasons why only some variable names are allowed in Avro: limiting the characters in names allows Avro to generate code to process Avro objects in a number of different languages. Colons in variable names would make it difficult to do this.)

First, there are two workarounds for this problem right now:
- The user can rename variables before storing the bag
- The user can manually specify the output schema

Second, I don't like the idea of using namespaces for this. Namespaces are important for specific record types in Avro; they are translated by the protocol and schema compilers into package names for Java classes.

To make AvroStorage easier to use, I think it would make sense to add an option to AvroStorage to translate names with colons in some reasonable way: maybe translating the double colons to double underscores.

> :: in field name causes AvroStorage to fail
> -------------------------------------------
>
>                 Key: PIG-2684
>                 URL: https://issues.apache.org/jira/browse/PIG-2684
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>            Reporter: Fabian Alenius
>
> There appears to be a bug in AvroStorage which causes it to fail when there
> are field names that contain ::
> For example, the following will fail:
> data = load 'test.txt' as (one, two);
> grp = GROUP data by (one, two);
> result = foreach grp generate FLATTEN(group);
>
>
> store result into 'test.avro' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> ERROR 2999: Unexpected internal error.
> Illegal character in: group::one
> While the following will succeed:
> data = load 'test.txt' as (one, two);
> grp = GROUP data by (one, two);
> result = foreach grp generate FLATTEN(group) as (one,two);
>
> store result into 'test.avro' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
> Here is a minimal test case:
> data = load 'test.txt' as (one::two, three);
>
>
> store data into 'test.avro' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
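
For illustration, the name translation suggested in the comment (double colons to double underscores) might look roughly like the sketch below. It is not taken from AvroStorage or from the PIG-3015 patch. The class and method names (AvroNameTranslator, translate, isValidAvroName) are hypothetical. The validity check assumes the Avro naming rule that a name starts with a letter or underscore and continues with letters, digits, or underscores.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig or piggybank.
final class AvroNameTranslator {
    // Assumed Avro naming rule: [A-Za-z_] first, then [A-Za-z0-9_].
    private static final Pattern AVRO_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private AvroNameTranslator() {}

    // Returns true if the name is legal under the rule above.
    static boolean isValidAvroName(String name) {
        return name != null && AVRO_NAME.matcher(name).matches();
    }

    // Replaces every "::" with "__" and rejects anything still illegal.
    static String translate(String pigFieldName) {
        String translated = pigFieldName.replace("::", "__");
        if (!isValidAvroName(translated)) {
            throw new IllegalArgumentException(
                    "Not a legal Avro name after translation: " + translated);
        }
        return translated;
    }

    public static void main(String[] args) {
        // The field name from the failing example in the issue.
        System.out.println(translate("group::one")); // prints group__one
    }
}
```

One caveat with this scheme: a Pig field already named group__one would collide with the translated form of group::one, so a real option would probably need to detect or document that case.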