[ 
https://issues.apache.org/jira/browse/PIG-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452778#comment-13452778
 ] 

Cheolsoo Park commented on PIG-2445:
------------------------------------

AvroStorage can store two relations in one script. In fact, the same question 
came up on the user mailing list a while ago. I am copying my answer here:
{quote}
AvroStorage has a rather odd syntax for multiple stores. To apply a different 
Avro schema to each store, you have to give each store an "index" as 
follows:

set1 = load 'input1.txt' using PigStorage('|') as ( ... );
store set1 into 'set1' using 
org.apache.pig.piggybank.storage.avro.AvroStorage('index', '1');

set2 = load 'input2.txt' using PigStorage('|') as ( .. );
store set2 into 'set2' using 
org.apache.pig.piggybank.storage.avro.AvroStorage('index', '2');

As can be seen, I added the 'index' parameters.

In the frontend, AvroStorage constructs the following string:

"1#<1st avro schema>,2#<2nd avro schema>"

and passes it to the backend via UDFContext. In the backend, tasks parse this 
string to get the output schema for each store.
{quote}
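
For illustration only, the round trip described in the quote can be sketched in a few lines of Python. This is not the piggybank code (AvroStorage is written in Java); the function names are hypothetical, the string format is taken from the description above, and the hand-off through the UDF context is left out. The sketch also assumes that no schema contains a comma immediately followed by digits and "#".

```python
import re


def build_schema_string(schemas):
    """Frontend side: join {index: schema} into "1#<schema>,2#<schema>"."""
    return ",".join(
        f"{index}#{schema}" for index, schema in sorted(schemas.items())
    )


# An entry starts at the beginning of the string or after a comma and is
# introduced by "<digits>#". Avro schemas are JSON and contain commas
# themselves, so splitting on "," alone would cut a schema in half.
_ENTRY_START = re.compile(r"(?:^|,)(\d+)#")


def parse_schema_string(encoded):
    """Backend side: recover {index: schema} from the encoded string."""
    matches = list(_ENTRY_START.finditer(encoded))
    schemas = {}
    for i, match in enumerate(matches):
        # A schema runs up to the start of the next entry (or end of string).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(encoded)
        schemas[int(match.group(1))] = encoded[match.end():end]
    return schemas


if __name__ == "__main__":
    # Placeholder schemas, made up for this example.
    schemas = {
        1: '{"type":"record","name":"set1","fields":[]}',
        2: '"long"',
    }
    encoded = build_schema_string(schemas)
    print(encoded)
    print(parse_schema_string(encoded)[2])
```

Each store would then look up its own schema by the index it was given, which is why two stores without an index end up sharing a single schema.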

This is also documented on the 
[AvroStorage wiki|https://cwiki.apache.org/PIG/avrostorage.html#AvroStorage-GlobalParameters] 
(please see "index"). Obviously, this is not very intuitive, so I was thinking 
of writing a new AvroStorage with more intuitive options, although I haven't 
started yet.

I think we should close this JIRA. Please let me know if anyone has 
objections.

Thanks!
                
> AvroStorage can't store two relations in one script
> ---------------------------------------------------
>
>                 Key: PIG-2445
>                 URL: https://issues.apache.org/jira/browse/PIG-2445
>             Project: Pig
>          Issue Type: New Feature
>          Components: piggybank
>    Affects Versions: 0.9.1, 0.9.2, 0.10.0
>            Reporter: Russell Jurney
>              Labels: avro, fun, happy, pants, pig, pig_udf, storefunc
>
> STORE one INTO '/tmp/one.avro' USING AvroStorage();
> STORE two INTO '/tmp/two.avro' USING AvroStorage();
> -- relation two has the schema of relation one.  BANG!

