Great, thanks!
I am assuming this might also fix the load-related schema issues (with
BinStorage)? It looked similar to the issue I reported on the Pig user group.
- Mridul
Pradeep Kamath (JIRA) wrote:
[ https://issues.apache.org/jira/browse/PIG-690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-690:
-------------------------------
Fix Version/s: types_branch
Assignee: Pradeep Kamath
Affects Version/s: types_branch
Status: Patch Available (was: Open)
The root cause is in schema merging: the code recursively merges subschemas when a field is a tuple or a bag, but at that point it does not set the merged field's type to bag when the field was a bag. It always marks the type as tuple, whether the field schema is a bag or a tuple. The patch fixes this and adds a unit test that unions two relations that have a bag field.
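To make the description above concrete, here is a minimal Python sketch of a recursive schema merge. It is a simplified model for illustration only, not Pig's actual Java code: the names Field, merge_field, merge_schema, sample_schema and the preserve_bag flag are all hypothetical, and the type-compatibility rules are reduced to exact equality. With preserve_bag=False it mimics the described bug (every nested field comes out as a tuple); with preserve_bag=True it mimics the described fix.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical stand-in for a field schema: a type name plus an optional subschema."""
    type: str                              # "int", "chararray", "tuple" or "bag"
    sub: Optional[List["Field"]] = None    # subschema, only for tuple and bag


NESTED = ("tuple", "bag")


def merge_field(a: Field, b: Field, preserve_bag: bool = True) -> Field:
    """Merge two fields; recurse into the subschema for tuples and bags."""
    if a.type != b.type:
        # Assumption: no implicit casts in this sketch, types must match exactly.
        raise TypeError(f"cannot merge {a.type} with {b.type}")
    if a.type in NESTED:
        merged_sub = merge_schema(a.sub or [], b.sub or [], preserve_bag)
        # The bug as described: the merged type was always "tuple".
        # The fix as described: keep "bag" when the field was a bag.
        merged_type = a.type if preserve_bag else "tuple"
        return Field(merged_type, merged_sub)
    return Field(a.type)


def merge_schema(left: List[Field], right: List[Field],
                 preserve_bag: bool = True) -> List[Field]:
    """Merge two schemas field by field."""
    if len(left) != len(right):
        raise ValueError("schemas have different numbers of fields")
    return [merge_field(a, b, preserve_bag) for a, b in zip(left, right)]


def sample_schema() -> List[Field]:
    """The schema from the report: {int,chararray,int,{(int,chararray,chararray)}}."""
    inner = Field("tuple", [Field("int"), Field("chararray"), Field("chararray")])
    return [Field("int"), Field("chararray"), Field("int"), Field("bag", [inner])]


fixed = merge_schema(sample_schema(), sample_schema())
buggy = merge_schema(sample_schema(), sample_schema(), preserve_bag=False)
```

In this model, the fourth field of fixed keeps its bag type, while in buggy it is relabelled as a tuple, which is the kind of mismatch that would surface as a bag-to-tuple cast error later on.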
UNION doesn't work in the latest code
-------------------------------------
Key: PIG-690
URL: https://issues.apache.org/jira/browse/PIG-690
Project: Pig
Issue Type: Bug
Affects Versions: types_branch
Environment: mapred mode. Local mode has the same problem under Linux.
code is taken from trunk
Reporter: Amir Youssefi
Assignee: Pradeep Kamath
Fix For: types_branch
Attachments: PIG-690.patch
grunt> a = load 'tmp/f1' using BinStorage();
grunt> b = load 'tmp/f2' using BinStorage();
grunt> describe a;
a: {int,chararray,int,{(int,chararray,chararray)}}
grunt> describe b;
b: {int,chararray,int,{(int,chararray,chararray)}}
grunt> c = union a,b;
grunt> describe c;
2009-02-27 11:51:46,012 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1052: Cannot cast bag with schema bag({(int,chararray,chararray)}) to tuple with schema tuple
Details at logfile: /homes/amiry/pig_1235735380348.log
dump a and dump b work fine.
Sample data provided to dev team in an e-mail.