[ https://issues.apache.org/jira/browse/PIG-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861097#action_12861097 ]
Viraj Bhat commented on PIG-798:
--------------------------------

Hi Ashutosh,
Yes, that is possible. I know that we can do that with PigStorage(), but why can we not do the following with PigStorage()? What do I need to cast as (chararray)?
{code}
A = load 'somedata' using PigStorage();
B = foreach A generate $0 as name:chararray;
dump B;
{code}
But this is possible in BinStorage(), so why is this not consistent? Is it that BinStorage() has schemas embedded while PigStorage() does not? Should this not be fixed to make it consistent across storage formats?

Viraj

> Schema errors when using PigStorage and none when using BinStorage in FOREACH??
> -------------------------------------------------------------------------------
>
>                 Key: PIG-798
>                 URL: https://issues.apache.org/jira/browse/PIG-798
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0, 0.3.0, 0.4.0, 0.5.0, 0.6.0, 0.7.0, 0.8.0
>            Reporter: Viraj Bhat
>         Attachments: binstoragecreateop, schemaerr.pig, visits.txt
>
>
> In the following script I have a tab-separated text file, which I load using PigStorage() and store using BinStorage().
> {code}
> A = load '/user/viraj/visits.txt' using PigStorage() as (name:chararray, url:chararray, time:chararray);
> B = group A by name;
> store B into '/user/viraj/binstoragecreateop' using BinStorage();
> dump B;
> {code}
> I later load the file 'binstoragecreateop' in the following way.
> {code}
> A = load '/user/viraj/binstoragecreateop' using BinStorage();
> B = foreach A generate $0 as name:chararray;
> dump B;
> {code}
> Result
> =======================================================================
> (Amy)
> (Fred)
> =======================================================================
> The above code works properly and returns the right results. If I use PigStorage() to achieve the same, I get the following error.
> {code}
> A = load '/user/viraj/visits.txt' using PigStorage();
> B = foreach A generate $0 as name:chararray;
> dump B;
> {code}
> =======================================================================
> {code}
> 2009-05-02 03:58:50,662 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: bytearray. Other Field Schema: name: chararray
> Details at logfile: /home/viraj/pig-svn/trunk/pig_1241236728311.log
> {code}
> =======================================================================
> So why should the semantics of BinStorage() be different from PigStorage(), where it is OK not to specify a schema? Should it not be consistent across both?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
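
For reference, here is a minimal sketch of the workaround the comment above appears to be discussing, assuming it refers to an explicit cast. The error message reports the PigStorage() field as bytearray, so casting $0 to chararray before naming it should avoid the ERROR 1022 type mismatch. The path and aliases are taken from the report; the behaviour on each affected version has not been verified here.
{code}
-- Load without a declared schema; the error above reports the field as bytearray.
A = load '/user/viraj/visits.txt' using PigStorage();
-- Cast explicitly rather than only declaring the type in the AS clause.
B = foreach A generate (chararray)$0 as name;
dump B;
{code}
Another option, already used in the first script of the report, is to declare the schema in the load statement itself with as (name:chararray, url:chararray, time:chararray). Neither workaround addresses the inconsistency between PigStorage() and BinStorage() that the issue asks about.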