[ 
https://issues.apache.org/jira/browse/PIG-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988207#action_12988207
 ] 

Olga Natkovich commented on PIG-1830:
-------------------------------------

The problem is in the PigStorageSchema implementation. The class extends 
PigStorage without overriding getNext.
So, while the schema tells Pig that the data is coming in as chararray, the 
data is actually created (by PigStorage) as bytearray.

The owner of the PigStorageSchema function needs to make sure that the data and 
schema types match. 
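
To illustrate what "making the data and schema types match" could look like, 
here is a minimal, self-contained sketch. It is not the PigStorageSchema code 
and does not use Pig's API: the class SchemaCast and the methods cast and 
applySchema are hypothetical names, and plain byte[] / String / Integer stand 
in for Pig's bytearray / chararray / int values. The type codes 55, 50 and 10 
are the ones that appear in the .pig_schema file quoted below. A real fix 
would presumably do something equivalent inside an overridden getNext.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaCast {
    // Type codes as they appear in the .pig_schema file from this issue.
    static final int INTEGER = 10;
    static final int BYTEARRAY = 50;
    static final int CHARARRAY = 55;

    /** Converts one raw field to the Java type implied by its declared type code. */
    static Object cast(byte[] raw, int type) {
        if (raw == null) {
            return null;
        }
        String text = new String(raw, StandardCharsets.UTF_8);
        switch (type) {
            case CHARARRAY:
                return text;
            case INTEGER:
                return Integer.valueOf(text.trim());
            case BYTEARRAY:
                return raw;
            default:
                // This sketch only handles the three codes seen in the issue.
                throw new IllegalArgumentException("unhandled type code " + type);
        }
    }

    /** Applies the declared types to a row of raw fields, position by position. */
    static List<Object> applySchema(List<byte[]> rawFields, int[] types) {
        List<Object> out = new ArrayList<>(rawFields.size());
        for (int i = 0; i < rawFields.size(); i++) {
            byte[] raw = rawFields.get(i);
            // Fields beyond the declared schema are passed through untouched.
            out.add(i < types.length ? cast(raw, types[i]) : raw);
        }
        return out;
    }

    public static void main(String[] args) {
        // One row of the sample input: "peter<TAB>1", already split into fields.
        List<byte[]> row = Arrays.asList(
                "peter".getBytes(StandardCharsets.UTF_8),
                "1".getBytes(StandardCharsets.UTF_8));
        List<Object> typed = applySchema(row, new int[] {CHARARRAY, INTEGER});
        System.out.println(typed.get(0).getClass().getSimpleName()
                + " " + typed.get(1).getClass().getSimpleName());
    }
}
```

With the conversion applied before a row leaves the loader, the runtime type 
of the "name" field agrees with the chararray type the schema declares, which 
is the condition the map output key check in the trace below enforces.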

> Type mismatch error in key from map, when doing GROUP on PigStorageSchema() 
> variable
> ------------------------------------------------------------------------------------
>
>                 Key: PIG-1830
>                 URL: https://issues.apache.org/jira/browse/PIG-1830
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Mitesh Singh Jat
>
> Pig fails when we try to GROUP data loaded via PigStorageSchema.
> {code}
> Events = LOAD 'input/PigStorageSchema' USING 
> org.apache.pig.piggybank.storage.PigStorageSchema();
> Sessions = GROUP Events BY name;
> DUMP Sessions;
> {code}
> Schema file '''input/PigStorageSchema/.pig_schema'''
> {code}
> {"fields":[{"name":"name","type":55,"schema":null,"description":"autogenerated from Pig Field Schema"},{"name":"val","type":10,"schema":null,"description":"autogenerated from Pig Field Schema"}],"version":0,"sortKeys":[],"sortKeyOrders":[]}
> {code}
> Header file '''input/PigStorageSchema/.pig_header'''
> {code}
> name    val
> {code}
> Sample input file '''input/PigStorageSchema/pss.in'''
> {code}
> peter   1
> samir   3
> michael 4
> peter   2
> peter   4
> samir   1
> {code}
> Running the above Pig script produces the following error.
> {code}
> 2010-12-15 08:07:58,367 WARN org.apache.hadoop.mapred.Child: Error running child
> java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableBytesWritable
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:898)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:600)
>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:674)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:242)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
>         at org.apache.hadoop.mapred.Child.main(Child.java:236)
> {code}
> After changing the "type" of "name" from 55 (chararray) to 50 (bytearray),
> the GROUP BY worked.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
