[ https://issues.apache.org/jira/browse/PIG-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mitesh Singh Jat updated PIG-1830:
----------------------------------

    Description: 
Pig fails when we try to GROUP data loaded via PigStorageSchema.

{code}
Events = LOAD 'input/PigStorageSchema' USING 
org.apache.pig.piggybank.storage.PigStorageSchema();

Sessions = GROUP Events BY name;

DUMP Sessions;
{code}

Schema file '''input/PigStorageSchema/.pig_schema'''
{code}
{"fields":[{"name":"name","type":55,"schema":null,"description":"autogenerated from Pig Field Schema"},{"name":"val","type":10,"schema":null,"description":"autogenerated from Pig Field Schema"}],"version":0,"sortKeys":[],"sortKeyOrders":[]}
{code}

Sample input file '''input/PigStorageSchema/pss.in'''
{code}
peter   1
samir   3
michael 4
peter   2
peter   4
samir   1
{code}

Running the above Pig script produces the following error.

{code}
2010-12-15 08:07:58,367 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableBytesWritable
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:898)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:600)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:674)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:242)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
        at org.apache.hadoop.mapred.Child.main(Child.java:236)
{code}

After changing the "type" of "name" from 55 (chararray) to 50 (bytearray), the
GROUP BY works.
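
The schema edit described above can be scripted. The following is only a sketch of that workaround, not a fix for the underlying bug: the helper name relax_field_type is hypothetical, the type codes 55 and 50 are taken from this report, and the script assumes the contents of .pig_schema have already been read into a string (for example from a local copy of the file).

```python
import json

CHARARRAY = 55  # type code for chararray, as given in this report
BYTEARRAY = 50  # type code for bytearray, as given in this report


def relax_field_type(schema_text, field_name):
    """Return the schema JSON with `field_name` changed from chararray to bytearray.

    Hypothetical helper: fields with other names or other types are left untouched.
    """
    schema = json.loads(schema_text)
    for field in schema["fields"]:
        if field["name"] == field_name and field["type"] == CHARARRAY:
            field["type"] = BYTEARRAY
    # Compact separators keep the single-line layout of the original file.
    return json.dumps(schema, separators=(",", ":"))
```

Calling relax_field_type(text, "name") on the schema shown above would yield the same JSON with "type":50 for the "name" field. With this change the column is loaded as a bytearray, so any chararray-specific processing downstream would need an explicit cast.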


  was:
Pig fails when we try to GROUP data loaded via PigStorageSchema.

{code lang=java}
Events = LOAD 'input/PigStorageSchema' USING 
org.apache.pig.piggybank.storage.PigStorageSchema();

Sessions = GROUP Events BY name;

DUMP Sessions;
{code}

Schema file '''input/PigStorageSchema/.pig_schema'''
{code lang=java}
{"fields":[{"name":"name","type":55,"schema":null,"description":"autogenerated from Pig Field Schema"},{"name":"val","type":10,"schema":null,"description":"autogenerated from Pig Field Schema"}],"version":0,"sortKeys":[],"sortKeyOrders":[]}
{code}

Sample input file '''input/PigStorageSchema/pss.in'''
{code lang=java}
peter   1
samir   3
michael 4
peter   2
peter   4
samir   1
{code}

On running the above pig script, the following error is received.

{code lang=java}
2010-12-15 08:07:58,367 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableBytesWritable
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:898)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:600)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:674)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:242)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
        at org.apache.hadoop.mapred.Child.main(Child.java:236)
{code}

On changing "type" of "name" from 55(chararray) to 50(bytearray), the
GROUP-BY worked.



> Type mismatch error in key from map, when doing GROUP on PigStorageSchema() variable
> ------------------------------------------------------------------------------------
>
>                 Key: PIG-1830
>                 URL: https://issues.apache.org/jira/browse/PIG-1830
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Mitesh Singh Jat
>
> Pig fails when we try to GROUP data loaded via PigStorageSchema.
> {code}
> Events = LOAD 'input/PigStorageSchema' USING 
> org.apache.pig.piggybank.storage.PigStorageSchema();
> Sessions = GROUP Events BY name;
> DUMP Sessions;
> {code}
> Schema file '''input/PigStorageSchema/.pig_schema'''
> {code}
> {"fields":[{"name":"name","type":55,"schema":null,"description":"autogenerated from Pig Field Schema"},{"name":"val","type":10,"schema":null,"description":"autogenerated from Pig Field Schema"}],"version":0,"sortKeys":[],"sortKeyOrders":[]}
> {code}
> Sample input file '''input/PigStorageSchema/pss.in'''
> {code}
> peter   1
> samir   3
> michael 4
> peter   2
> peter   4
> samir   1
> {code}
> On running the above pig script, the following error is received.
> {code}
> 2010-12-15 08:07:58,367 WARN org.apache.hadoop.mapred.Child: Error running child
> java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableBytesWritable
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:898)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:600)
>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:674)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:242)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
>         at org.apache.hadoop.mapred.Child.main(Child.java:236)
> {code}
> On changing "type" of "name" from 55(chararray) to 50(bytearray), the
> GROUP-BY worked.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
