[
https://issues.apache.org/jira/browse/PIG-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626944#comment-13626944
]
Daniel Dai commented on PIG-3270:
---------------------------------
This should not fail. When we merge a chararray field with a numeric field, the
resulting datatype is unknown (bytearray). Run explain with the MergeForEach rule
disabled (-t MergeForEach):
{code}
tout: (Name: LOStore Schema: a#11:chararray,b#12:bytearray)
|
|---tout: (Name: LOUnion Schema: a#11:chararray,b#12:bytearray)
|
|---tout: (Name: LOForEach Schema: a#1:chararray,b#2:bytearray)
| | |
| | (Name: LOGenerate[false,false] Schema: a#1:chararray,b#2:bytearray)ColumnPrune:InputUids=[1, 2]ColumnPrune:OutputUids=[1, 2]
| | | |
| | | a:(Name: Project Type: chararray Uid: 1 Input: 0 Column: 0)
| | | |
| | | (Name: Cast Type: bytearray Uid: 2) <------- Wrong cast
| | | |
| | | |---b:(Name: Project Type: chararray Uid: 2 Input: 1 Column: 0)
| | |
| | |---(Name: LOInnerLoad[0] Schema: a#1:chararray)
| | |
| | |---(Name: LOInnerLoad[1] Schema: b#2:chararray)
| |
| |---t1: (Name: LOForEach Schema: a#1:chararray,b#2:chararray)
| | |
| | (Name: LOGenerate[false,false] Schema: a#1:chararray,b#2:chararray)ColumnPrune:InputUids=[1, 2]ColumnPrune:OutputUids=[1, 2]
| | | |
| | | (Name: Cast Type: chararray Uid: 1)
| | | |
| | | |---a:(Name: Project Type: bytearray Uid: 1 Input: 0 Column: (*))
| | | |
| | | (Name: Cast Type: chararray Uid: 2)
| | | |
| | | |---b:(Name: Project Type: bytearray Uid: 2 Input: 1 Column: (*))
| | |
| | |---(Name: LOInnerLoad[0] Schema: a#1:bytearray)
| | |
| | |---(Name: LOInnerLoad[1] Schema: b#2:bytearray)
| |
| |---t1: (Name: LOLoad Schema: a#1:bytearray,b#2:bytearray)RequiredFields:null
|
|---tout: (Name: LOForEach Schema: a#3:chararray,b#4:bytearray)
| |
| (Name: LOGenerate[false,false] Schema: a#3:chararray,b#4:bytearray)ColumnPrune:InputUids=[3, 4]ColumnPrune:OutputUids=[3, 4]
| | |
| | a:(Name: Project Type: chararray Uid: 3 Input: 0 Column: 0)
| | |
| | (Name: Cast Type: bytearray Uid: 4) <------- Wrong cast
| | |
| | |---b:(Name: Project Type: float Uid: 4 Input: 1 Column: 0)
| |
| |---(Name: LOInnerLoad[0] Schema: a#3:chararray)
| |
| |---(Name: LOInnerLoad[1] Schema: b#4:float)
|
|---t2: (Name: LOForEach Schema: a#3:chararray,b#4:float)
| |
| (Name: LOGenerate[false,false] Schema: a#3:chararray,b#4:float)ColumnPrune:InputUids=[3, 4]ColumnPrune:OutputUids=[3, 4]
| | |
| | (Name: Cast Type: chararray Uid: 3)
| | |
| | |---a:(Name: Project Type: bytearray Uid: 3 Input: 0 Column: (*))
| | |
| | (Name: Cast Type: float Uid: 4)
| | |
| | |---b:(Name: Project Type: bytearray Uid: 4 Input: 1 Column: (*))
| |
| |---(Name: LOInnerLoad[0] Schema: a#3:bytearray)
| |
| |---(Name: LOInnerLoad[1] Schema: b#4:bytearray)
|
|---t2: (Name: LOLoad Schema: a#3:bytearray,b#4:bytearray)RequiredFields:null
{code}
We should not insert a cast-to-bytearray operation here. The cast is probably
added in UnionOnSchemaSetter.
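
To make the expected behavior concrete, here is a minimal sketch in Python. It is
not Pig's code: merge_type and needs_cast are hypothetical names, and the numeric
widening order and the treatment of every other mismatch as bytearray are
simplifying assumptions. It only models the rule described above: incompatible
types merge to bytearray, and no cast should be inserted when the merged type is
bytearray.

```python
# Illustrative model only. These names are hypothetical and are not Pig's API.
NUMERIC = ("int", "long", "float", "double")  # assumed order, narrowest to widest


def merge_type(left, right):
    """Return the column type a union would report for two input types."""
    if left == right:
        return left
    if left in NUMERIC and right in NUMERIC:
        # Two numeric types widen to the larger of the two.
        return max(left, right, key=NUMERIC.index)
    # Anything else (e.g. chararray vs. float) has no common type: unknown.
    return "bytearray"


def needs_cast(input_type, merged_type):
    """A cast is only meaningful when the merged type is a concrete type."""
    return merged_type != "bytearray" and input_type != merged_type


merged = merge_type("chararray", "float")
# Neither branch of the union should get a cast to the unknown type.
casts = [needs_cast(t, merged) for t in ("chararray", "float")]
```

Under this model neither input of the union in the issue gets a cast, which is
the plan shape we would want instead of the two casts flagged above.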
> Union onschema failing at runtime when merging incompatible types
> -----------------------------------------------------------------
>
> Key: PIG-3270
> URL: https://issues.apache.org/jira/browse/PIG-3270
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
>
> {noformat}
> t1 = LOAD 'file1.txt' USING PigStorage() AS (a: chararray, b: chararray);
> t2 = LOAD 'file2.txt' USING PigStorage() AS (a: chararray, b: float);
> tout = UNION ONSCHEMA t1, t2;
> dump tout;
> {noformat}
> The job fails with:
> 2013-04-09 11:37:37,817 [Thread-12] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: ERROR 2055: Received Error while processing the map plan.
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:399)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2055: Received Error while processing the map plan.
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:311)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
> at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:231)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:680)