[ https://issues.apache.org/jira/browse/PIG-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Santhosh Srinivasan updated PIG-505:
------------------------------------
Attachment: PIG-505.patch
The attached patch (PIG-505.patch) addresses the following issues:
1. The lineage code no longer errors out when it encounters a bytearray
produced by a UDF. Instead, the error is deferred to runtime and is raised
only if a bytearray originating from a UDF is cast to a Pig type.
2. Moved the SchemaUtils class out of the parser and into the Schema class.
3. Added unit test cases for the lineage change.
All unit test cases pass.
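The decision described in item 1 can be illustrated with a small, simplified
sketch. It is not the code in the patch: the class LineageSketch, the enums
DataType and CastPlan, and the method planCast are hypothetical names chosen
for illustration, and the model assumes a field is either read directly by a
load function or produced by a UDF with a known return type.

```java
/** Hypothetical, simplified model of the lineage decision; not Pig's actual classes. */
final class LineageSketch {

    /** Only the two types needed for the illustration. */
    enum DataType { BYTEARRAY, DOUBLE }

    /** What the (assumed) lineage step decides when a field must be cast. */
    enum CastPlan {
        USE_LOAD_FUNCTION,        // bytearray straight from a load: its load function does the cast
        NO_LOAD_FUNCTION_NEEDED,  // field is already typed, e.g. a UDF that returns double
        DEFER_ERROR_TO_RUNTIME    // bytearray from a UDF: no load function is known at compile time
    }

    /**
     * Decide how a cast on a field should be handled.
     *
     * @param producedByUdf true if the field is the output of a UDF
     * @param fieldType     the declared type of the field before the cast
     */
    static CastPlan planCast(boolean producedByUdf, DataType fieldType) {
        if (fieldType != DataType.BYTEARRAY) {
            // A typed value needs no load function to interpret its bytes.
            return CastPlan.NO_LOAD_FUNCTION_NEEDED;
        }
        if (producedByUdf) {
            // Do not fail while tracing lineage; the error surfaces only
            // if the cast is actually attempted at runtime.
            return CastPlan.DEFER_ERROR_TO_RUNTIME;
        }
        return CastPlan.USE_LOAD_FUNCTION;
    }

    public static void main(String[] args) {
        // A UDF that returns double, as in the script from the issue description.
        System.out.println(planCast(true, DataType.DOUBLE));
        // A UDF that returns bytearray.
        System.out.println(planCast(true, DataType.BYTEARRAY));
        // An untyped field read directly by a load function.
        System.out.println(planCast(false, DataType.BYTEARRAY));
    }
}
```

Under these assumptions, only the bytearray-from-UDF case carries a deferred
error; the other two cases can be resolved while the plan is being built.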
> Lineage for UDFs that do not return bytearray
> ---------------------------------------------
>
> Key: PIG-505
> URL: https://issues.apache.org/jira/browse/PIG-505
> Project: Pig
> Issue Type: Bug
> Affects Versions: types_branch
> Reporter: Santhosh Srinivasan
> Assignee: Santhosh Srinivasan
> Fix For: types_branch
>
> Attachments: PIG-505.patch
>
>
> In PIG-335, the lineage design states that UDFs that return bytearrays could
> cause problems in tracing the lineage. For UDFs that do not return bytearray,
> the lineage design should pick up the right load function to use as long as
> there is no ambiguity. In the current implementation, we could have issues
> with scripts like:
> {code}
> a = load 'input' as (field1);
> b = foreach a generate myudf_to_double(field1);
> c = foreach b generate $0 + 2.0;
> {code}
> When $0 has to be cast to a double, the lineage code will complain that it
> hit a UDF and hence cannot determine the right load function to use.