lineage tracking for casting should compare LoadCaster returned from LoadFunc
instead of comparing the FuncSpec
----------------------------------------------------------------------------------------------------------------
Key: PIG-2023
URL: https://issues.apache.org/jira/browse/PIG-2023
Project: Pig
Issue Type: Improvement
Reporter: Thejas M Nair
Fix For: 0.10
When the lineage of a column is tracked to find the LoadCaster associated
with that column, and the column turns out to have two possible sources, a
LoadCaster is associated (through a LoadFunc) only if the FuncSpec of the
LoadFunc is the same in both cases. But two LoadFuncs with different
FuncSpecs may actually use the same LoadCaster (for example, the default,
Utf8StorageConverter). If the FuncSpecs of the LoadFuncs don't match, the
LoadCasters returned by the LoadFuncs should also be compared. If they are
equal, this LoadCaster should be associated with the column. The LoadCaster
implementation would need to override equals().
For example, in this case the two loads differ only in the PigStorage
delimiter, so their FuncSpecs differ, but the columns in relation u use the
same LoadCaster:
{code}
l1 = load 'x' using PigStorage(',') as (a,b);
l2 = load 'y' using PigStorage(':') as (a,b);
u = union l1,l2;
{code}
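The proposed check could be sketched roughly as follows. This is a minimal,
self-contained illustration and not Pig's actual lineage code: Caster, Loader,
Utf8Caster, CustomCaster and resolveCaster are hypothetical stand-ins for
LoadCaster, LoadFunc and the lineage visitor logic, and a funcspec is
modelled as a plain string.

```java
/** Sketch only: hypothetical stand-ins, not the real Pig classes. */
final class LoadCasterLineageSketch {

    /** Stand-in for LoadCaster (the real interface declares conversion methods). */
    interface Caster {}

    /** Stand-in for a default caster that overrides equals(): any two instances are equal. */
    static final class Utf8Caster implements Caster {
        @Override public boolean equals(Object o) {
            return o != null && o.getClass() == getClass();
        }
        @Override public int hashCode() {
            return getClass().hashCode();
        }
    }

    /** Hypothetical caster that does not override equals(), so only identity matches. */
    static final class CustomCaster implements Caster {}

    /** Stand-in for a LoadFunc: its funcspec (as a string) and the caster it returns. */
    static final class Loader {
        final String funcSpec;
        final Caster caster;
        Loader(String funcSpec, Caster caster) {
            this.funcSpec = funcSpec;
            this.caster = caster;
        }
    }

    /**
     * Picks the caster for a column that has two possible sources.
     * Returns null when no single caster can be associated with the column.
     */
    static Caster resolveCaster(Loader a, Loader b) {
        // Behaviour described as current: identical funcspecs are enough.
        if (a.funcSpec.equals(b.funcSpec)) {
            return a.caster;
        }
        // Proposed fallback: funcspecs differ, so compare the casters themselves.
        if (a.caster != null && a.caster.equals(b.caster)) {
            return a.caster;
        }
        return null;
    }

    public static void main(String[] args) {
        Loader l1 = new Loader("PigStorage(',')", new Utf8Caster());
        Loader l2 = new Loader("PigStorage(':')", new Utf8Caster());
        Loader l3 = new Loader("OtherLoader()", new CustomCaster());

        System.out.println(resolveCaster(l1, l2) != null); // true: casters are equal
        System.out.println(resolveCaster(l1, l3) != null); // false: casters differ
    }
}
```

In this sketch the fallback only helps when the caster overrides equals();
a caster that inherits Object.equals() matches nothing but itself, which is
why the issue notes that LoadCaster implementations would need the override.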