[ https://issues.apache.org/jira/browse/DATAFU-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106894#comment-16106894 ]
Eyal Allweil commented on DATAFU-83:
------------------------------------
Hi Kyle ([~ItsAUsernameRight?])
Your help is very welcome. I have two comments about the state of the
contribution; I'm posting them both here and on the review board for maximum
visibility.
1. I think the output schema of this UDF should always be boolean, not the
schema of the first input field, since the UDF is used as a filter predicate.
I would make the outputSchema method identical to the one in an existing
Boolean UDF, for example [Pig's ENDSWITH built-in
function|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/builtin/ENDSWITH.java#L62].
2. As Matthew already wrote in the review board, adding a case to the unit test
is a good idea - you can probably just duplicate something from [the existing
test|https://github.com/apache/incubator-datafu/blob/master/datafu-pig/src/test/java/datafu/test/pig/util/InTests.java].
Thanks!
> InUDF does not validate that types are compatible
> -------------------------------------------------
>
> Key: DATAFU-83
> URL: https://issues.apache.org/jira/browse/DATAFU-83
> Project: DataFu
> Issue Type: Improvement
> Reporter: Matthew Hayes
> Priority: Minor
> Attachments: DATAFU-83.patch, rb36702.patch
>
>
> See the example below. The input data is a long, but ints are provided to
> match against. Because the UDF compares values with Java's equals() and these
> are different types, the comparison never matches, which can lead to confusing
> results. I believe it should at least throw an error.
> {code}
> DEFINE I datafu.pig.util.InUDF();
>
> data = LOAD 'input' AS (B: bag {T: tuple(v:LONG)});
>
> data2 = FOREACH data {
>   -- v is a LONG, but 1, 2, 3 are INT literals, so the comparison never matches
>   C = FILTER B BY I(v, 1, 2, 3);
>   GENERATE C;
> }
>
> DESCRIBE data2;
>
> STORE data2 INTO 'output';
> {code}
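
The mismatch described in the issue can be reproduced in plain Java, outside Pig. The following is a minimal sketch, assuming the UDF compares its first argument against the remaining ones with Object.equals, as the description states. The class and method names (InEqualsDemo, matchesAny) are hypothetical and are not part of DataFu.

```java
import java.util.Arrays;
import java.util.List;

public class InEqualsDemo {
    // Hypothetical stand-in for the comparison the issue describes:
    // returns true if any candidate equals the value via Object.equals.
    static boolean matchesAny(Object value, List<Object> candidates) {
        for (Object candidate : candidates) {
            if (value.equals(candidate)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Object v = Long.valueOf(1L);                            // input field typed as LONG
        List<Object> ints = Arrays.<Object>asList(1, 2, 3);     // boxed as Integer
        List<Object> longs = Arrays.<Object>asList(1L, 2L, 3L); // boxed as Long

        // Long.equals(Integer) is false even when the numeric values agree.
        System.out.println(matchesAny(v, ints));  // prints "false"
        System.out.println(matchesAny(v, longs)); // prints "true"
    }
}
```

Because Long.equals returns false for any argument that is not a Long, the filter silently drops every row instead of failing, which is why validating the argument types up front seems preferable.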