[ https://issues.apache.org/jira/browse/PIG-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694832#comment-13694832 ]
Julien Le Dem commented on PIG-3367:
------------------------------------
I was thinking we could make the syntax part of FOREACH.
{noformat}
B = FOREACH A GENERATE a, b, c ASSERT a >= 0, b IS NOT NULL;
{noformat}
That way it is easy to integrate asserts into the flow.
The advantages of making it part of the language:
- the error message can be clear without extra user input.
- it's more natural than writing a filter that does not filter anything. Also,
if the filter is not among the predecessors of a STORE, it won't be executed.
A UDF can stop the job by throwing an exception, although the task will be
retried before failing completely.
For reference, the UDF based syntax:
{noformat}
FILTER members BY ASSERT( (member_id >= 0 ? 1 : 0), 'Doh! Some member ID is negative.' );
{noformat}
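To make the "filter that does not filter" behavior concrete, here is a rough model of it outside Pig. This is only a sketch in plain Python, not Pig or UDF code: the names assert_filter and AssertionViolated are made up for illustration, and task retries are not modeled.

```python
class AssertionViolated(Exception):
    """Raised when a record fails the asserted condition."""


def assert_filter(records, predicate, message):
    """Pass every record through unchanged; raise on the first violation.

    Nothing is dropped, so this is a filter in name only. Its sole effect
    is to abort processing when the predicate fails for some record.
    """
    for record in records:
        if not predicate(record):
            # Include the offending record so the error is actionable.
            raise AssertionViolated(f"{message} (record: {record!r})")
        yield record
```

Because assert_filter is a generator, nothing is checked until its result is consumed. That loosely mirrors the point above: a check that does not feed an output such as a STORE never runs, so a violation would go unnoticed.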
Yes, adding new keywords is inconvenient when the keyword is already used as a
relation or column name.
When a field collides with a keyword, it is sometimes difficult to rename it.
I think we should:
- try to avoid new keywords if possible
- provide a mechanism to escape field names to facilitate fixing conflicts
when they happen (using quotes or a similar mechanism)
> Add assert keyword (operator) in pig
> ------------------------------------
>
> Key: PIG-3367
> URL: https://issues.apache.org/jira/browse/PIG-3367
> Project: Pig
> Issue Type: New Feature
> Components: parser
> Reporter: Aniket Mokashi
> Assignee: Aniket Mokashi
>
> The assert operator can be used for data validation. With assert, you can
> write a script as follows:
> {code}
> a = load 'something' as (a0:int, a1:int);
> assert a by a0 > 0, 'a cant be negative for reasons';
> {code}
> This script will fail if the assertion is violated.