[
https://issues.apache.org/jira/browse/PIG-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15311296#comment-15311296
]
Daniel Dai commented on PIG-2315:
---------------------------------
+1.
Also note there is a performance regression in some cases. For example:
{code}
crawl = load 'webcrawl' as (url, pageid);
extracted = foreach crawl generate flatten(REGEX_EXTRACT_ALL(url,
'(http|https)://(.*?)/(.*)')) as (protocol:chararray, host:chararray,
path:chararray);
{code}
Here the user is only trying to give Pig additional type information,
since REGEX_EXTRACT_ALL does not declare types inside the tuple, and does
not intend to cast. With the change, Pig forces a cast and there is no way
to avoid that. The performance hit should be small, and I believe it is
worth it to clarify the syntax.
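For illustration, here is a minimal sketch of the two forms involved. The
relation A, the input path 'data', and the field a are hypothetical and not
taken from the patch; the comments describe the behavior as stated in this
issue, not verified output:
{code}
-- Hypothetical input: a single int field named a.
A = load 'data' as (a:int);
-- The as clause with a type: per the issue description, the type was
-- previously ignored and a kept its old type; with the change, Pig
-- forces a cast to chararray here.
B = foreach A generate a as a:chararray;
-- An explicit cast, which converts a regardless of the change.
C = foreach A generate (chararray)a as a;
describe B;
describe C;
{code}
Comparing the two describe results before and after the patch is one way to
see whether the as clause is taking effect.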
> Make as clause work in generate
> -------------------------------
>
> Key: PIG-2315
> URL: https://issues.apache.org/jira/browse/PIG-2315
> Project: Pig
> Issue Type: Bug
> Reporter: Olga Natkovich
> Assignee: Daniel Dai
> Fix For: 0.17.0
>
> Attachments: PIG-2315-1-rebase.patch, PIG-2315-1.patch,
> PIG-2315-1.patch, pig-2315-2-after-rebase.patch, pig-2315-3-merged.patch
>
>
> Currently, the following syntax is accepted but ignored, causing confusion
> for users:
> A1 = foreach A1 generate a as a:chararray ;
> After this statement, a just retains its previous type.