[jira] Commented: (PIG-770) Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS

2009-04-18 Thread George Mavromatis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700441#action_12700441
 ] 

George Mavromatis commented on PIG-770:
---

I can no longer reproduce it either!
I am wondering if the error messages were coming from other lines in the script.

I'll leave this ticket open for a few more days, just to make sure that no other
variation of it exists, and if not I'll close it.

> Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS
> --
>
> Key: PIG-770
> URL: https://issues.apache.org/jira/browse/PIG-770
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.2.0
> Reporter: George Mavromatis
>
> Loading the following two relations:
> urlContents = LOAD '$input' USING BinStorage() AS (url:chararray, pg:bytearray);
> siteUrls = LOAD '$siteUrls' AS (site:chararray, score:double, expanded_site:chararray, url:bytearray);
> then the following:
> urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
>   FLATTEN(siteUrls.(site, expanded_site));
> works as expected.
> But all the rest fail with an error message that does not make sense (to me):
> urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
>   FLATTEN(siteUrls.site);
> 2009-04-17 23:18:02,064 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray}
> urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
>   FLATTEN(siteUrls.(site));
> 2009-04-17 23:19:27,669 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray}
> urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
>   FLATTEN(siteUrls.(site,expanded_site))
>   AS (site:chararray,expanded_site:chararray);
> 2009-04-17 23:23:33,483 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray,expanded_site: chararray}
> Even if I am not using AS correctly with FLATTEN, either all or none of the
> above should parse, so either way this is a parsing bug.
> Note that the Pig Latin spec page has no formal description of the FLATTEN
> operation and no example where it is used with GENERATE, AS, and a bag of
> more than one tuple, so I really can't know whether my syntax above is
> supported; I can only try and guess. Should I file a separate ticket on that?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-770) Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS

2009-04-17 Thread George Mavromatis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700413#action_12700413
 ] 

George Mavromatis commented on PIG-770:
---

I omitted the cogroup statement, sorry, which is:

a = COGROUP urlContents BY url INNER, siteUrls BY url INNER;

Note that in my sample, siteUrls::url is chararray, not bytearray, and that
urlContents is loaded using BinStorage().

Can you reproduce with the above changes?




[jira] Commented: (PIG-770) Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS

2009-04-17 Thread Santhosh Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700386#action_12700386
 ] 

Santhosh Srinivasan commented on PIG-770:
-

I was not able to reproduce the problem. Please see my run that is similar:

{code}
grunt> urlContents = load 'input' as (url: chararray, pg);
grunt> siteUrls = load 'sites' as (site: chararray, score: double, expanded_site: chararray, url);
grunt> a = cogroup urlContents by $0, siteUrls by $0;
grunt> c = foreach a generate flatten(urlContents) as (url:chararray, pg), flatten(siteUrls.site);
grunt> c = foreach a generate flatten(urlContents) as (url:chararray, pg), flatten(siteUrls.(site));
grunt> c = foreach a generate flatten(urlContents) as (url:chararray, pg), flatten(siteUrls.(site, expanded_site));
{code}
