[jira] Commented: (PIG-770) Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS
[ https://issues.apache.org/jira/browse/PIG-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700441#action_12700441 ]

George Mavromatis commented on PIG-770:
---

I can no longer reproduce it either! I wonder whether the error messages were coming from other lines in the script. I'll leave this ticket open for a few more days, just to make sure that no other variation of it exists, and if not I'll close it.

> Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS
> --
>
>                 Key: PIG-770
>                 URL: https://issues.apache.org/jira/browse/PIG-770
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: George Mavromatis
>
> Loading the 2 following as:
> urlContents = LOAD '$input' USING BinStorage() AS (url:chararray, pg:bytearray);
> siteUrls = LOAD '$siteUrls' AS (site:chararray, score:double, expanded_site:chararray, url:bytearray);
> then the following:
> urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
>     FLATTEN(siteUrls.(site, expanded_site));
> works as expected.
> But all the rest fail with an error message that does not make sense (to me)
> urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
>     FLATTEN(siteUrls.site);
> 2009-04-17 23:18:02,064 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray}
> urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
>     FLATTEN(siteUrls.(site));
> 2009-04-17 23:19:27,669 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray}
> urlContentsByUrl = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
>     FLATTEN(siteUrls.(site,expanded_site))
>     AS (site:chararray,expanded_site:chararray);
> 2009-04-17 23:23:33,483 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: siteUrls::site in {url: chararray,pg: chararray,site: chararray,expanded_site: chararray}
> Even if I do not use the AS correctly with FLATTEN, then all or none of the above should parse, so either way this is a parsing bug.
> Note that in the pig latin spec page, there is no formal description of FLATTEN operation and no example where it is used with GENERATE, AS and a bag of more than one tuples, so really I can't know if my above syntax is supported, but try and guess. Should I file a separate ticket on that?

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-770) Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS
[ https://issues.apache.org/jira/browse/PIG-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700413#action_12700413 ]

George Mavromatis commented on PIG-770:
---

Sorry, I omitted the COGROUP statement, which is:

a = COGROUP urlContents BY url INNER, siteUrls BY url INNER;

Note that in my sample, siteUrls::url is chararray, not bytearray, and that urlContents is loaded using BinStorage(). Can you reproduce with the above changes?
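
For anyone retrying this, the statements from the issue description and the COGROUP statement from this comment can be combined into one script. The following is only a sketch and was not run against any Pig version. It assumes siteUrls::url is declared as chararray, as stated in this comment rather than in the original description. The aliases ok, fail1, fail2 and fail3 are hypothetical names introduced here so that each variant can be inspected separately. The original report reused urlContentsByUrl for all of them.

```pig
-- '$input' and '$siteUrls' are script parameters from the original report;
-- they would have to be supplied when the script is run.
urlContents = LOAD '$input' USING BinStorage() AS (url:chararray, pg:bytearray);
-- Assumption: url is chararray here, per the comment, not bytearray.
siteUrls = LOAD '$siteUrls' AS (site:chararray, score:double,
                                expanded_site:chararray, url:chararray);

a = COGROUP urlContents BY url INNER, siteUrls BY url INNER;

-- Variant reported as working.
ok = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
     FLATTEN(siteUrls.(site, expanded_site));

-- Variants reported as failing with "ERROR 1000: ... Invalid alias: siteUrls::site".
fail1 = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
        FLATTEN(siteUrls.site);
fail2 = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
        FLATTEN(siteUrls.(site));
fail3 = FOREACH a GENERATE FLATTEN(urlContents) AS (url:chararray, pg:chararray),
        FLATTEN(siteUrls.(site, expanded_site))
        AS (site:chararray, expanded_site:chararray);

-- DESCRIBE prints the schema of each alias, which helps show where an alias fails to resolve.
DESCRIBE ok;
DESCRIBE fail1;
```

Entering the statements one at a time in the grunt shell, as in the run posted earlier on this ticket, should show which statement, if any, triggers the parse error.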
[jira] Commented: (PIG-770) Parsing errors with FOREACH a GENERATE FLATTEN(urlContents) AS
[ https://issues.apache.org/jira/browse/PIG-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700386#action_12700386 ]

Santhosh Srinivasan commented on PIG-770:
-

I was not able to reproduce the problem. Please see my run that is similar:

{code}
grunt> urlContents = load 'input' as (url: chararray, pg);
grunt> siteUrls = load 'sites' as (site: chararray, score: double, expanded_site: chararray, url);
grunt> a = cogroup urlContents by $0, siteUrls by $0;
grunt> c = foreach a generate flatten(urlContents) as (url:chararray, pg), flatten(siteUrls.site);
grunt> c = foreach a generate flatten(urlContents) as (url:chararray, pg), flatten(siteUrls.(site));
grunt> c = foreach a generate flatten(urlContents) as (url:chararray, pg), flatten(siteUrls.(site, expanded_site));
{code}