[jira] [Commented] (PIG-3000) Optimize nested foreach

2016-04-26 Thread Chon Ju Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258233#comment-15258233
 ] 

Chon Ju Kim commented on PIG-3000:
--

I ran into this issue with slightly different code in our project. Here is a 
snippet:

{code}
B = FOREACH A {
    a = foo();
    b = SUM(a.x);
    GENERATE a, b, (t is null ? c : d);
}
{code}

foo is called twice. Note that t is defined outside of the foreach.

> Optimize nested foreach
> ---
>
> Key: PIG-3000
> URL: https://issues.apache.org/jira/browse/PIG-3000
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.10.0
> Reporter: Richard Ding
> Assignee: Mona Chitnis
> Attachments: PIG-3000-6.patch, unit_tests.patch
>
>
> In this Pig script:
> {code}
> A = load 'data' as (a:chararray);
> B = foreach A { c = UPPER(a); generate ((c eq 'TEST') ? 1 : 0), ((c eq 'DEV') ? 1 : 0); }
> {code}
> The Eval function UPPER is called twice for each record.
> This should be optimized so that UPPER is called only once for each record.
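
The effect described above can be illustrated with a toy model in plain Python. This is only a sketch of the reported behaviour, not Pig's planner or execution engine: the helper names (`make_counting_upper`, `generate_inlined`, `generate_shared`) are hypothetical, and the assumption is that each reference to the alias `c` in the generate clause re-evaluates `UPPER(a)`, whereas the requested optimization would evaluate it once per record and reuse the result.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[int, int]


def make_counting_upper() -> Tuple[Callable[[str], str], Dict[str, int]]:
    """Return an UPPER-like function plus a counter of how often it ran."""
    stats = {"calls": 0}

    def upper(value: str) -> str:
        stats["calls"] += 1
        return value.upper()

    return upper, stats


def generate_inlined(records: List[str], upper: Callable[[str], str]) -> List[Row]:
    """Model of the reported behaviour: each use of alias c re-runs UPPER."""
    return [
        (1 if upper(a) == "TEST" else 0, 1 if upper(a) == "DEV" else 0)
        for a in records
    ]


def generate_shared(records: List[str], upper: Callable[[str], str]) -> List[Row]:
    """Model of the requested behaviour: UPPER runs once per record."""
    rows: List[Row] = []
    for a in records:
        c = upper(a)  # evaluated once, then reused by both output columns
        rows.append((1 if c == "TEST" else 0, 1 if c == "DEV" else 0))
    return rows


records = ["test", "dev", "prod"]  # made-up sample input

upper_inlined, inlined_stats = make_counting_upper()
upper_shared, shared_stats = make_counting_upper()

inlined_rows = generate_inlined(records, upper_inlined)
shared_rows = generate_shared(records, upper_shared)
```

Both variants produce the same rows; only the call counters differ, which is why the duplication is a cost rather than a correctness problem as long as the function is cheap and has no side effects.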



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-3000) Optimize nested foreach

2016-03-21 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205160#comment-15205160
 ] 

Daniel Dai commented on PIG-3000:
-

I don't think [~chitnis] is working on that. We will need to find a new owner 
for the issue.



[jira] [Commented] (PIG-3000) Optimize nested foreach

2016-03-21 Thread Kevin J. Price (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15204921#comment-15204921
 ] 

Kevin J. Price commented on PIG-3000:
-

Did this patch just get dropped? This is still a serious problem.



[jira] [Commented] (PIG-3000) Optimize nested foreach

2014-06-18 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036599#comment-14036599
 ] 

Mona Chitnis commented on PIG-3000:
---

Thanks for taking a look, Daniel. I will rebase my patch onto trunk.



[jira] [Commented] (PIG-3000) Optimize nested foreach

2014-04-01 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957211#comment-13957211
 ] 

Daniel Dai commented on PIG-3000:
-

Hi Mona, your latest patch does not include any changes other than 
NestedForEachUserFunc.java. Is it based on the earlier patches?



[jira] [Commented] (PIG-3000) Optimize nested foreach

2014-03-28 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951028#comment-13951028
 ] 

Mona Chitnis commented on PIG-3000:
---

Updated rev-5 on RB.

The code works for the simple case: a nested foreach whose userfunc loads a 
single argument, with the generate operating on that same argument.

The commented-out code is meant to make it work for the more complex cases:
1. multiple arguments
2. a userfunc that takes only a subset of the generate arguments, which raises 
the question of how to pass through from the initial load to the initial foreach



[jira] [Commented] (PIG-3000) Optimize nested foreach

2014-03-25 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13947434#comment-13947434
 ] 

Mona Chitnis commented on PIG-3000:
---

[~daijy] Daniel Dai, can you please review the patch on Review Board (revision 
4)? https://reviews.apache.org/r/17376/diff/4/

I have described there the stage I have reached with it.
Thanks.



[jira] [Commented] (PIG-3000) Optimize nested foreach

2014-02-14 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902256#comment-13902256
 ] 

Mona Chitnis commented on PIG-3000:
---

Patch updated on RB.

The patch now handles the "Projection with nothing to reference" issue, which 
was coming from the innerLoad of the altered ForEach.

Running explain on the new plan gives the correct optimized plan. The 
commented-out part of the patch is there because I observed that this work was 
being done automatically by the SchemaPatcher and ProjectionPatcher listeners 
in the LogicalPlanOptimizer. However, this gives variable results for the uids 
and the following error:

Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2229: Couldn't find matching uid -1 for project org.apache.pig.builtin.upper_17:(Name: Project Type: chararray Uid: 38 Input: 0 Column: 0)
    at org.apache.pig.newplan.logical.optimizer.ProjectionPatcher$ProjectionRewriter.visit(ProjectionPatcher.java:91)
    at org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:215)

(where upper_17 is an example of the unique alias generated for the 
UserFuncExpression operator in the new plan)

Any help is appreciated. This patch excludes unit tests; I will upload 
everything in the next patch after fixing this issue.



[jira] [Commented] (PIG-3000) Optimize nested foreach

2014-02-03 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889911#comment-13889911
 ] 

Mona Chitnis commented on PIG-3000:
---

Can someone assign this JIRA to me?
