[ 
https://issues.apache.org/jira/browse/PIG-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605611#action_12605611
 ] 

Pi Song commented on PIG-161:
-----------------------------

How about:-

An Accumulator takes a bag as input and outputs a single datum. It can take 
optional arguments; a common one would be a column index (for example in SUM). 
This way 1) the Accumulator has no strong notion of "state", and 2) we don't 
need a nested plan inside the accumulator.
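
To make the idea concrete, here is a rough sketch of such an operator. The names (AccumulatorSketch, Func, accumulate) are made up for illustration and are not existing Pig classes; a bag is modelled as a list of tuples and a tuple as a list of numbers.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: illustrative names only, not Pig's physical operators.
final class AccumulatorSketch {
    /** The aggregate functions used in the examples of this comment. */
    enum Func { COUNT, SUM }

    /**
     * Consumes a whole bag and returns a single datum. The column index is the
     * optional argument: SUM uses it to pick the field, COUNT ignores it.
     * Nothing is kept between calls, so the operator carries no state.
     */
    static double accumulate(Func func, List<List<Double>> bag, int column) {
        switch (func) {
            case COUNT:
                return bag.size();
            case SUM:
                double total = 0.0;
                for (List<Double> tuple : bag) {
                    total += tuple.get(column);
                }
                return total;
            default:
                throw new IllegalArgumentException("unsupported function: " + func);
        }
    }

    public static void main(String[] args) {
        List<List<Double>> bag = Arrays.asList(
                Arrays.asList(1.0, 10.0),
                Arrays.asList(2.0, 20.0),
                Arrays.asList(3.0, 30.0));
        System.out.println(accumulate(Func.COUNT, bag, 0)); // prints 3.0
        System.out.println(accumulate(Func.SUM, bag, 1));   // prints 60.0
    }
}
```

Because the whole bag is passed in one call, there is no per-tuple state to manage and no plan nested inside the operator.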

The inner plan in case 1 will look like this:-
{noformat}
plan 1: project(0)
plan 2: project(1) -> Accumulator(COUNT)
plan 3: project(1) -> Accumulator(SUM, 1)
{noformat}
case 2 looks like this:-
{noformat}
plan 1: project(0)
plan 2: project(1) -> distinct -> Accumulator(COUNT, c1)
plan 3: project(1) -> filter -> Accumulator(SUM, 1)
{noformat}
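
As a rough illustration of how the three plans of case 2 could evaluate against one grouped tuple, here is a sketch using Java streams. Everything in it is an assumption made for the example: the Grouped class, the plan1/plan2/plan3 method names, and the filter predicate (the plans above do not say what the filter tests).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: illustrative names only, not Pig's physical operators.
final class NestedPlanSketch {
    /** One grouped tuple: field 0 is the group key, field 1 is the bag. */
    static final class Grouped {
        final String key;
        final List<List<Integer>> bag;

        Grouped(String key, List<List<Integer>> bag) {
            this.key = key;
            this.bag = bag;
        }
    }

    /** plan 1: project(0) */
    static String plan1(Grouped g) {
        return g.key;
    }

    /** plan 2: project(1) -> distinct -> Accumulator(COUNT) */
    static long plan2(Grouped g) {
        return g.bag.stream().distinct().count();
    }

    /** plan 3: project(1) -> filter -> Accumulator(SUM, 1) */
    static int plan3(Grouped g, Predicate<List<Integer>> keep) {
        return g.bag.stream().filter(keep).mapToInt(t -> t.get(1)).sum();
    }

    public static void main(String[] args) {
        Grouped g = new Grouped("a", Arrays.asList(
                Arrays.asList(1, 5),
                Arrays.asList(1, 5),
                Arrays.asList(2, 7)));
        System.out.println(plan1(g));                       // prints a
        System.out.println(plan2(g));                       // prints 2
        System.out.println(plan3(g, t -> t.get(0) > 1));    // prints 7
    }
}
```

Each plan is a straight pipeline ending in an accumulator, which is the property the proposal relies on.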
Santhosh's case looks like this:-
{noformat}
plan 1: project(1) -> distinct -> Accumulator(SUM, 1) \
                                                      SUM()
                                  project(0)---------/
{noformat}

We don't do SUM(C2.$1*C2.$2) so this solution should be ok. 

Regarding "Nested plans which are DAGs and not trees (which are quite common) 
are hard to handle": as far as implementation goes, I think we can always 
extract trees from DAGs by doing a depth-first walk. You just have to keep a 
mapping between the output ports of the nested plan and the expected columns.
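
A minimal sketch of that extraction, assuming the nested plan is acyclic and that output port i feeds expected column i. The DagToTrees and Op classes are hypothetical and only stand in for whatever plan representation is actually used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: illustrative names only, not Pig's plan classes.
final class DagToTrees {
    /** A plan operator; inputs are the operators that feed it. */
    static final class Op {
        final String name;
        final List<Op> inputs = new ArrayList<>();

        Op(String name, Op... inputs) {
            this.name = name;
            for (Op input : inputs) {
                this.inputs.add(input);
            }
        }
    }

    /**
     * Depth-first walk from an output port back toward its sources. An
     * operator reachable along several paths is copied once per path, so the
     * result is always a tree. Assumes the plan has no cycles.
     */
    static Op copyAsTree(Op op) {
        Op copy = new Op(op.name);
        for (Op input : op.inputs) {
            copy.inputs.add(copyAsTree(input));
        }
        return copy;
    }

    /** Keeps the mapping: expected column i -> tree extracted from output port i. */
    static Map<Integer, Op> extractTrees(List<Op> outputPorts) {
        Map<Integer, Op> treeByColumn = new LinkedHashMap<>();
        for (int column = 0; column < outputPorts.size(); column++) {
            treeByColumn.put(column, copyAsTree(outputPorts.get(column)));
        }
        return treeByColumn;
    }
}
```

For example, if a single project(1) feeds both a COUNT and a SUM, the walk yields two trees, each with its own copy of project(1). The cost is that shared operators are evaluated once per tree unless results are cached.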


> Rework physical plan
> --------------------
>
>                 Key: PIG-161
>                 URL: https://issues.apache.org/jira/browse/PIG-161
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: arithmeticOperators.patch, BinCondAndNegative.patch, 
> CastAndMapLookUp.patch, incr2.patch, incr3.patch, incr4.patch, incr5.patch, 
> logToPhyTranslator.patch, missingOps.patch, 
> MRCompilerTests_PlansAndOutputs.txt, Phy_AbsClass.patch, physicalOps.patch, 
> physicalOps.patch, physicalOps.patch, physicalOps.patch, 
> physicalOps_latest.patch, POCast.patch, POCast.patch, podistinct.patch, 
> pogenerate.patch, pogenerate.patch, pogenerate.patch, posort.patch, 
> POUserFuncCorrection.patch, 
> TEST-org.apache.pig.test.TestLocalJobSubmission.txt, 
> TEST-org.apache.pig.test.TestLogToPhyCompiler.txt, 
> TEST-org.apache.pig.test.TestLogToPhyCompiler.txt, 
> TEST-org.apache.pig.test.TestMapReduce.txt, 
> TEST-org.apache.pig.test.TestTypeCheckingValidator.txt, 
> TEST-org.apache.pig.test.TestUnion.txt, translator.patch, translator.patch, 
> translator.patch, translator.patch
>
>
> This bug tracks work to rework all of the physical operators as described in 
> http://wiki.apache.org/pig/PigTypesFunctionalSpec

