[
https://issues.apache.org/jira/browse/PIG-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108877#comment-13108877
]
[email protected] commented on PIG-2286:
----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1929/#review1974
-----------------------------------------------------------
trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java
<https://reviews.apache.org/r/1929/#comment4462>
I think a comment will be useful -
// The algebraic udf can have more than one input. Add the udf only once
trunk/src/org/apache/pig/builtin/COR.java
<https://reviews.apache.org/r/1929/#comment4463>
The size of the tuple would need to be size*(size-1).
Details -
The inner loop body executes (n-1) + (n-2) + ... + (n - (n-1)) = n(n-1)/2 times.
Each execution adds two columns, so the total is
2 * n(n-1)/2 = n(n-1) columns.
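To double-check the arithmetic, here is a minimal standalone sketch that counts the columns. It is not the COR.java code: the class name CorTupleSize and the method columnsAdded are made up for illustration, and it assumes only that the loops visit each pair (i, j) with i < j once and add two columns per pair.

```java
public class CorTupleSize {
    // Hypothetical helper: counts the columns added when every unordered
    // pair (i, j) with i < j contributes two columns, as described above.
    static int columnsAdded(int n) {
        int columns = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                columns += 2; // two columns per execution of the inner loop body
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Compare the counted value against the closed form n(n-1).
        for (int n = 2; n <= 5; n++) {
            System.out.println(n + " inputs: counted " + columnsAdded(n)
                    + ", n(n-1) = " + (n * (n - 1)));
        }
    }
}
```

For every n the counted value should match n(n-1), which is the size the tuple needs under that assumption.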
trunk/src/org/apache/pig/builtin/COR.java
<https://reviews.apache.org/r/1929/#comment4464>
I don't understand why the values are being added to a tuple as columns.
That does not look right.
- Thejas
On 2011-09-16 18:11:08, Daniel Dai wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/1929/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-09-16 18:11:08)
bq.
bq.
bq. Review request for pig and Thejas Nair.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. See PIG-2286
bq.
bq.
bq. This addresses bug PIG-2286.
bq. https://issues.apache.org/jira/browse/PIG-2286
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. trunk/src/org/apache/pig/builtin/COR.java 1171325
bq. trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 1171325
bq. trunk/test/e2e/pig/tests/nightly.conf 1171325
bq.
bq. Diff: https://reviews.apache.org/r/1929/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. Unit-test:
bq. all pass
bq.
bq. Piggybank-test:
bq. TestDBStorage fails for another reason, unrelated to the patch
bq.
bq. Test-patch:
bq. [exec] +1 overall.
bq. [exec]
bq. [exec] +1 @author. The patch does not contain any @author tags.
bq. [exec]
bq. [exec] +1 tests included. The patch appears to include 3 new or modified tests.
bq. [exec]
bq. [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
bq. [exec]
bq. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
bq. [exec]
bq. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
bq. [exec]
bq. [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
bq.
bq.
bq. Thanks,
bq.
bq. Daniel
bq.
bq.
> Using COR function in Piggybank results in ERROR 2018: Internal error. Unable
> to introduce the combiner for optimization
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: PIG-2286
> URL: https://issues.apache.org/jira/browse/PIG-2286
> Project: Pig
> Issue Type: Bug
> Components: impl, piggybank
> Affects Versions: 0.9.0
> Reporter: Viraj Bhat
> Assignee: Daniel Dai
> Attachments: PIG-2286-1.patch
>
>
> Using the COR function in a Pig script results in an error. The
> "studenttab5" file contains name, age and gpa fields separated by tabs.
> {code}
> register /home/viraj/pig-svn/trunk/contrib/piggybank/java/piggybank.jar;
> A = LOAD '/user/viraj/studenttab5' AS (name, age:double,gpa:double);
> B = group A all;
> C = foreach B generate group, COR(A.age, A.gpa);
> dump C;
> {code}
> {quote}
> 2011-09-14 17:03:22,001 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
> to hadoop file system at: hdfs://localhost:9000
> 2011-09-14 17:03:22,088 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
> to map-reduce job tracker at: localhost:9001
> 2011-09-14 17:03:22,960 [main] INFO
> org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script:
> GROUP_BY
> 2011-09-14 17:03:23,168 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler -
> File concatenation threshold: 100 optimistic? false
> 2011-09-14 17:03:23,179 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer
> - Choosing to move algebraic foreach to combiner
> 2011-09-14 17:03:23,186 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR
> 2018: Internal error. Unable to introduce the combiner for optimization.
> {quote}
> Viraj