[ https://issues.apache.org/jira/browse/TEZ-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16006008#comment-16006008 ]
TezQA commented on TEZ-3708:
----------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12867500/TEZ-3708.4.patch
against master revision dec7c1b.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 12 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2429//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2429//console

This message is automatically generated.

> Improve parallelism and auto grouping of unpartitioned cartesian product
> ------------------------------------------------------------------------
>
> Key: TEZ-3708
> URL: https://issues.apache.org/jira/browse/TEZ-3708
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Zhiyuan Yang
> Assignee: Zhiyuan Yang
> Attachments: TEZ-3708.1.patch, TEZ-3708.2.patch, TEZ-3708.3.patch,
> TEZ-3708.4.patch
>
>
> The current unpartitioned cartesian product has a few limitations:
> 1. Parallelism can be insufficient when splits are large and the number of
> source tasks is small.
> 2. Parallelism can be excessive when the number of source tasks is large.
> 3. Workload is not ideally distributed across the workers. Even with auto
> grouping, grouping by size may not be accurate, because the same size can mean
> a different number of records and a different number of cartesian product ops.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)