[ https://issues.apache.org/jira/browse/DRILL-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177027#comment-15177027 ]

ASF GitHub Bot commented on DRILL-4446:
---------------------------------------

Github user laurentgo commented on a diff in the pull request:

    https://github.com/apache/drill/pull/403#discussion_r54828997
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/fragment/DistributionAffinity.java ---
    @@ -0,0 +1,61 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p/>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p/>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.planner.fragment;
    +
    +/**
    + * Describes an operator's endpoint assignment requirements.
    + */
    +public enum DistributionAffinity {
    +  /**
    +   * No affinity to any endpoints. Operator can run on any endpoint.
    +   */
    +  NONE(0, SoftAffinityFragmentParallelizer.INSTANCE),
    +
    +  /**
    +   * Operator has soft distribution affinity to one or more endpoints. Operator performs better when fragments are
    +   * assigned to the endpoints with affinity, but not a mandatory requirement.
    +   */
    +  SOFT(1, SoftAffinityFragmentParallelizer.INSTANCE),
    +
    +  /**
    +   * Hard distribution affinity to one or more endpoints. Fragments having the operator must be scheduled on the nodes
    +   * with affinity.
    +   */
    +  HARD(2, HardAffinityFragmentParallelizer.INSTANCE);
    +
    +  private int level;
    +  private FragmentParallelizer fragmentParallelizer;
    +
    +  DistributionAffinity(final int level, final FragmentParallelizer fragmentParallelizer) {
    +    this.level = level;
    +    this.fragmentParallelizer = fragmentParallelizer;
    +  }
    +
    +  public FragmentParallelizer getFragmentParallelizer() {
    +    return fragmentParallelizer;
    +  }
    +
    +  /**
    +   * Is the current DistributionAffinity less or equal restrictive than the given DistributionAffinity?
    +   * @param distributionAffinity
    +   * @return
    +   */
    +  public boolean isLessOrEqualRestrictive(final DistributionAffinity distributionAffinity) {
    --- End diff --
    
    Naming suggestion: lessThanOrEqualTo? Also, like every Java enum, this one
    already implements Comparable<DistributionAffinity> (compareTo orders the
    constants by ordinal), so as long as the level stays the same as the
    ordinal, compareTo might be enough...
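    
    To illustrate the Comparable point, here is a minimal sketch. It assumes a
    simplified stand-in enum named Affinity (hypothetical, not the code in this
    PR, and without the FragmentParallelizer field), and the method name
    isLessRestrictiveOrEqualTo is likewise only a placeholder:

```java
// Hypothetical, simplified stand-in for DistributionAffinity.
// Assumption: constants are declared from least to most restrictive.
enum Affinity {
  NONE,  // ordinal 0
  SOFT,  // ordinal 1
  HARD;  // ordinal 2

  /**
   * Returns true when this affinity is no more restrictive than {@code other}.
   * Relies on Enum.compareTo, which compares constants by declaration order.
   */
  boolean isLessRestrictiveOrEqualTo(final Affinity other) {
    return compareTo(other) <= 0;
  }
}
```

    The trade-off: reordering the constants silently changes the comparison,
    which is the one thing an explicit level field protects against.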


> Improve current fragment parallelization module
> -----------------------------------------------
>
>                 Key: DRILL-4446
>                 URL: https://issues.apache.org/jira/browse/DRILL-4446
>             Project: Apache Drill
>          Issue Type: New Feature
>    Affects Versions: 1.5.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>             Fix For: 1.6.0
>
>
> The current fragment parallelizer, {{SimpleParallelizer.java}}, can't 
> correctly handle the case where an operator has a mandatory scheduling 
> requirement for a set of DrillbitEndpoints together with an affinity for each 
> DrillbitEndpoint (i.e., what portion of the total tasks should be scheduled on 
> each DrillbitEndpoint). It assumes that scheduling requirements are soft 
> (except for the Mux and DeMux case, which has a mandatory parallelization 
> requirement of 1 unit). 
> An example:
> A cluster has 3 nodes, each running a Drillbit and a storage service. Data 
> for a table is present only at the storage services on two of the nodes, so a 
> GroupScan needs to be scheduled on these two nodes in order to read the data. 
> The storage service doesn't support reading data from a remote node (or doing 
> so is costly).
> Inserting the mandatory scheduling requirements into the existing 
> SimpleParallelizer is not sufficient, as you may end up with a plan that has a 
> fragment with two GroupScans, each having its own hard parallelization 
> requirements.
> Proposal:
> Add a property to each operator that tells which parallelization 
> implementation to use. Most operators (such as Project or Filter) don't have 
> any particular strategy; they depend on the incoming operator. Existing 
> operators that have requirements (all existing GroupScans) default to the 
> current parallelizer, {{SimpleParallelizer}}. {{Screen}} defaults to the new 
> mandatory assignment parallelizer. It is possible that the generated 
> PhysicalPlan has a fragment with operators having different parallelization 
> strategies. In that case, an exchange is inserted between the operators where 
> a change in parallelization strategy is required.
> Will send a detailed design doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
