Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19080#discussion_r136213879
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
 ---
    @@ -30,18 +30,43 @@ import org.apache.spark.sql.types.{DataType, 
IntegerType}
      *  - Intra-partition ordering of data: In this case the distribution 
describes guarantees made
      *    about how tuples are distributed within a single partition.
      */
    -sealed trait Distribution
    +sealed trait Distribution {
    +  /**
    +   * The required number of partitions for this distribution. If it's 
None, then any number of
    +   * partitions is allowed for this distribution.
    +   */
    +  def requiredNumPartitions: Option[Int]
    +
    +  /**
    +   * Creates a default partitioning for this distribution, which can 
satisfy this distribution while
    +   * matching the given number of partitions.
    +   */
    +  def createPartitioning(numPartitions: Int): Partitioning
    +}
     
     /**
      * Represents a distribution where no promises are made about co-location 
of data.
      */
    -case object UnspecifiedDistribution extends Distribution
    +case object UnspecifiedDistribution extends Distribution {
    +  override def requiredNumPartitions: Option[Int] = None
    +
    +  override def createPartitioning(numPartitions: Int): Partitioning = {
    +    throw new IllegalStateException("UnspecifiedDistribution does not have 
default partitioning.")
    +  }
    +}
     
     /**
      * Represents a distribution that only has a single partition and all 
tuples of the dataset
      * are co-located.
      */
    -case object AllTuples extends Distribution
    +case object AllTuples extends Distribution {
    --- End diff --
    
    Sure you can make that argument (I'm not sure I buy it), but then why does 
this PR still have two different concepts?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to