[GitHub] spark pull request #19080: [SPARK-21865][SQL] simplify the distribution sema...

cloud-fan Sun, 07 Jan 2018 22:10:36 -0800

Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19080#discussion_r160079396
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
 ---
    @@ -73,46 +127,31 @@ case class OrderedDistribution(ordering: 
Seq[SortOrder]) extends Distribution {
           "An AllTuples should be used to represent a distribution that only 
has " +
           "a single partition.")
     
    -  // TODO: This is not really valid...
    -  def clustering: Set[Expression] = ordering.map(_.child).toSet
    +  override def requiredNumPartitions: Option[Int] = None
    +
    +  override def createPartitioning(numPartitions: Int): Partitioning = {
    +    RangePartitioning(ordering, numPartitions)
    +  }
     }
     
     /**
      * Represents data where tuples are broadcasted to every node. It is quite 
common that the
      * entire set of tuples is transformed into different data structure.
      */
    -case class BroadcastDistribution(mode: BroadcastMode) extends Distribution
    +case class BroadcastDistribution(mode: BroadcastMode) extends Distribution 
{
    --- End diff --
    
    yea good idea, but again, this is an existing problem, let's fix it in 
another PR.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #19080: [SPARK-21865][SQL] simplify the distribution sema...

Reply via email to