[GitHub] spark pull request #21886: [SPARK-21274][SQL] Implement INTERSECT ALL clause

gatorsmile Fri, 27 Jul 2018 22:23:45 -0700

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21886#discussion_r205933541
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -1934,6 +1934,23 @@ class Dataset[T] private[sql](
         Intersect(planWithBarrier, other.planWithBarrier)
       }
     
    +  /**
    +   * Returns a new Dataset containing rows only in both this Dataset and another Dataset while
    +   * preserving the duplicates.
    +   * This is equivalent to `INTERSECT ALL` in SQL.
    +   *
    +   * @note Equality checking is performed directly on the encoded representation of the data
    +   * and thus is not affected by a custom `equals` function defined on `T`. Also as standard
    +   * in SQL, this function resolves columns by position (not by name).
    +   *
    +   * @group typedrel
    +   * @since 2.4.0
    +   */
    +  def intersectAll(other: Dataset[T]): Dataset[T] = withSetOperator {
    +    Intersect(planWithBarrier, other.planWithBarrier, isAll = true)
    --- End diff --
    
    Could you use logicalPlan here instead of planWithBarrier?
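
For readers unfamiliar with the semantics the doc comment in the diff describes, here is a minimal sketch in plain Scala, without Spark. The object and method name `IntersectAllSketch.intersectAllSketch` are hypothetical and are not part of the pull request. The sketch also compares elements with ordinary Scala equality, whereas the doc comment says Spark compares the encoded representation, so treat it as an illustration of the duplicate-preserving behaviour only.

```scala
// Hypothetical illustration, not Spark code: INTERSECT ALL as a multiset
// intersection over plain Scala sequences.
object IntersectAllSketch {
  def intersectAllSketch[A](left: Seq[A], right: Seq[A]): Seq[A] = {
    // Count how many times each value occurs on the right side.
    val remaining = scala.collection.mutable.Map.empty[A, Int]
    right.foreach(x => remaining.update(x, remaining.getOrElse(x, 0) + 1))

    // Keep a left element only while the right side still has an unmatched copy.
    // Note: this uses Scala equality, unlike Spark's encoded-representation check.
    left.filter { x =>
      val n = remaining.getOrElse(x, 0)
      if (n > 0) { remaining.update(x, n - 1); true } else false
    }
  }

  def main(args: Array[String]): Unit = {
    val a = Seq(1, 1, 2, 3)
    val b = Seq(1, 1, 1, 2, 4)
    println(intersectAllSketch(a, b))            // prints List(1, 1, 2)
    println(intersectAllSketch(a, b).distinct)   // prints List(1, 2), the plain INTERSECT result
  }
}
```

Each value appears in the result as many times as the smaller of its two occurrence counts, which is what "preserving the duplicates" means for `INTERSECT ALL`.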



