GitHub user liancheng commented on the issue:

    https://github.com/apache/spark/pull/20174
  
    @mgaido91 We can't, because we do not know whether there are any input
    rows. For example:
    
    ```scala
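    import spark.implicits._  // for $"id"; already in scope in spark-shell
    val df1 = spark.range(10).select()                    // zero columns, ten rows
    val df2 = spark.range(10).filter($"id" < 0).select()  // zero columns, zero rows
    val df3 = df1.dropDuplicates()                        // should yield one row
    val df4 = df2.dropDuplicates()                        // should yield zero rows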
    ```
    
    `df1` has zero columns and ten rows, while `df2` has zero columns and
    zero rows. The ten rows of `df1` are all identical (each is empty), so
    `df3` should return one row containing zero columns, while `df4` has
    nothing to deduplicate and should return zero rows.
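
    To make the reasoning concrete without a Spark session, here is a minimal
    plain-Scala sketch. It is only a model, not Spark code: the object
    `ZeroColumnDedup`, the `Row` alias, and the `dedup` helper are
    hypothetical names, and it assumes a row can be modeled as the sequence
    of its column values, so a zero-column row is an empty `Seq`.

    ```scala
    // Hypothetical plain-Scala model, not Spark code: a "row" is just the
    // sequence of its column values, so a zero-column row is an empty Seq.
    object ZeroColumnDedup {
      type Row = Seq[Any]

      // Deduplicate rows by value, as dropDuplicates() is expected to behave.
      def dedup(rows: Seq[Row]): Seq[Row] = rows.distinct

      def main(args: Array[String]): Unit = {
        val tenEmptyRows: Seq[Row] = Seq.fill(10)(Seq.empty[Any]) // like df1
        val noRows: Seq[Row]       = Seq.empty[Row]               // like df2

        println(dedup(tenEmptyRows).size) // ten equal empty rows collapse into one
        println(dedup(noRows).size)       // nothing to collapse, still no rows
      }
    }
    ```

    In this model the result depends only on whether the input is empty,
    which is exactly what cannot be determined from the schema alone.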

