[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

gatorsmile Thu, 29 Nov 2018 17:15:02 -0800

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23152#discussion_r237717671
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala ---
    @@ -879,13 +879,13 @@ case class ColumnStatsMap(originalMap: AttributeMap[ColumnStat]) {
       }
     
       def hasCountStats(a: Attribute): Boolean =
    -    get(a).map(_.hasCountStats).getOrElse(false)
    +    get(a).exists(_.hasCountStats)
     
       def hasDistinctCount(a: Attribute): Boolean =
    -    get(a).map(_.distinctCount.isDefined).getOrElse(false)
    +    get(a).exists(_.distinctCount.isDefined)
     
       def hasMinMaxStats(a: Attribute): Boolean =
    -    get(a).map(_.hasCountStats).getOrElse(false)
    +    get(a).exists(_.hasMinMaxStats)
    --- End diff --
    
    This is a copy-and-paste bug: `hasMinMaxStats` was delegating to `hasCountStats`. You can reproduce it with:
    ```
    spark.sql("create table Foo1(a int)")
    spark.sql("insert into Foo1 values (null)")
    spark.sql("analyze table Foo1 compute statistics for columns a")
    spark.sql("select * from Foo1 where a < 1").queryExecution.stringWithStats
    ```
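
    To see why the old body was wrong, here is a minimal sketch that does not need a Spark session. `ColStat` and `HasMinMaxSketch` are hypothetical, simplified stand-ins rather than the real `ColumnStat` and `ColumnStatsMap`, and the statistics assumed for the all-NULL column (counts present, no min/max) are an assumption about what `analyze table` would record for the table above.

    ```scala
    // Hypothetical, simplified stand-in for Spark's ColumnStat (not the real class).
    final case class ColStat(
        distinctCount: Option[BigInt] = None,
        nullCount: Option[BigInt] = None,
        min: Option[Any] = None,
        max: Option[Any] = None) {
      def hasCountStats: Boolean = distinctCount.isDefined && nullCount.isDefined
      def hasMinMaxStats: Boolean = min.isDefined && max.isDefined
    }

    object HasMinMaxSketch {
      // Assumed stats for a column holding a single NULL:
      // the counts are known, but there is no min or max value.
      val allNulls: Option[ColStat] =
        Some(ColStat(distinctCount = Some(BigInt(0)), nullCount = Some(BigInt(1))))

      // Old body: copied from hasCountStats, so it answers the wrong question.
      def hasMinMaxBuggy(s: Option[ColStat]): Boolean =
        s.map(_.hasCountStats).getOrElse(false)

      // New body: checks min/max, and uses exists in place of map + getOrElse(false).
      def hasMinMaxFixed(s: Option[ColStat]): Boolean =
        s.exists(_.hasMinMaxStats)

      def main(args: Array[String]): Unit = {
        println(hasMinMaxBuggy(allNulls)) // true: claims min/max exist when they do not
        println(hasMinMaxFixed(allNulls)) // false
        println(hasMinMaxFixed(None))     // false: no stats recorded at all
      }
    }
    ```

    With the old body, a filter such as `a < 1` would be estimated as if min/max were available for the column, which is what the repro above exercises.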



