GitHub user wzhfy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19831#discussion_r153677676
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -418,7 +418,7 @@ private[hive] class HiveClientImpl(
           // Note that this statistics could be overridden by Spark's statistics if that's available.
           val totalSize = properties.get(StatsSetupConst.TOTAL_SIZE).map(BigInt(_))
           val rawDataSize = properties.get(StatsSetupConst.RAW_DATA_SIZE).map(BigInt(_))
    -      val rowCount = properties.get(StatsSetupConst.ROW_COUNT).map(BigInt(_)).filter(_ >= 0)
    +      val rowCount = properties.get(StatsSetupConst.ROW_COUNT).map(BigInt(_)).filter(_ > 0)
    --- End diff --
    
    The root problem is that a user can set "wrong" table properties. So if we want to prevent wrong stats from being used, we need to detect changes to those properties. Otherwise your case can't be avoided.
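
To illustrate the point, here is a minimal standalone sketch, not the actual HiveClientImpl code. It assumes the row-count property key is the string "numRows" (used here as a stand-in for StatsSetupConst.ROW_COUNT so that no Hive dependency is needed); the object and method names are hypothetical. It shows that tightening the filter from `>= 0` to `> 0` only drops a zero row count, while any positive value a user set by hand still passes.

```scala
object RowCountFilterSketch {
  // Assumption: stand-in for StatsSetupConst.ROW_COUNT, hard-coded to avoid a Hive dependency.
  val RowCountKey = "numRows"

  // Filter as it was before the diff: a row count of zero is kept.
  def rowCountBefore(properties: Map[String, String]): Option[BigInt] =
    properties.get(RowCountKey).map(BigInt(_)).filter(_ >= 0)

  // Filter as proposed in the diff: a row count of zero is treated as unknown.
  def rowCountAfter(properties: Map[String, String]): Option[BigInt] =
    properties.get(RowCountKey).map(BigInt(_)).filter(_ > 0)

  def main(args: Array[String]): Unit = {
    val zeroRows = Map(RowCountKey -> "0")
    // Hypothetical case: a user set the property by hand to a value
    // that does not match the real table contents.
    val handEdited = Map(RowCountKey -> "5")

    println(rowCountBefore(zeroRows))  // zero passes the old filter
    println(rowCountAfter(zeroRows))   // zero is dropped by the new filter
    println(rowCountAfter(handEdited)) // a positive but wrong value still passes
  }
}
```

Neither filter can tell a hand-edited positive value from a genuine one, which is why detecting property changes would be needed to rule such cases out.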

