Github user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22758#discussion_r228758621
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
    @@ -193,6 +193,16 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
               None)
             val logicalRelation = cached.getOrElse {
               val updatedTable = inferIfNeeded(relation, options, fileFormat)
    +          // Intialize the catalogTable stats if its not defined.An intial value has to be defined
    --- End diff --
    
    > > but after create table command, when we do insert command within the 
same session Hive statistics is not getting updated
    > 
    > This is the thing I don't understand. Like I said before, even if table 
has no stats, Spark will still get a stats via the `DetermineTableStats` rule.
    
    @cloud-fan `DetermineTableStats` only initializes the stats when they are 
not already set. Only if `session.sessionState.conf.fallBackToHdfsForStatsEnabled` 
is true does the rule derive the stats from the file system and update them, as 
shown in the code snippet below. In the insert flow this condition is never 
executed, so the stats will still be `None`.
    
![image](https://user-images.githubusercontent.com/12999161/47619998-e3096600-db0a-11e8-9315-fa0d18be0860.png)
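
For readers without access to the screenshot, here is a minimal, self-contained sketch of the behaviour described above. It is not the Spark source: `Stats`, `Table`, `Conf`, and `StatsSketch` are hypothetical stand-ins, `sizeOnDisk` stands in for a file-system size lookup, and the use of a configured default size when the fallback flag is off is an assumption about how the rule initializes missing stats.

```scala
// Simplified model of the rule discussed above. All types are hypothetical
// stand-ins for illustration and are not Spark classes.
final case class Stats(sizeInBytes: BigInt)
final case class Table(name: String, stats: Option[Stats])
final case class Conf(fallBackToHdfsForStatsEnabled: Boolean, defaultSizeInBytes: Long)

object StatsSketch {
  // Fills in stats only when the table has none.
  // `sizeOnDisk` is a stand-in for asking the file system for the table size.
  def determineTableStats(table: Table, conf: Conf, sizeOnDisk: () => Long): Table =
    table.stats match {
      // Stats already present: the table is returned unchanged.
      case Some(_) => table
      case None =>
        val size =
          if (conf.fallBackToHdfsForStatsEnabled) sizeOnDisk() // derive from the file system
          else conf.defaultSizeInBytes                          // assumed default initialization
        table.copy(stats = Some(Stats(BigInt(size))))
    }
}
```

In this model the file-system lookup runs only when the table has no stats and the fallback flag is true. Any code path that never applies `determineTableStats` leaves `stats` as `None`.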


