GitHub user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22721#discussion_r231795480
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
    @@ -183,13 +183,14 @@ case class InsertIntoHadoopFsRelationCommand(
             refreshUpdatedPartitions(updatedPartitionPaths)
           }
     
    -      // refresh cached files in FileIndex
    -      fileIndex.foreach(_.refresh())
    -      // refresh data cache if table is cached
    -      sparkSession.catalog.refreshByPath(outputPath.toString)
    -
           if (catalogTable.nonEmpty) {
    +        sparkSession.sessionState.catalog.refreshTable(catalogTable.get.identifier)
    --- End diff --
    
    If we initialize the stats, the updateTableStats flow will be executed, which
    also updates the table stats and invalidates the cache. That would keep the
    insert flow consistent.
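
    To make the point concrete, here is a toy model of the flow I have in mind.
    It is only a sketch: ToyCatalog, ToyStats and their methods are hypothetical
    stand-ins, not Spark classes, and it assumes that the stats update path runs
    only when stats are already initialized and that it drops the cached relation
    as part of the same call.

```scala
// Toy model only: ToyCatalog, ToyStats and every method here are hypothetical,
// not Spark's API. They illustrate one code path that updates stats and
// invalidates the cache together.
import scala.collection.mutable

final case class ToyStats(sizeInBytes: Long)

final class ToyCatalog {
  private val stats = mutable.Map.empty[String, Option[ToyStats]]
  private val relationCache = mutable.Set.empty[String]

  def register(table: String, initial: Option[ToyStats]): Unit = {
    stats.update(table, initial)
  }

  def cache(table: String): Unit = {
    relationCache += table
  }

  def isCached(table: String): Boolean = relationCache.contains(table)

  def statsOf(table: String): Option[ToyStats] = stats.getOrElse(table, None)

  // Altering stats also drops the cached relation, so the two cannot drift apart.
  def alterTableStats(table: String, newStats: Option[ToyStats]): Unit = {
    stats.update(table, newStats)
    relationCache -= table
  }

  // Assumption: only tables whose stats are already initialized take this path.
  def updateTableStats(table: String, newSize: Long): Unit = {
    if (statsOf(table).nonEmpty) {
      alterTableStats(table, Some(ToyStats(newSize)))
    }
  }
}

object InsertFlowSketch {
  def main(args: Array[String]): Unit = {
    val catalog = new ToyCatalog
    catalog.register("t_with_stats", Some(ToyStats(100L)))
    catalog.register("t_without_stats", None)
    catalog.cache("t_with_stats")
    catalog.cache("t_without_stats")

    // Simulate the post-insert hook for both tables.
    catalog.updateTableStats("t_with_stats", 250L)
    catalog.updateTableStats("t_without_stats", 250L)

    // Stats were initialized: they are updated and the cache entry is dropped.
    assert(catalog.statsOf("t_with_stats").contains(ToyStats(250L)))
    assert(!catalog.isCached("t_with_stats"))
    // Stats were never initialized: nothing runs and the cache entry remains.
    assert(catalog.isCached("t_without_stats"))
  }
}
```

    In this model the table without initialized stats keeps its cache entry after
    the insert, which is the inconsistency that initializing the stats would avoid.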

