[GitHub] spark pull request #22721: [SPARK-25403][SQL] Refreshes the table after inse...

sujith71955 Thu, 08 Nov 2018 00:19:23 -0800

Github user sujith71955 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22721#discussion_r231795480
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala ---
    @@ -183,13 +183,14 @@ case class InsertIntoHadoopFsRelationCommand(
             refreshUpdatedPartitions(updatedPartitionPaths)
           }
     
    -      // refresh cached files in FileIndex
    -      fileIndex.foreach(_.refresh())
    -      // refresh data cache if table is cached
    -      sparkSession.catalog.refreshByPath(outputPath.toString)
    -
           if (catalogTable.nonEmpty) {
    +        sparkSession.sessionState.catalog.refreshTable(catalogTable.get.identifier)
    --- End diff --
    
    If we initialize the stats, the updateTableStats flow will be executed, 
which also updates the table stats and invalidates the cache. This keeps the 
insert flow consistent.
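
To make the point concrete, here is a toy model of that flow. It is only a sketch and does not use Spark's classes: `ToyCatalog`, `Table`, `Stats` and `InsertFlow` are hypothetical stand-ins. It assumes that the stats update runs only when the table already has stats, and that altering the stats also drops the cached relation.

```scala
import scala.collection.mutable

// Hypothetical stand-ins, NOT Spark's real CatalogTable / SessionCatalog.
final case class Stats(sizeInBytes: Long)
final case class Table(name: String, stats: Option[Stats])

final class ToyCatalog {
  private val tables = mutable.Map.empty[String, Table]
  private val relationCache = mutable.Set.empty[String]

  def register(t: Table): Unit = tables.update(t.name, t)
  def cacheRelation(name: String): Unit = { relationCache += name; () }
  def isCached(name: String): Boolean = relationCache.contains(name)
  def statsOf(name: String): Option[Stats] = tables(name).stats

  // Assumption: altering stats also invalidates the cached relation.
  def alterTableStats(name: String, stats: Option[Stats]): Unit = {
    tables.update(name, tables(name).copy(stats = stats))
    relationCache -= name
    ()
  }
}

object InsertFlow {
  // Assumption: the stats update is skipped when the table has no stats,
  // so in that case the cache is left untouched.
  def updateTableStats(catalog: ToyCatalog, table: Table, newSize: Long): Unit =
    if (table.stats.nonEmpty) {
      catalog.alterTableStats(table.name, Some(Stats(newSize)))
    }
}
```

In this model, a table registered with `Some(Stats(0L))` loses its cached relation after `InsertFlow.updateTableStats`, while a table registered with `None` keeps a stale cache entry. That is the inconsistency that initializing the stats would avoid.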


