Github user fjh100456 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22707#discussion_r240070378

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
    @@ -227,18 +227,22 @@ case class InsertIntoHiveTable(
           // Newer Hive largely improves insert overwrite performance. As Spark uses older Hive
           // version and we may not want to catch up new Hive version every time. We delete the
           // Hive partition first and then load data file into the Hive partition.
    -      if (oldPart.nonEmpty && overwrite) {
    -        oldPart.get.storage.locationUri.foreach { uri =>
    -          val partitionPath = new Path(uri)
    -          val fs = partitionPath.getFileSystem(hadoopConf)
    -          if (fs.exists(partitionPath)) {
    -            if (!fs.delete(partitionPath, true)) {
    -              throw new RuntimeException(
    -                "Cannot remove partition directory '" + partitionPath.toString)
    -            }
    -            // Don't let Hive do overwrite operation since it is slower.
    -            doHiveOverwrite = false
    +      if (overwrite) {
    +        val oldPartitionPath = oldPart.flatMap(_.storage.locationUri.map(new Path(_)))
    +          .getOrElse {
    +            ExternalCatalogUtils.generatePartitionPath(
    +              partitionSpec,
    +              partitionColumnNames,
    +              new Path(table.location))
    --- End diff --

    Oops, it seems this is a mistake: `oldPart` is empty here. Thank you very much, I'll change the code.
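The new code in the diff resolves the partition path with an `Option` fallback: use the old partition's storage location if one exists, otherwise generate a default path from the partition spec. A minimal, self-contained sketch of that pattern is below; the names (`resolvePartitionPath`, the example paths) are hypothetical stand-ins, not Spark or Hive APIs.

```scala
// Sketch of the Option fallback pattern from the diff (hypothetical names):
// prefer the old partition's location, else derive a generated default path.
object PartitionPathSketch {
  def resolvePartitionPath(oldLocation: Option[String], generated: => String): String =
    oldLocation.getOrElse(generated)

  def main(args: Array[String]): Unit = {
    // Old partition has a recorded location: use it directly.
    assert(resolvePartitionPath(Some("/warehouse/t/p=1"), "/warehouse/t/p=gen") == "/warehouse/t/p=1")
    // oldPart is empty (the reviewer's case): fall back to the generated path.
    assert(resolvePartitionPath(None, "/warehouse/t/p=gen") == "/warehouse/t/p=gen")
    println("ok")
  }
}
```

The lazy (`=> String`) second parameter mirrors `getOrElse`'s by-name argument in the diff: the generated path is only computed when `oldPart` is actually empty.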