Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20824#discussion_r174829859

    --- Diff: core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala ---
    @@ -145,15 +146,23 @@ object FileCommitProtocol {
           jobId: String,
           outputPath: String,
           dynamicPartitionOverwrite: Boolean = false): FileCommitProtocol = {
    +
    +    logDebug(s"Creating committer $className; job $jobId; output=$outputPath;" +
    +      s" dynamic=$dynamicPartitionOverwrite")
         val clazz = Utils.classForName(className).asInstanceOf[Class[FileCommitProtocol]]
         // First try the constructor with arguments (jobId: String, outputPath: String,
         // dynamicPartitionOverwrite: Boolean).
         // If that doesn't exist, try the one with (jobId: string, outputPath: String).
         try {
           val ctor = clazz.getDeclaredConstructor(classOf[String], classOf[String], classOf[Boolean])
    +      logDebug("Using (String, String, Boolean) constructor")
           ctor.newInstance(jobId, outputPath, dynamicPartitionOverwrite.asInstanceOf[java.lang.Boolean])
         } catch {
           case _: NoSuchMethodException =>
    +        logDebug("Falling back to (String, String) constructor")
    +        require(!dynamicPartitionOverwrite,
    +          "Dynamic Partition Overwrite is enabled but" +
    +            s" the committer ${className} does not have the appropriate constructor")
    --- End diff --

The problem is that the dynamic partition logic in `InsertIntoHadoopFsRelationCommand` assumes that rename() is a fast, reliable operation you can perform with any implementation of `FileCommitProtocol`: when the feature is enabled, it sets itself up for it, instantiates the inner committer, and carries on with the dynamic partitioning regardless. But rename() doesn't always work like that, and when it doesn't, the rest of the algorithm breaks. If the committer doesn't have that 3-arg constructor, you can't be confident that rename() will behave as required. To silently log and continue is to run the risk that the underlying committer's commit algorithm isn't compatible with the dynamic partition overwrite algorithm.
A fail-fast ensures that when the outcome would otherwise be unknown, you aren't left trying to work out afterwards what happened. Regarding your example: yes, it's in trouble. The problem is how to differentiate that case from subclasses which don't know anything at all about the new feature. You can't even look for an interface on the newly created object if the base class implements it; you are left with some dynamic probe of the instance.
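To illustrate that last point, here is a minimal, purely hypothetical sketch (the trait and committer classes are illustrative, not Spark's API): a subclass written before the feature existed inherits the marker trait from its base class, so an instance check passes for it even though it knows nothing about the feature, while probing for the 3-arg constructor does distinguish the two, because constructors are not inherited on the JVM.

```scala
import scala.util.Try

// Hypothetical marker trait for the new feature (illustrative only).
trait DynamicOverwriteSupport

// A feature-aware base committer that opts in to the trait and
// declares the 3-arg constructor.
class BaseCommitter(jobId: String, path: String, dynamic: Boolean)
  extends DynamicOverwriteSupport

// An older subclass written before the feature existed: it inherits the
// trait from its base class without knowing anything about it, and only
// declares the 2-arg constructor.
class LegacyCommitter(jobId: String, path: String)
  extends BaseCommitter(jobId, path, false)

object Probe {
  // Instance check: true for *any* subclass of BaseCommitter, aware or not,
  // so it cannot differentiate the two cases.
  def looksSupported(c: AnyRef): Boolean = c.isInstanceOf[DynamicOverwriteSupport]

  // Constructor probe: getDeclaredConstructor only sees constructors declared
  // by the class itself, so this is false for LegacyCommitter.
  def hasThreeArgCtor(clazz: Class[_]): Boolean =
    Try(clazz.getDeclaredConstructor(
      classOf[String], classOf[String], classOf[Boolean])).isSuccess
}
```

The interface check reports support for both classes, while the constructor probe correctly rejects the legacy subclass, which is why the `require` fail-fast keys off the constructor.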