Github user mridulm commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19487#discussion_r144633605

    --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala ---
    @@ -48,6 +49,16 @@ class HadoopMapReduceCommitProtocol(jobId: String, path: String)
       @transient private var committer: OutputCommitter = _

       /**
    +   * Checks whether there are files to be committed to a valid output location.
    +   *
    +   * As committing and aborting a job occurs on driver, where `addedAbsPathFiles` is always null,
    +   * it is necessary to check whether a valid output path is specified.
    +   * [[HadoopMapReduceCommitProtocol#path]] need not be a valid [[org.apache.hadoop.fs.Path]] for
    +   * committers not writing to distributed file systems.
    +   */
    +  private val hasValidPath = Try { new Path(path) }.isSuccess
    --- End diff --

That would depend on whether we want to support invalid paths or not (please see my comment below).

If we are not supporting invalid paths, I will change this to `null != path && "" != path` explicitly, and have the driver throw `IllegalArgumentException` as part of `commitJob` or `abortJob`, as before.

If we do want to support invalid paths, then the exception is not irrelevant: it indicates an explicitly invalid path was passed in to the output committer (and the output committer will log suitably when the parameter is invalid; it is not `HadoopMapReduceCommitProtocol`'s responsibility to do so).
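For illustration, the two alternatives being discussed could be sketched roughly as below. This is a hypothetical sketch, not the committed implementation: it assumes `hadoop-common` is on the classpath and mirrors the `path: String` constructor parameter from the class above; `CommitProtocolSketch` and `requireValidPath` are made-up names.

```scala
import scala.util.Try
import org.apache.hadoop.fs.Path

class CommitProtocolSketch(jobId: String, path: String) {

  // Option A (as in this PR): tolerate unparseable paths. hasValidPath is
  // false for null, empty, or malformed strings, and commitJob/abortJob
  // would skip the absolute-path file handling rather than fail.
  private val hasValidPath = Try { new Path(path) }.isSuccess

  // Option B (if invalid paths are not supported): fail fast on the driver,
  // surfacing an IllegalArgumentException from commitJob/abortJob as before.
  private def requireValidPath(): Unit =
    require(null != path && "" != path, s"Invalid output path: $path")
}
```

The trade-off is that Option A silently degrades for committers that legitimately have no filesystem output path, while Option B makes a misconfigured path an explicit driver-side error.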