Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16664#discussion_r100366124

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -218,7 +247,14 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
         bucketSpec = getBucketSpec,
         options = extraOptions.toMap)

-      dataSource.write(mode, df)
+      val destination = source match {
+        case "jdbc" => extraOptions.get(JDBCOptions.JDBC_TABLE_NAME)
+        case _ => extraOptions.get("path")
--- End diff --

> Actually all the "magic keys" in the options used by DataFrameWriter are public APIs

That's good to know, but they only seem to be, at best, indirectly documented. The `DataFrameWriter` API doesn't say anything about the keys used by any of its methods, and `sql-programming-guide.md` only touches on a handful of them; for example, none of the JDBC keys are documented.

> If you want to introduce an external public interface, we need a careful design. This should be done in a separate PR.

I agree that it needs a careful design, and the current one doesn't cover all the options. But this PR is of very marginal value without this information being exposed in some way.

If you guys feel strongly that it should be a map and that's it, I guess it will be hard to argue. Then we'll have to do that, document all the keys used internally by Spark, make them public, and promise ourselves that we'll never break them.

My belief is that a more structured type would help here. Since the current code is obviously not enough, we could have something that's more future-proof, like:

```
// Generic; just exposes the raw options, no stability guarantee past what
// the SQL API provides.
class QueryExecutionParams(options: Map[String, String])

// For FS-based sources.
class FsOutputParams(
    dataSourceType: String,
    path: String,
    options: Map[String, String]) extends QueryExecutionParams

// For JDBC.
class JdbcOutputParams(
    table: String,
    url: String,
    options: Map[String, String]) extends QueryExecutionParams

// Add others that are interesting.
```

Then listeners can easily handle future params by matching on the specific subclasses and falling back to the generic params.

Anyway, my opinion is that a raw map is not a very good API, regardless of API stability; it's hard to use and easy to break. But I'll defer to you guys if you really don't like my suggestions.
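To make the idea concrete, here is a rough, self-contained sketch of how a listener could consume such a hierarchy. The class names follow the proposal above; the constructors, the `describe` method, and the sample values are all hypothetical and not part of any Spark API:

```scala
// Hypothetical sketch of the proposed params hierarchy; not Spark code.
object OutputParamsSketch {

  // Generic params: exposes only the raw options map.
  class QueryExecutionParams(val options: Map[String, String])

  // FS-based sources carry a source type and a path in addition to the map.
  class FsOutputParams(
      val dataSourceType: String,
      val path: String,
      options: Map[String, String]) extends QueryExecutionParams(options)

  // JDBC sources carry the table name and connection URL.
  class JdbcOutputParams(
      val table: String,
      val url: String,
      options: Map[String, String]) extends QueryExecutionParams(options)

  // A listener matches the subtypes it knows about and falls back to the
  // generic type for anything added later, so new subclasses don't break it.
  def describe(params: QueryExecutionParams): String = params match {
    case fs: FsOutputParams => s"${fs.dataSourceType} write to ${fs.path}"
    case jdbc: JdbcOutputParams => s"JDBC write to table ${jdbc.table} at ${jdbc.url}"
    case other => s"write with options ${other.options}"
  }
}
```

The fallback case is what makes the hierarchy future-proof: adding a new subtype later is source-compatible with existing listeners, whereas with a raw map every consumer has to know every magic key.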