GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/13576
[SPARK-15840][SQL] Add missing options in documentation, inferSchema for CSV and mergeSchema for Parquet ## What changes were proposed in this pull request? This PR 1. Adds the documentations for some missing options, `inferSchema` and `mergeSchema` for Python and Scala. 2. Fiixes `[[DataFrame]]` to ```:class:`DataFrame` ``` so that this can be shown - from ![2016-06-09 9 31 16](https://cloud.githubusercontent.com/assets/6477701/15929721/8b864734-2e89-11e6-83f6-207527de4ac9.png) - to (with class link) ![2016-06-09 9 31 00](https://cloud.githubusercontent.com/assets/6477701/15929717/8a03d728-2e89-11e6-8a3f-08294964db22.png) (Please refer the documentation, https://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/api/python/pyspark.sql.html) 3. Moves `mergeSchema` option to `ParquetOptions` with removing unused options, `metastoreSchema` and `metastoreTableName`. They are not used anymore. They were removed in https://github.com/apache/spark/commit/e720dda42e806229ccfd970055c7b8a93eb447bf and there are no use cases as below: ```bash grep -r -e METASTORE_SCHEMA -e \"metastoreSchema\" -e \"metastoreTableName\" -e METASTORE_TABLE_NAME . ``` ``` ./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala: private[sql] val METASTORE_SCHEMA = "metastoreSchema" ./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala: private[sql] val METASTORE_TABLE_NAME = "metastoreTableName" ./sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala: ParquetFileFormat.METASTORE_TABLE_NAME -> TableIdentifier( ``` It only sets `metastoreTableName` in the last case but does not use the table name. ## How was this patch tested? Existing tests should cover this. You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark SPARK-15840 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/13576.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #13576 ---- commit bc42993ef6734b168782d656c853979b46e453fe Author: hyukjinkwon <gurwls...@gmail.com> Date: 2016-06-09T12:18:41Z Missing options commit c764a25330a8e6231007eb5328bef79f9063e380 Author: hyukjinkwon <gurwls...@gmail.com> Date: 2016-06-09T12:21:29Z Fix typoes commit bce6c040df1c2f2d5eee18112cb0fddf8826df13 Author: hyukjinkwon <gurwls...@gmail.com> Date: 2016-06-09T12:26:46Z More detailed explanation ---- --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org