GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/13576

    [SPARK-15840][SQL] Add missing options in documentation, inferSchema for 
CSV and mergeSchema for Parquet

    ## What changes were proposed in this pull request?
    
    This PR
    
    1. Adds documentation for two missing options, `inferSchema` (CSV) and `mergeSchema` (Parquet), for Python and Scala.
    
    
    2. Fixes `[[DataFrame]]` to ```:class:`DataFrame` ``` so that it is rendered as a class link:
    
      - from
        ![2016-06-09 9 31 16](https://cloud.githubusercontent.com/assets/6477701/15929721/8b864734-2e89-11e6-83f6-207527de4ac9.png)
    
      - to (with class link)
        ![2016-06-09 9 31 00](https://cloud.githubusercontent.com/assets/6477701/15929717/8a03d728-2e89-11e6-8a3f-08294964db22.png)
    
      (Please refer to the documentation: https://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/api/python/pyspark.sql.html)
    
    3. Moves the `mergeSchema` option to `ParquetOptions` and removes the unused options `metastoreSchema` and `metastoreTableName`.
    
      They are not used anymore. Their uses were removed in https://github.com/apache/spark/commit/e720dda42e806229ccfd970055c7b8a93eb447bf and no usages remain, as the search below shows:
    
      ```bash
      grep -r -e METASTORE_SCHEMA -e \"metastoreSchema\" -e \"metastoreTableName\" -e METASTORE_TABLE_NAME .
      ```
    
      ```
      ./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala:  private[sql] val METASTORE_SCHEMA = "metastoreSchema"
      ./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala:  private[sql] val METASTORE_TABLE_NAME = "metastoreTableName"
      ./sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:        ParquetFileFormat.METASTORE_TABLE_NAME -> TableIdentifier(
      ```
    
      It only sets `metastoreTableName` in the last case but does not use the 
table name.
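
For reference, a minimal PySpark sketch of how the two newly documented options might be passed to the reader. Only the option names `inferSchema` and `mergeSchema` come from this PR; the helper names, the `header` setting, and the paths are illustrative assumptions, and `spark` is assumed to be an existing `SparkSession`. The docstring also uses the ```:class:`DataFrame` ``` role from item 2.

```python
# Sketch only: helper names and paths are hypothetical, not part of this PR.


def csv_reader_options():
    """Options asking the CSV source to infer column types from the data."""
    # "header" is an assumption about the input file, not something this PR adds.
    return {"header": "true", "inferSchema": "true"}


def parquet_reader_options():
    """Options asking the Parquet source to merge schemas across part-files."""
    return {"mergeSchema": "true"}


def load_examples(spark, csv_path, parquet_path):
    """Load one CSV and one Parquet dataset.

    :param spark: an existing SparkSession (assumed to be created by the caller)
    :return: a pair of :class:`DataFrame` objects
    """
    csv_df = (spark.read.format("csv")
              .options(**csv_reader_options())
              .load(csv_path))
    parquet_df = (spark.read.format("parquet")
                  .options(**parquet_reader_options())
                  .load(parquet_path))
    return csv_df, parquet_df
```

With a running session this could be called as `load_examples(spark, "people.csv", "events.parquet")`, where both paths are placeholders.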
    
    ## How was this patch tested?
    
    Existing tests should cover this.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-15840

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13576.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13576
    
----
commit bc42993ef6734b168782d656c853979b46e453fe
Author: hyukjinkwon <gurwls...@gmail.com>
Date:   2016-06-09T12:18:41Z

    Missing options

commit c764a25330a8e6231007eb5328bef79f9063e380
Author: hyukjinkwon <gurwls...@gmail.com>
Date:   2016-06-09T12:21:29Z

    Fix typoes

commit bce6c040df1c2f2d5eee18112cb0fddf8826df13
Author: hyukjinkwon <gurwls...@gmail.com>
Date:   2016-06-09T12:26:46Z

    More detailed explanation

----

