[ https://issues.apache.org/jira/browse/SPARK-20055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-20055:
------------------------------------

    Assignee: Apache Spark

> Documentation for CSV datasets in SQL programming guide
> -------------------------------------------------------
>
>                 Key: SPARK-20055
>                 URL: https://issues.apache.org/jira/browse/SPARK-20055
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 2.2.0
>            Reporter: Hyukjin Kwon
>            Assignee: Apache Spark
>
> I guess the programming guide -
> http://spark.apache.org/docs/latest/sql-programming-guide.html -
> documents the commonly used and important things rather than
> everything and every option.
> It seems JSON datasets
> http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets
> are documented whereas CSV datasets are not.
> Nowadays, they are pretty similar in APIs and options. Some options are
> notable for both, in particular ones such as {{wholeFile}}. Moreover,
> several options such as {{inferSchema}} and {{header}} are important in CSV
> because they affect the types and column names of the data.
> In that sense, I think we had better document CSV datasets with some
> examples too, because I believe reading CSV is a pretty common use case.
> Also, I think we could leave some pointers to the options in the API
> documentation for both (rather than duplicating the documentation).
> So, my suggestion is:
> - Add a CSV Datasets section.
> - Add links to the options for both JSON and CSV that point to each API
> documentation.
> - Fix trivial minor issues in both sections together.
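
To illustrate why the issue singles out {{header}} and {{inferSchema}}, here is a minimal PySpark sketch of the kind of example a CSV Datasets section might carry. It is not taken from the issue or from the programming guide. The helper names (write_sample_csv, read_people, demo) and the sample data are hypothetical; the only Spark API relied on is spark.read.csv with its header and inferSchema keyword arguments. demo() assumes a local PySpark installation.

```python
import csv
import os
import tempfile


def write_sample_csv(directory):
    """Write a tiny CSV file with a header row and return its path (hypothetical data)."""
    path = os.path.join(directory, "people.csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "age"])
        writer.writerow(["Alice", "30"])
        writer.writerow(["Bob", "25"])
    return path


def read_people(spark, path, header, infer_schema):
    """Read `path` with the given CSV options and return (column name, type) pairs."""
    # `header` controls whether the first line supplies the column names;
    # `inferSchema` asks Spark to scan the data and guess column types.
    df = spark.read.csv(path, header=header, inferSchema=infer_schema)
    return df.dtypes


def demo():
    """Print the resulting schema for both option settings (needs PySpark installed)."""
    from pyspark.sql import SparkSession  # imported lazily; assumption: PySpark is available

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("csv-options-sketch")
        .getOrCreate()
    )
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = write_sample_csv(tmp)
            print(read_people(spark, path, header=False, infer_schema=False))
            print(read_people(spark, path, header=True, infer_schema=True))
    finally:
        spark.stop()
```

Calling demo() prints the column names and types for each setting. With both options off, Spark falls back to generated column names and reads every field, including the header row, as a string, which is the effect on "types and column names" that the issue refers to.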