[jira] [Commented] (SPARK-22112) Add missing method to pyspark api: spark.read.csv(Dataset<String>)

Hyukjin Kwon (JIRA) Sun, 24 Sep 2017 21:20:57 -0700

    [ https://issues.apache.org/jira/browse/SPARK-22112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178515#comment-16178515 ]


Hyukjin Kwon commented on SPARK-22112:
--------------------------------------

BTW, I think we should pass an {{RDD}} of strings as the parameter, rather than 
a DataFrame with a single string column, to be consistent with PySpark's 
{{spark.read.json}} (and because PySpark has no Dataset concept).
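
To illustrate the RDD-based shape suggested above, here is a minimal sketch of a workaround for PySpark versions where {{spark.read.csv}} only accepts paths. The names {{parse_csv_lines}} and {{csv_rdd_to_dataframe}} are hypothetical helpers invented for this example, not part of Spark, and the sketch assumes every RDD element holds one complete CSV record and that all columns are read as strings.

```python
import csv


def parse_csv_lines(lines):
    """Parse an iterable of CSV-formatted strings into tuples of fields.

    Hypothetical helper: intended to be passed to RDD.mapPartitions,
    which hands it an iterator over the strings in one partition.
    """
    # csv.reader accepts any iterable of strings, including an iterator.
    for fields in csv.reader(lines):
        if fields:  # skip blank lines, which csv.reader yields as []
            yield tuple(fields)


def csv_rdd_to_dataframe(spark, rdd, column_names):
    """Hypothetical helper: build a DataFrame from an RDD of CSV strings.

    Assumes `spark` is a SparkSession, `rdd` is an RDD of strings, and
    `column_names` is a list of names, one per CSV field.
    """
    rows = rdd.mapPartitions(parse_csv_lines)
    return spark.createDataFrame(rows, schema=column_names)
```

With a running session this could be called as {{csv_rdd_to_dataframe(spark, sc.parallelize(["1,a", "2,b"]), ["id", "name"])}}. The proposal in this comment would presumably make such a helper unnecessary by letting {{spark.read.csv}} take the RDD directly, mirroring how {{spark.read.json}} already accepts an RDD of JSON strings.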

> Add missing method to pyspark api: spark.read.csv(Dataset<String>)
> ------------------------------------------------------------------
>
>                 Key: SPARK-22112
>                 URL: https://issues.apache.org/jira/browse/SPARK-22112
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.2.0
>            Reporter: Andrew Ash
>
> https://issues.apache.org/jira/browse/SPARK-15463 added a method to the Scala 
> API without adding an equivalent in PySpark: 
> {{spark.read.csv(Dataset<String>)}}
> I was writing some code in PySpark but had to switch to Scala/Java to 
> use that method. Since parity between Python, Java, and Scala is a Spark 
> goal, we should make sure this functionality exists in all the supported 
> languages.
> https://github.com/apache/spark/pull/16854/files#diff-f70bda59304588cc3abfa3a9840653f4R408
> cc [~hyukjin.kwon]



