[jira] [Commented] (SPARK-26022) PySpark Comparison with Pandas

2019-09-18 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932610#comment-16932610
 ] 

Xiao Li commented on SPARK-26022:
-

Koalas ([https://github.com/databricks/koalas]) is intended to close this gap.

> PySpark Comparison with Pandas
> --
>
> Key: SPARK-26022
> URL: https://issues.apache.org/jira/browse/SPARK-26022
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.0.0
>Reporter: Xiao Li
>Assignee: Hyukjin Kwon
>Priority: Major
>
> It would be very nice if we could have a doc like
> https://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html to show
> the API differences between PySpark and Pandas.
> Reference:
> https://www.kdnuggets.com/2016/01/python-data-science-pandas-spark-dataframe-differences.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26022) PySpark Comparison with Pandas

2019-01-07 Thread Hyukjin Kwon (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735618#comment-16735618
 ] 

Hyukjin Kwon commented on SPARK-26022:
--

I'm working on this.







[jira] [Commented] (SPARK-26022) PySpark Comparison with Pandas

2018-11-12 Thread Hyukjin Kwon (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684502#comment-16684502
 ] 

Hyukjin Kwon commented on SPARK-26022:
--

Yup, will do. Thanks for cc'ing me.







[jira] [Commented] (SPARK-26022) PySpark Comparison with Pandas

2018-11-12 Thread Xiao Li (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16684283#comment-16684283
 ] 

Xiao Li commented on SPARK-26022:
-

[~hyukjin.kwon] Could you lead this effort to help the community create such a
doc and show the API/semantics differences between PySpark and Pandas? It will
help the community migrate their workloads from Pandas to PySpark.



