I'm looking into using the Python interface to Spark and came across this 
chart [1] showing a performance hit when using Python RDDs. The data is about 
7 years old and is for an older version of Spark. Is this still the case with 
more recent Spark releases?

I'm trying to understand what performance to expect from Python with Spark, 
and under what conditions.

[1] https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html

Thanks,
//hinko