From a Python developer's perspective, this direction makes sense to me.
Since pandas is almost the standard library in this area, having PySpark
support the pandas API out of the box would take usability to a higher level.
For maintenance cost, IIUC, there are some Spark committers in the commun
I think having pandas support inside Spark makes sense. One question I
have: who are the major contributors to this effort? Is the community
developing the pandas API layer for Spark interested in being part of
Spark, or would they prefer to keep their own release cycle?
On Sat, Mar 13, 2021 at 5
Hi all,
I would like to start a discussion on supporting a pandas API layer on
Spark.
If there is general consensus on having it in PySpark, I will initiate and
drive an SPIP with a detailed explanation of the implementation’s
overview and structure.
I would appreciate it if I can know whe