Yes indeed, very good points by Artemis User.
Just to add if I may: why choose Spark at all? Generally, a parallel
architecture comes into play when the data is too large to be handled on a
single machine; that is when using Spark becomes meaningful. In cases where
the data fits comfortably on one machine, Pandas is usually the simpler choice.
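To make the data-size point concrete, here is a minimal sketch of one rough heuristic (an assumption on my part, not a hard rule): measure the DataFrame's in-memory footprint and compare it against available RAM before reaching for a cluster.

```python
import pandas as pd

# Rough heuristic: if the dataset's in-memory footprint fits comfortably
# in RAM, Pandas on one machine is usually enough; if it is many times
# larger than RAM, Spark's distributed execution starts to pay off.
df = pd.DataFrame({"x": range(1_000_000)})

# deep=True also counts the actual bytes of object (string) columns.
mb = df.memory_usage(deep=True).sum() / 1e6
print(f"approx in-memory size: {mb:.1f} MB")
```

In practice the working set can be several times the raw size (joins, copies, intermediate results), so leave generous headroom when applying a rule of thumb like this.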
PySpark still uses Spark DataFrames underneath (it wraps the Java/Scala
code). Use PySpark when you have to deal with big-data ETL and analytics, so
you can leverage Spark's distributed architecture. If your job is simple, the
dataset is relatively small, and it doesn't require distributed processing,
use Pandas instead.
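To illustrate the trade-off, here is a minimal sketch of the same aggregation in both libraries. The Pandas version runs in-process; the PySpark equivalent (shown in comments, since it needs a running Spark session) looks almost identical but executes as a distributed job. The column names are made up for the example.

```python
import pandas as pd

# Small dataset: Pandas runs in-process on a single machine.
df = pd.DataFrame({"dept": ["a", "a", "b"], "salary": [10, 20, 30]})
result = df.groupby("dept", as_index=False)["salary"].mean()
print(result)

# The PySpark equivalent (requires a Spark session) is nearly the same
# API surface, but the work is planned and run across a cluster:
#
#   from pyspark.sql import SparkSession, functions as F
#   spark = SparkSession.builder.getOrCreate()
#   sdf = spark.createDataFrame(df)
#   sdf.groupBy("dept").agg(F.mean("salary").alias("salary")).show()
```

For a few thousand rows the Spark version is actually slower, because of JVM startup, serialization, and job-scheduling overhead; that overhead only amortizes once the data outgrows one machine.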
Hello team
Someone asked me about well-developed Python code using Pandas DataFrames,
and how that compares to PySpark.
Under what situations would one choose PySpark instead of plain Python and Pandas?
Appreciate any input,
AK