I have come across teams that use Python with Spark for ETL, for example
to process data from S3 buckets and load it into Snowflake.

The only reason I can think of for choosing Python over Scala is that
the teams are more familiar with Python. Spark itself is written in
Scala, which is one reason I think Scala has an edge.

I have not done a one-to-one comparison of Spark with Scala vs Spark with
Python. I understand that for data science purposes most libraries, such
as TensorFlow, are primarily used from Python, but I am at a loss to
understand the case for using Python with Spark for ETL purposes.

This is my understanding rather than established fact, so I would like to
get some informed views on this, if I can.

Many thanks,

Mich




LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

