Hi,

This is a general question about moving a Spark SQL query to PySpark; if needed, I can add more details from the error logs and the query syntax. I'm trying to move a Spark SQL query to run through PySpark. The query syntax and the Spark configuration are the same, but for some reason the query fails with a Java heap space error when run through PySpark. In the Spark SQL version I use INSERT OVERWRITE on a partition, while in PySpark I use a DataFrame to write the data to a specific location in S3.
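For context, this is a sketch of the kind of submit-time configuration I'm comparing between the two runs. The memory values and script name are illustrative, not my actual settings; `spark.sql.sources.partitionOverwriteMode=dynamic` is the setting that makes a DataFrame `overwrite` behave like Hive-style `INSERT OVERWRITE` on a single partition rather than wiping the whole table path:

```shell
# Illustrative spark-submit config (values are placeholders, not my real job)
spark-submit \
  --driver-memory 8g \
  --executor-memory 8g \
  --conf spark.sql.sources.partitionOverwriteMode=dynamic \
  my_job.py
```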
Are there any configuration differences that you think I might need to change?

Thanks,
--
Tzahi
Data Engineer