Re: Trouble with PySpark UDFs and SPARK_HOME only on EMR

2017-06-22 Thread Nicholas Chammas
Here’s a repro for a very similar issue where Spark hangs on the UDF, which I think is related to the SPARK_HOME issue. I posted the repro on the EMR forum, but in case you can’t access it: 1. I’m running EMR 5.6.0, Spark 2.1.1, and

Trouble with PySpark UDFs and SPARK_HOME only on EMR

2017-06-22 Thread Nick Chammas
I’m seeing a strange issue on EMR which I posted about here. In brief, when I try to import a UDF I’ve defined, Python somehow fails to find Spark. This exact code works for me locally and works on our on-premises CDH