Re: UDF issues with spark

2017-12-10 Thread Daniel Haviv
Some code would help to debug the issue.

On Fri, 8 Dec 2017 at 21:54 Afshin, Bardia <bardia.afs...@changehealthcare.com> wrote:

> Using pyspark cli on spark 2.1.1 I’m getting out of memory issues when
> running the udf function on a recordset count of 10 with a mapping of the
> same value

UDF issues with spark

2017-12-08 Thread Afshin, Bardia
Using the pyspark CLI on Spark 2.1.1, I’m getting out-of-memory errors when running a UDF on a recordset count of 10 with a mapping of the same value (arbitrary, for testing purposes). This is on Amazon EMR release label 5.6.0 with the following hardware specs: m4.4xlarge, 32 vCPU, 64 GiB
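
The post does not include its code, so the following is only a minimal sketch of the kind of test it describes, not the poster's actual script. The dictionary `MAPPING`, the column names `key` and `value`, and the helpers `map_value` and `run` are all hypothetical; the sketch assumes a 10-row DataFrame and a 10-entry lookup applied through a Python UDF.

```python
# Hypothetical reproduction sketch; not the original poster's code.
MAPPING = {i: i for i in range(10)}  # assumed: 10 keys, each mapped to itself


def map_value(key):
    """Plain-Python lookup that the UDF wraps; returns None for unknown keys."""
    return MAPPING.get(key)


def run(spark):
    """Apply map_value as a UDF to a 10-row DataFrame and return the rows."""
    # Imported here so map_value can be exercised without a Spark installation.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    # Assumed schema: a single integer column named "key".
    df = spark.createDataFrame([(i,) for i in range(10)], ["key"])
    mapped = udf(map_value, IntegerType())
    return df.withColumn("value", mapped(df["key"])).collect()
```

In the pyspark shell, where a `spark` session is already defined, this would be called as `run(spark)`. Posting something of this size alongside the full error output is the kind of detail the reply asks for.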