Re: Using Lambda function to generate random data in PySpark throws not defined error

2020-12-12 Thread Sofia’s World
Hi Mich, I don't think it's a good idea... I believe your IDE is playing tricks on you. Take Spark out of the equation; this is a Python issue only. I am guessing your IDE is somehow messing up your environment. If you take out the whole Spark code and replace it with this code: map(lambda x:
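The message is cut off in the archive, so the snippet it refers to is not recoverable. As a rough illustration of the point being made (not the code from the message), a plain-Python check with no Spark involved might look like the sketch below. The name `num_rows` is a hypothetical stand-in for `numRows`.

```python
import random

# Hypothetical stand-in for numRows, defined at module level.
num_rows = 10

# The lambda looks up num_rows when it is called. Because the name exists
# in the enclosing module scope, no NameError is raised here.
values = list(map(lambda x: (x, random.randint(0, num_rows)), range(num_rows)))

print(len(values))
```

If this runs cleanly in the same IDE session but the Spark version does not, the difference is more likely in how the environment defines or loses the variable than in Spark itself.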

Re: Using Lambda function to generate random data in PySpark throws not defined error

2020-12-12 Thread Mich Talebzadeh
I solved the issue of the variable numRows not being defined within the lambda function by declaring it as a global variable:
global numRows
numRows = 10 ## do in increments of 50K rows, otherwise you blow up driver memory!
# Then I could call it within the lambda function as follows
rdd =
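A global works, but one possible alternative is to pass the row count as a function parameter so the lambda captures it as an ordinary closure variable. The sketch below is plain Python and is not the code from this message (which is truncated in the archive); `make_rows`, `random_string`, and the row layout are hypothetical names chosen for illustration.

```python
import random
import string


def make_rows(num_rows, seed=None):
    """Build num_rows tuples of (id, random int, random 5-letter string)."""
    rng = random.Random(seed)  # local generator, so results can be reproduced

    def random_string(length):
        return "".join(rng.choice(string.ascii_letters) for _ in range(length))

    # num_rows is a parameter of the enclosing function, so the lambda
    # sees it through the closure and no global declaration is needed.
    return list(
        map(
            lambda i: (i, rng.randint(0, num_rows), random_string(5)),
            range(1, num_rows + 1),
        )
    )


rows = make_rows(10, seed=42)
print(rows[0])
```

The same idea should carry over to an RDD `map` call, since the lambda would then reference a local name rather than depending on a module-level global being present.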