Re: Using Lambda function to generate random data in PySpark throws not defined error

2020-12-11 Thread Mich Talebzadeh
Many thanks KR. If I call the clustered function on its own it works:

numRows = 10
print(uf.clustered(200, numRows))

and returns 0.00199. If I run it all in one, including the UsedFunctions class, in the same .py file, it works. The code is attached. However, in PyCharm, I do the following

Re: Using Lambda function to generate random data in PySpark throws not defined error

2020-12-11 Thread Sofia’s World
Copying and pasting your code in a Jupyter notebook works fine; that is, using my own version of Range, which is simply a list of numbers. How about this, does this work fine?

list(map(lambda x: (x, clustered(x, numRows)), [1, 2, 3, 4]))

If it does, I'd look at what's inside your Range and what you
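Sofia's suggested check runs in plain Python with no Spark involved, which isolates the name-resolution question from the RDD machinery. A minimal sketch; `clustered` below is a hypothetical stand-in, since the real body in the poster's UsedFunctions.py is not shown in full in the thread:

```python
import math

numRows = 10

def clustered(x, numRows):
    # Hypothetical placeholder for uf.clustered; only the call
    # pattern matters here, not the exact arithmetic.
    return math.floor(x) / (100 * numRows)

# Map the lambda over a plain list instead of the poster's Range.
pairs = list(map(lambda x: (x, clustered(x, numRows)), [1, 2, 3, 4]))
print(pairs)
```

If this works but the RDD version does not, the problem lies in how `Range` is built or in how the function is made visible to the Spark job, not in the lambda itself.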

Re: Using Lambda function to generate random data in PySpark throws not defined error

2020-12-11 Thread Mich Talebzadeh
Sorry, part of the code is not that visible:

rdd = sc.parallelize(Range). \
    map(lambda x: (x, uf.clustered(x, numRows), \
        uf.scattered(x, 1), \
        uf.randomised(x, 1), \
        uf.randomString(50), \

Re: Using Lambda function to generate random data in PySpark throws not defined error

2020-12-11 Thread Mich Talebzadeh
Thanks Sean. This is the code:

numRows = 10  ## do in increments of 50K rows otherwise you blow up driver memory!
#
## Check if table exists, otherwise create it
rows = 0
sqltext = ""
if (spark.sql(f"SHOW TABLES IN {DB} like '{tableName}'").count() == 1):
    rows = spark.sql(f"""SELECT

Re: Using Lambda function to generate random data in PySpark throws not defined error

2020-12-11 Thread Sean Owen
Looks like a simple Python error - you haven't shown the code that produces it. Indeed, I suspect you'll find there is no such symbol.

On Fri, Dec 11, 2020 at 9:09 AM Mich Talebzadeh wrote:
> Hi,
> This used to work but not anymore.
> I have UsedFunctions.py file that has these functions

Using Lambda function to generate random data in PySpark throws not defined error

2020-12-11 Thread Mich Talebzadeh
Hi, This used to work but not anymore. I have a UsedFunctions.py file that has these functions:

import random
import string
import math

def randomString(length):
    letters = string.ascii_letters
    result_str = ''.join(random.choice(letters) for i in range(length))
    return result_str

def
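The `randomString` fragment quoted above is self-contained and can be run as-is; a minimal sketch, reproducing only the function the preview shows in full (the remaining UsedFunctions helpers are cut off by the archive and are not reconstructed here):

```python
import random
import string

def randomString(length):
    # Build a string of `length` characters drawn uniformly at random
    # from the ASCII letters a-z and A-Z.
    letters = string.ascii_letters
    result_str = ''.join(random.choice(letters) for i in range(length))
    return result_str

print(randomString(50))
```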

Re: mysql connector java issue

2020-12-11 Thread Artemis User
Well, this just won't work when you are running Spark on Hadoop... On 12/10/20 9:14 PM, lec ssmi wrote: If you cannot assemble the jdbc driver jar into your application jar package, you can put the jdbc driver jar on the Spark classpath, generally $SPARK_HOME/jars or $SPARK_HOME/lib.
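As an alternative to copying the connector jar into $SPARK_HOME/jars, the driver jar can be supplied per application via Spark configuration, which also works on a cluster where you do not control the Spark install. A sketch of the relevant settings; the jar path below is a hypothetical placeholder, substitute your actual connector jar:

```
# conf/spark-defaults.conf (hypothetical path; adjust to your install)
spark.jars            /path/to/mysql-connector-java.jar

# Equivalent one-off form on the spark-submit command line:
#   spark-submit --jars /path/to/mysql-connector-java.jar your_app.py
```

With either form, Spark distributes the jar to the driver and executors for that application, so the JDBC driver class is resolvable at runtime.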