Many thanks KR.
If I call the clustered function on its own it works:
numRows = 10
print(uf.clustered(200,numRows))
and returns
0.00199
If I run it all in one, including the UsedFunctions class in the same py file, it
works. The code is attached.
However, in PyCharm, I do the following
UsedFunc
Copying and pasting your code in a Jupyter notebook works fine, that is,
using my own version of Range which is simply a list of numbers.
How about this, does this work fine?
list(map(lambda x: (x, clustered(x, numRows)), [1, 2, 3, 4]))
If it does, I'd look at what's inside your Range and what you get.
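A quick sanity check along those lines, assuming Range is defined somewhere in the driver script (the value below is just a stand-in):

Range = [1, 2, 3, 4]       # stand-in for whatever the script actually builds
print(type(Range))         # should be list (or range), not an unbound name
print(list(Range)[:5])     # first few elements, to confirm the contents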
Sorry, part of the code is not that visible
rdd = sc.parallelize(Range). \
    map(lambda x: (x, uf.clustered(x, numRows), \
                      uf.scattered(x, 1), \
                      uf.randomised(x, 1), \
                      uf.randomString(50), \
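For reference, a self-contained sketch of the same map-over-RDD pattern; the function bodies below are hypothetical stubs, not the real UsedFunctions code:

import random
import string
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-sketch").getOrCreate()
sc = spark.sparkContext

# Hypothetical stubs standing in for the real UsedFunctions module
def clustered(x, num_rows):
    return x / num_rows

def randomString(length):
    return ''.join(random.choice(string.ascii_letters) for _ in range(length))

numRows = 10
rdd = sc.parallelize(range(1, numRows + 1)) \
        .map(lambda x: (x, clustered(x, numRows), randomString(50)))
print(rdd.take(3))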
Thanks Sean,
This is the code:

numRows = 10  ## do in increments of 50K rows otherwise you blow up driver memory!
#
## Check if table exists otherwise create it
rows = 0
sqltext = ""
if (spark.sql(f"SHOW TABLES IN {DB} like '{tableName}'").count() == 1):
    rows = spark.sql(f"""SELECT COUNT(1
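For context, the usual shape of that check-or-create block is roughly the following; DB and tableName are placeholders, and the COUNT query is only one plausible way the truncated statement could continue:

from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
DB = "test"            # placeholder database name
tableName = "dummy"    # placeholder table name

if spark.sql(f"SHOW TABLES IN {DB} LIKE '{tableName}'").count() == 1:
    # count existing rows; a plausible completion of the truncated query above
    rows = spark.sql(f"SELECT COUNT(1) FROM {DB}.{tableName}").collect()[0][0]
    print(f"{tableName} already exists with {rows} rows")
else:
    spark.sql(f"CREATE TABLE {DB}.{tableName} (id INT) USING parquet")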
Looks like a simple Python error - you haven't shown the code that produces
it. Indeed, I suspect you'll find there is no such symbol.
On Fri, Dec 11, 2020 at 9:09 AM Mich Talebzadeh wrote:
> Hi,
>
> This used to work but not anymore.
>
> I have a UsedFunctions.py file that has these functions
>
>
Hi,
This used to work but not anymore.
I have a UsedFunctions.py file that has these functions:
import random
import string
import math
def randomString(length):
    letters = string.ascii_letters
    result_str = ''.join(random.choice(letters) for i in range(length))
    return result_str

def cl
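For what it's worth, the usual way to wire such a module up from a separate driver script looks like this; the directory path is a placeholder:

import sys

# If UsedFunctions.py is not on the interpreter's search path (as can happen
# in a PyCharm project), add its directory first; the path is a placeholder.
sys.path.append("/path/to/dir/containing/UsedFunctions")
import UsedFunctions as uf

numRows = 10
print(uf.clustered(200, numRows))   # 0.00199 in the standalone test above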
Well, this just won't work when you are running Spark on Hadoop...
On 12/10/20 9:14 PM, lec ssmi wrote:
If you cannot assemble the jdbc driver jar into your application jar
package, you can put the jdbc driver jar on the Spark classpath,
generally $SPARK_HOME/jars or $SPARK_HOME/lib.
Artemi
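For reference, a config-based variant of the suggestion quoted above; whether it suits a given Hadoop deployment is a separate question, and the jar path is a placeholder:

from pyspark.sql import SparkSession

# Point the session at the JDBC driver jar explicitly so Spark ships it to
# the executors, instead of copying it into $SPARK_HOME/jars by hand.
spark = (SparkSession.builder
         .appName("jdbc-classpath-example")
         .config("spark.jars", "/path/to/jdbc-driver.jar")
         .getOrCreate())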