Hi Gourav,
Do you mean by setting up a different Python environment while running PySpark?
Cheers, Gorka.
From: Gourav Sengupta [gourav.sengu...@gmail.com]
Sent: 17 April 2019 10:06
To: Gorka Bravo Martinez
Cc: user@spark.apache.org
Subject: Re: Boto3 library
Hi all,
I would like to ship the boto/boto3 library to the executors while running
PySpark in YARN client mode. How can this be done?
I am aware that sc.addFile() can add a .py file; does the same work for a library?
Cheers, Gorka.
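
One possible approach, sketched here under assumptions rather than as a confirmed answer from the thread: zip the package and hand the archive to sc.addPyFile(), which accepts .py, .zip and .egg files and makes them importable on the executors. The helper names zip_package and ship_to_executors are hypothetical. This only works for pure-Python packages, and boto3's own dependencies (botocore and others) would need to go into the archive as well.

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Zip a pure-Python package directory so it can be imported from the archive."""
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the parent directory so that the
                # package name is the top-level entry inside the zip.
                zf.write(full, os.path.relpath(full, parent))
    return zip_path


def ship_to_executors(sc, zip_path):
    """Distribute the archive; sc is an existing SparkContext (hypothetical helper)."""
    # addPyFile adds the zip to the Python path of the tasks that run later.
    sc.addPyFile(zip_path)
```

The same archive can also be passed at submit time with spark-submit --py-files deps.zip instead of calling addPyFile from the driver.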
Hi,
I am trying to read gzipped JSON data from S3. My idea is to do something like:
data = (s3_keys
        .mapValues(lambda x: (x, s3_read_data(x)))
)
For that I thought about using sc.textFile instead of s3_read_data, but that
does not work. Any idea how to achieve this?
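
A note on why sc.textFile fails here: the SparkContext exists only on the driver, so it cannot be called from inside a function passed to mapValues. A minimal sketch of one workaround is to fetch and decode each object on the executors with boto3. The assumptions are loud ones: the values in s3_keys are S3 object keys, each object is gzipped JSON Lines (one document per line), "my-bucket" is a placeholder, and boto3 plus credentials are available on the executors.

```python
import gzip
import json


def parse_gzipped_json_lines(raw_bytes):
    """Decompress gzip bytes and parse one JSON document per non-empty line."""
    text = gzip.decompress(raw_bytes).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def s3_read_data(key, bucket="my-bucket"):  # "my-bucket" is a placeholder
    """Download one S3 object and return its parsed JSON records."""
    # Imported here so the import runs on the executor, not only on the driver.
    import boto3

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return parse_gzipped_json_lines(body)


# On the driver, with s3_keys an RDD of (id, s3_key) pairs (assumed shape):
# data = s3_keys.mapValues(s3_read_data)
```

If the list of objects is known on the driver, another option is to call sc.textFile there with a comma-separated list of paths; files ending in .gz are decompressed transparently.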