Re: Reading RDD by (key, data) from s3

2019-04-16 Thread yujhe.li
You can't: SparkContext is a singleton object that exists only on the driver, so it cannot be used inside a transformation. You have to use the Hadoop filesystem library or an AWS client to read the files from S3 on the executors.
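
A minimal sketch of that client-based approach (not from the thread), assuming boto3 is available on the executors, that s3_keys is the pair RDD from the question below with S3 object keys as values, and that "my-bucket" is a placeholder bucket name:

    import boto3

    def fetch_from_s3(object_key):
        # The client is created inside executor code; SparkContext is not available here.
        s3 = boto3.client("s3")
        return s3.get_object(Bucket="my-bucket", Key=object_key)["Body"].read()

    # Replace each value (an S3 object key) with the raw bytes of that object.
    data = s3_keys.mapValues(fetch_from_s3)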

Reading RDD by (key, data) from s3

2019-04-16 Thread Gorka Bravo Martinez
Hi, I am trying to read gzipped JSON data from S3. My idea would be to do something like:

    data = s3_keys.mapValues(lambda x: s3_read_data(x))

For s3_read_data I thought about using sc.textFile, but that wouldn't work. Any idea how to achieve this?
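
A sketch of what the intended pipeline could look like with an AWS client on the executors (per the reply above), assuming the values of s3_keys are S3 object keys pointing at gzip-compressed JSON files and that "my-bucket" is a placeholder:

    import gzip
    import json

    import boto3

    def read_gzipped_json(pairs):
        # One boto3 client per partition; sc.textFile cannot be called here
        # because SparkContext only exists on the driver.
        s3 = boto3.client("s3")
        for key, object_key in pairs:
            raw = s3.get_object(Bucket="my-bucket", Key=object_key)["Body"].read()
            yield key, json.loads(gzip.decompress(raw).decode("utf-8"))

    data = s3_keys.mapPartitions(read_gzipped_json)

Using mapPartitions lets one client be reused for all keys in a partition instead of creating a new client per record.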