Have a look at this SO question; it discusses various ways of accessing S3:
http://stackoverflow.com/questions/24048729/how-to-read-input-from-s3-in-a-spark-streaming-ec2-cluster-application
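For the archives, here is a sketch of the two approaches most commonly suggested for s3n:// access (not quoted from the SO answers; the key values are placeholders, and the spark.hadoop.* passthrough assumes your Spark version supports it):

```shell
# Option 1: export the credentials as environment variables before
# launching the shell, so the Hadoop S3 connector can pick them up.
export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY
export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY

# Option 2: set the Hadoop S3 properties explicitly when launching
# PySpark (placeholder values again).
pyspark --conf spark.hadoop.fs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY \
        --conf spark.hadoop.fs.s3n.awsSecretAccessKey=YOUR_SECRET_KEY
```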
Thanks
Best Regards
On Fri, May 8, 2015 at 1:21 AM, in4maniac sa...@skimlinks.com wrote:
Hi guys,
I realised it was a bug in my code that caused the failure: I was running the
filter on a SchemaRDD when I was supposed to be running it on an RDD.
But I still don't understand why the stderr was about an S3 request rather than
a type-checking error such as "no tuple position".
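One likely reason (my sketch, not from the thread): Spark transformations are lazy, so a bad filter doesn't fail where it is defined; it fails inside the task that finally pulls data, i.e. the stage reading from S3, which is why the surrounding stderr is about the S3 read. A pure-Python generator analogy, with made-up record data:

```python
def read_records():
    # Stands in for the lazy S3 read (like sc.textFile): nothing
    # actually happens until a downstream consumer asks for data.
    yield from [("a", 1), ("b", 2)]

# Building the pipeline with a wrong assumption about the element
# type raises nothing yet -- generators are lazy, like RDDs.
pipeline = (rec.strip() for rec in read_records())

try:
    next(pipeline)  # the bug only surfaces when data is demanded
except AttributeError as err:
    # prints: failed during consumption: 'tuple' object has no attribute 'strip'
    print("failed during consumption:", err)
```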
Hi Guys,
I think this problem is related to:
http://apache-spark-user-list.1001560.n3.nabble.com/AWS-Credentials-for-private-S3-reads-td8689.html
I am running pyspark 1.2.1 in AWS with my AWS credentials exported to the
master node as environment variables.
Halfway through my application, I