Re: Spark on EMR with S3 example (Python)

2015-07-15 Thread Akhil Das
on Amazon. Do I still need to provide the keys? Thank you, From: Sujit Pal [mailto:sujitatgt...@gmail.com] Sent: Tuesday, July 14, 2015 3:14 PM To: Pagliari, Roberto Cc: user@spark.apache.org Subject: Re: Spark on EMR with S3 example (Python) Hi Roberto, I have written

Re: Spark on EMR with S3 example (Python)

2015-07-15 Thread Sujit Pal
to provide the keys? Thank you, From: Sujit Pal [mailto:sujitatgt...@gmail.com] Sent: Tuesday, July 14, 2015 3:14 PM To: Pagliari, Roberto Cc: user@spark.apache.org Subject: Re: Spark on EMR with S3 example (Python) Hi Roberto, I have written PySpark code that reads

Re: Spark on EMR with S3 example (Python)

2015-07-14 Thread Sujit Pal
Hi Roberto, I have written PySpark code that reads from private S3 buckets; it should be similar for public S3 buckets as well. You need to set the AWS access and secret keys on the SparkContext; then you can access the S3 folders and files by their s3n:// paths. Something like this: sc =
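
The preview above is cut off before the code, so what follows is not Sujit's code. It is a minimal sketch of the approach he describes, assuming the credentials are passed through the Hadoop configuration properties fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey. The helper names (s3n_credential_props, set_s3n_credentials, count_lines) are made up for illustration, and sc._jsc is a private PySpark attribute that may behave differently across Spark versions.

```python
# Hadoop property names read by the s3n:// filesystem connector.
S3N_ACCESS_KEY_PROP = "fs.s3n.awsAccessKeyId"
S3N_SECRET_KEY_PROP = "fs.s3n.awsSecretAccessKey"


def s3n_credential_props(access_key, secret_key):
    """Return the Hadoop properties that carry the AWS credentials.

    Hypothetical helper, not part of PySpark.
    """
    return {
        S3N_ACCESS_KEY_PROP: access_key,
        S3N_SECRET_KEY_PROP: secret_key,
    }


def set_s3n_credentials(sc, access_key, secret_key):
    """Copy the credentials into the Hadoop configuration behind a SparkContext."""
    # Assumption: sc._jsc is the underlying JavaSparkContext. It is a private
    # attribute, so this is a common workaround rather than a public API.
    hadoop_conf = sc._jsc.hadoopConfiguration()
    for name, value in s3n_credential_props(access_key, secret_key).items():
        hadoop_conf.set(name, value)


def count_lines(s3n_path, access_key, secret_key):
    """Count the lines under an s3n:// path. Needs pyspark and S3 access."""
    # Imported here so the two helpers above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="s3n-example")
    try:
        set_s3n_credentials(sc, access_key, secret_key)
        return sc.textFile(s3n_path).count()
    finally:
        sc.stop()
```

A call would look like count_lines("s3n://some-bucket/some/prefix/*", access_key, secret_key), where the bucket name and both keys are placeholders you supply yourself. Keeping the property names in one helper makes it easy to check what is being set before a cluster is involved.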

RE: Spark on EMR with S3 example (Python)

2015-07-14 Thread Pagliari, Roberto
Hi Sujit, I just wanted to access public datasets on Amazon. Do I still need to provide the keys? Thank you, From: Sujit Pal [mailto:sujitatgt...@gmail.com] Sent: Tuesday, July 14, 2015 3:14 PM To: Pagliari, Roberto Cc: user@spark.apache.org Subject: Re: Spark on EMR with S3 example (Python