Re: Role-based S3 access outside of EMR

2016-08-14 Thread Steve Loughran
On 29 Jul 2016, at 00:07, Everett Anderson <ever...@nuna.com.invalid> wrote: Hey, Just wrapping this up -- I ended up following the instructions to build a custom Spark release with Hadoop 2.7.2, stealing from Steve's SPARK-7481

Re: Role-based S3 access outside of EMR

2016-07-28 Thread Everett Anderson
> >>
> >> On Tue, Jul 19, 2016 at 2:47 PM, Andy Davidson wrote:
> >>>
> >>> Hi Everett
> >>>
> >>> I always do my initial data exploration and all our product development
> >>> in my local dev env. I

Re: Role-based S3 access outside of EMR

2016-07-23 Thread Steve Loughran
apache.org>> Subject: Re: Role-based S3 access outside of EMR Hey, FWIW, we are using EMR, actually, in production. The main case I have for wanting to access S3 with Spark outside of EMR is that during development, our developers tend to run EC2 sandbox instances that have all the res

RE: Role-based S3 access outside of EMR

2016-07-21 Thread Ewan Leith
Sengupta Cc: Teng Qiu; Andy Davidson; user Subject: Re: Role-based S3 access outside of EMR Hey, FWIW, we are using EMR, actually, in production. The main case I have for wanting to access S3 with Spark outside of EMR is that during development, our developers tend to run EC2 sandbox instances

Re: Role-based S3 access outside of EMR

2016-07-21 Thread Everett Anderson
exploration and all our product development
>> >>> in my local dev env. I typically select a small data set and copy it to my
>> >>> local machine
>> >>>
>> >>> My main() has an optional command line argument ‘- - runLocal’

Re: Role-based S3 access outside of EMR

2016-07-21 Thread Gourav Sengupta
l machine
> >>>
> >>> My main() has an optional command line argument ‘- - runLocal’. Normally I
> >>> load data from either hdfs:/// or s3n://. If the arg is set I read from
> >>> file:///
> >>>
> >>> Sometime I us
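The run-local switch described in the quoted message could look roughly like the sketch below. It is only an illustration, not Andy's actual code: the flag is assumed to be spelled `--runLocal` (the archive shows it as ‘- - runLocal’), and the helper names `resolve_input_uri` and `parse_args`, the `--input` option, and the sample paths are all hypothetical.

```python
import argparse


def resolve_input_uri(path, run_local, remote_scheme="s3n"):
    """Build the input URI for a job.

    With run_local set, the data is read from the local filesystem
    (file:///). Otherwise it is read from the remote store, either
    hdfs:/// or <scheme>://bucket/key. All names here are hypothetical.
    """
    relative = path.lstrip("/")
    if run_local:
        return "file:///" + relative
    if remote_scheme == "hdfs":
        # hdfs:/// uses the cluster's default namenode
        return "hdfs:///" + relative
    # For S3-style schemes the first path component is the bucket name
    return remote_scheme + "://" + relative


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical job entry point")
    parser.add_argument("--runLocal", action="store_true",
                        help="read input from file:/// instead of hdfs or S3")
    parser.add_argument("--input", default="data/sample.json",
                        help="input path (hypothetical default)")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    # The resulting URI would then be handed to the Spark reader
    print(resolve_input_uri(args.input, args.runLocal))
```

Keeping the scheme choice in one small function means the rest of the job never needs to know whether it is running on a laptop or on the cluster.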

Re: Role-based S3 access outside of EMR

2016-07-21 Thread Teng Qiu
So in your case I would log into my data cluster and use “aws s3 cp” to
>>> copy the data into my cluster and then use “SCP” to copy the data from the
>>> data center back to my local env.
>>>
>>> Andy
>>>
>>> From: Everett Anderson

Re: Role-based S3 access outside of EMR

2016-07-20 Thread Gourav Sengupta
cluster and then use “SCP” to copy the data from the
>> data center back to my local env.
>>
>> Andy
>>
>> From: Everett Anderson
>> Date: Tuesday, July 19, 2016 at 2:30 PM
>> To: "user @spark"
>> Subject: Role-based S3 access outside

Re: Role-based S3 access outside of EMR

2016-07-20 Thread Everett Anderson
copy the data from the
> data center back to my local env.
>
> Andy
>
> From: Everett Anderson
> Date: Tuesday, July 19, 2016 at 2:30 PM
> To: "user @spark"
> Subject: Role-based S3 access outside of EMR
>
> Hi,
>
> When running on EMR, AWS configu

Re: Role-based S3 access outside of EMR

2016-07-19 Thread Andy Davidson
Date: Tuesday, July 19, 2016 at 2:30 PM
To: "user @spark"
Subject: Role-based S3 access outside of EMR
> Hi,
>
> When running on EMR, AWS configures Hadoop to use their EMRFS Hadoop
> FileSystem implementation for s3:// URLs and seems to install the necessary S3
> c

Role-based S3 access outside of EMR

2016-07-19 Thread Everett Anderson
Hi, When running on EMR, AWS configures Hadoop to use their EMRFS Hadoop FileSystem implementation for s3:// URLs and seems to install the necessary S3 credentials properties, as well. Often, it's nice during development to run outside of a cluster even with the "local" Spark master, though, whic