Spark-csv error on read AWS s3a in spark 1.4.1

2015-11-10 Thread Zhang, Jingyu
A small csv file in S3. I use s3a://key:seckey@bucketname/a.csv It works for SparkContext pixelsStr: SparkContext = ctx.textFile(s3pathOrg); It works for Java Spark-csv as well Java code : DataFrame careerOneDF = sqlContext.read().format( "com.databricks.spark.csv")
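The inline-credentials form of the s3a URL above only parses cleanly when both keys are URL-safe. A minimal Python sketch, with made-up keys, of percent-encoding the credentials before building the URL:

```python
from urllib.parse import quote

def s3a_url(access_key, secret_key, bucket, path):
    """Build an s3a:// URL with inline credentials.

    Percent-encode both keys: secret keys may contain '/' or '+',
    which would otherwise corrupt URL parsing in the S3 filesystem
    client."""
    return "s3a://{}:{}@{}/{}".format(
        quote(access_key, safe=""), quote(secret_key, safe=""), bucket, path)

# Hypothetical keys, for illustration only.
url = s3a_url("AKIAEXAMPLE", "abc/def+ghi", "bucketname", "a.csv")
print(url)  # s3a://AKIAEXAMPLE:abc%2Fdef%2Bghi@bucketname/a.csv
```

Keeping credentials out of the URL entirely (Hadoop configuration properties or IAM roles, as other threads in this list suggest) avoids the problem altogether.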

Re: Spark task hangs infinitely when accessing S3 from AWS

2015-11-09 Thread aecc
Any help on this? this is really blocking me and I don't find any feasible solution yet. Thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-task-hangs-infinitely-when-accessing-S3-from-AWS-tp25289p25327.html Sent from the Apache Spark User List

Spark SQL 'explode' command failing on AWS EC2 but succeeding locally

2015-11-06 Thread Anthony Rose
Hi all, I am using Spark SQL and I have a table stored in a Dataframe that I am trying to re-structure. I have an approach that works locally but when I try to run the same command on an AWS EC2 instance I get an error reporting that I have an 'unresolved operator' Basically I have data

Spark task hangs infinitely when accessing S3 from AWS

2015-11-05 Thread aecc
Hi guys, when reading data from S3 from AWS using Spark 1.5.1 one of the tasks hangs when reading data in a way that cannot be reproduced. Some times it hangs, some times it doesn't. This is the thread dump from the hung task: "Executor task launch worker-3" daemon prio=10 tid=0x7f

Re: spark read data from aws s3

2015-11-03 Thread hveiga
You also need to have library hadoop-aws in your classpath. From Hadoop 2.6, the AWS libraries come in that separate library. Also, you will need this line in your hadoop configuration: hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
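One way to pull that library in, assuming Spark 1.3+ (for the --packages mechanism) and stock Apache Hadoop 2.6 artifacts; the transitive AWS SDK jars come along automatically:

```
# Sketch: fetch hadoop-aws (split out of hadoop-common in Hadoop 2.6)
# and its dependencies at shell startup
spark-shell --packages org.apache.hadoop:hadoop-aws:2.6.0
```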

Re: newbie trouble submitting java app to AWS cluster I created using spark-ec2 script from spark-1.5.1-bin-hadoop2.6 distribution

2015-10-29 Thread Andy Davidson
Hi Robin and Sabarish, I figured out what the problem is. To submit my java app so that it runs in cluster mode (i.e. I can close my laptop and go home) I need to do the following: 1. make sure my jar file is available on all the slaves. Spark-submit will cause my driver to run on a slave, it will not

newbie trouble submitting java app to AWS cluster I created using spark-ec2 script from spark-1.5.1-bin-hadoop2.6 distribution

2015-10-28 Thread Andy Davidson
Hi, I just created a new cluster using the spark-ec2 script from the spark-1.5.1-bin-hadoop2.6 distribution. The master and slaves seem to be up and running. I am having a heck of a time figuring out how to submit apps. As a test I compiled the sample JavaSparkPi example. I have copied my jar file to

Re: newbie trouble submitting java app to AWS cluster I created using spark-ec2 script from spark-1.5.1-bin-hadoop2.6 distribution

2015-10-28 Thread Andy Davidson
October 28, 2015 at 2:37 PM To: "user@spark.apache.org" <user@spark.apache.org> Subject: newbie trouble submitting java app to AWS cluster I created using spark-ec2 script from spark-1.5.1-bin-hadoop2.6 distribution > Hi > > > > I just created new cluster using the s

Re: Spark on YARN / aws - executor lost on node restart

2015-09-24 Thread Adrian Tanase
To: "user@spark.apache.org<mailto:user@spark.apache.org>" Subject: Re: Spark on YARN / aws - executor lost on node restart Hi guys, Digging up this question after spending some more time trying to replicate it. It seems to be an issue with the YARN – spark integration, wondering if there is a bug already tracking this?

Re: Spark on YARN / aws - executor lost on node restart

2015-09-18 Thread Adrian Tanase
dies completely? If there are no ideas on the list, I’ll prepare some logs and follow up with an issue. Thanks, -adrian From: Adrian Tanase Date: Wednesday, September 16, 2015 at 6:01 PM To: "user@spark.apache.org<mailto:user@spark.apache.org>" Subject: Spark on YARN / aws

Spark on YARN / aws - executor lost on node restart

2015-09-16 Thread Adrian Tanase
Hi all, We’re using spark streaming (1.4.0), deployed on AWS through yarn. It’s a stateful app that reads from kafka (with the new direct API) and we’re checkpointing to HDFS. During some resilience testing, we restarted one of the machines and brought it back online. During the offline

Re: How to access Spark UI through AWS

2015-08-25 Thread Kelly, Jonathan
seems broken), however the proxy continuously redirects me to the main page, so I cannot drill into anything. So, I tried static tunneling, but can't seem to get through. So, how can I access the spark UI when running a spark shell in AWS yarn? -- View this message in context: http://apache-spark

Re: How to access Spark UI through AWS

2015-08-25 Thread Justin Pihony
a crude UI (css seems broken), however the proxy continuously redirects me to the main page, so I cannot drill into anything. So, I tried static tunneling, but can't seem to get through. So, how can I access the spark UI when running a spark shell in AWS yarn? -- View this message

Re: How to access Spark UI through AWS

2015-08-25 Thread Justin Pihony
me to the main page, so I cannot drill into anything. So, I tried static tunneling, but can't seem to get through. So, how can I access the spark UI when running a spark shell in AWS yarn? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How

Re: How to access Spark UI through AWS

2015-08-25 Thread Justin Pihony
in AWS yarn? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-access-Spark-UI-through-AWS-tp24436.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: How to access Spark UI through AWS

2015-08-25 Thread Justin Pihony
a spark shell in AWS yarn? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-access-Spark-UI-through-AWS-tp24436.html Sent from the Apache Spark User List mailing list archive at Nabble.com

How to access Spark UI through AWS

2015-08-24 Thread Justin Pihony
. So, how can I access the spark UI when running a spark shell in AWS yarn? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-access-Spark-UI-through-AWS-tp24436.html Sent from the Apache Spark User List mailing list archive at Nabble.com

RE: Specifying the role when launching an AWS spark cluster using spark_ec2

2015-08-07 Thread Ewan Leith
You'll have a lot less hassle using the AWS EMR instances with Spark 1.4.1 for now, until the spark_ec2.py scripts move to Hadoop 2.7.1, at the moment I'm pretty sure it's only using Hadoop 2.4 The EMR setup with Spark lets you use s3:// URIs with IAM roles Ewan -Original Message

Re: Specifying the role when launching an AWS spark cluster using spark_ec2

2015-08-06 Thread Steve Loughran
know about the spark_ec2 scripts or what they start On 6 Aug 2015, at 10:27, SK skrishna...@gmail.com wrote: Hi, I need to access data on S3 from another account and I have been given the IAM role information to access that S3 bucket. From what I understand, AWS allows us to attach a role

RE: SparkR dataFrame read.df fails to read from aws s3

2015-07-09 Thread Sun, Rui
read.df fails to read from aws s3 I have Spark 1.4 deployed on AWS EMR but methods of SparkR dataFrame read.df method cannot load data from aws s3. 1) read.df error message read.df(sqlContext,s3://some-bucket/some.json,json) 15/07/09 04:07:01 ERROR r.RBackendHandler: loadDF

SparkR dataFrame read.df fails to read from aws s3

2015-07-08 Thread Ben Spark
I have Spark 1.4 deployed on AWS EMR but methods of SparkR dataFrame read.df method cannot load data from aws s3. 1) read.df error message read.df(sqlContext,s3://some-bucket/some.json,json) 15/07/09 04:07:01 ERROR r.RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed

RE: [SPARK-6330] 1.4.0/1.5.0 Bug to access S3 -- AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively)

2015-06-10 Thread Shuai Zheng
S3 -- AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively) That exception is a bit weird as it refers to fs.s3 instead of fs.s3n. Maybe

[SPARK-6330] 1.4.0/1.5.0 Bug to access S3 -- AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively)

2015-06-09 Thread Shuai Zheng
below exception: Exception in thread main java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively). I don't
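The properties named in the exception can also be supplied through Hadoop configuration rather than embedded in the URL; a core-site.xml sketch with placeholder values (for s3n:// URLs the keys are fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey instead):

```xml
<!-- core-site.xml sketch; placeholder values, not real keys -->
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>YOUR_SECRET_ACCESS_KEY</value>
</property>
```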

Re: Adding new Spark workers on AWS EC2 - access error

2015-06-04 Thread barmaley
. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Adding-new-Spark-workers-on-AWS-EC2-access-error-tp23143p23155.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: Adding new Spark workers on AWS EC2 - access error

2015-06-04 Thread Akhil Das
, barmaley o...@solver.com wrote: I have the existing operating Spark cluster that was launched with spark-ec2 script. I'm trying to add new slave by following the instructions: Stop the cluster On AWS console launch more like this on one of the slaves Start the cluster Although the new

Adding new Spark workers on AWS EC2 - access error

2015-06-03 Thread barmaley
I have the existing operating Spark cluster that was launched with spark-ec2 script. I'm trying to add new slave by following the instructions: Stop the cluster On AWS console launch more like this on one of the slaves Start the cluster Although the new instance is added to the same security

RE: Running Spark/YARN on AWS EMR - Issues finding file on hdfs?

2015-05-16 Thread jaredtims
Any resolution to this? Im having the same problem. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-Spark-YARN-on-AWS-EMR-Issues-finding-file-on-hdfs-tp10214p22918.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: AWS-Credentials fails with org.apache.hadoop.fs.s3.S3Exception: FORBIDDEN

2015-05-08 Thread Akhil Das
Guys, I think this problem is related to : http://apache-spark-user-list.1001560.n3.nabble.com/AWS-Credentials-for-private-S3-reads-td8689.html I am running pyspark 1.2.1 in AWS with my AWS credentials exported to master node as Environmental Variables. Halfway through my application, I get

Re: AWS-Credentials fails with org.apache.hadoop.fs.s3.S3Exception: FORBIDDEN

2015-05-08 Thread in4maniac
0 found in Row type was thrown. The error was kinda misleading that I kindof oversaw this logical error in my code. Just thought should keep this posted. -in4 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/AWS-Credentials-fails-with-org-apache-hadoop-fs

AWS-Credentials fails with org.apache.hadoop.fs.s3.S3Exception: FORBIDDEN

2015-05-07 Thread in4maniac
Hi Guys, I think this problem is related to : http://apache-spark-user-list.1001560.n3.nabble.com/AWS-Credentials-for-private-S3-reads-td8689.html I am running pyspark 1.2.1 in AWS with my AWS credentials exported to master node as Environmental Variables. Halfway through my application, I

Re: Issue on Spark SQL insert or create table with Spark running on AWS EMR -- s3n.S3NativeFileSystem: rename never finished

2015-04-02 Thread Wollert, Fabian
from a too-old Spark Hive version, which was used to compile the Spark version you build in your github project, or other recency problems. We suggest recompiling your Spark version with the AWS Hive version, which has the Hive adaptations you mentioned already implemented. Or what do you think? Cheers

Issue on Spark SQL insert or create table with Spark running on AWS EMR -- s3n.S3NativeFileSystem: rename never finished

2015-04-01 Thread chutium
-or-create-table-with-Spark-running-on-AWS-EMR-s3n-S3NativeFileSystem-renamd-tp22340.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org

Re: Accessing AWS S3 in Frankfurt (v4 only - AWS4-HMAC-SHA256)

2015-03-21 Thread Steve Loughran
1. make sure your secret key doesn't have a / in it. If it does, generate a new key. 2. jets3t and hadoop JAR versions need to be in sync; jets3t 0.9.0 was picked up in Hadoop 2.4 and not AFAIK 3. Hadoop 2.6 has a new S3 client, s3a, which is compatible with s3n data. It uses the AWS toolkit
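Point 1 can be shown without Hadoop at all: URL userinfo parsing follows RFC 3986, so a '/' inside an embedded secret ends the authority section early and the credentials are silently lost. A small Python demonstration with made-up keys:

```python
from urllib.parse import urlsplit

# A URL-safe secret parses as expected.
good = urlsplit("s3n://AKIA:SECRET@bucket/data.csv")
print(good.password, good.hostname)  # SECRET bucket

# A '/' in the secret ends the authority before the '@' is reached,
# so no userinfo is recognised and the path is mangled.
bad = urlsplit("s3n://AKIA:abc/def@bucket/data.csv")
print(bad.password)  # None
print(bad.path)      # /def@bucket/data.csv
```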

Re: Accessing AWS S3 in Frankfurt (v4 only - AWS4-HMAC-SHA256)

2015-03-20 Thread Gourav Sengupta
Hi Ralf, using secret keys and authorization details is a strict NO for AWS, they are major security lapses and should be avoided at any cost. Have you tried starting the clusters using ROLES, they are wonderful way to start clusters or EC2 nodes and you do not have to copy and paste any

Accessing AWS S3 in Frankfurt (v4 only - AWS4-HMAC-SHA256)

2015-03-20 Thread Ralf Heyde
Hey, We want to run a Job, accessing S3, from EC2 instances. The Job runs in a self-provided Spark Cluster (1.3.0) on EC2 instances. In Ireland everything works as expected. I just tried to move data from Ireland - Frankfurt. AWS S3 is forcing v4 of their API there, means: access is only possible

Re: Accessing AWS S3 in Frankfurt (v4 only - AWS4-HMAC-SHA256)

2015-03-20 Thread Ralf Heyde
Good Idea, will try that. But assuming, only data is located there, the problem will still occur. On Fri, Mar 20, 2015 at 3:08 PM, Gourav Sengupta gourav.sengu...@gmail.com wrote: Hi Ralf, using secret keys and authorization details is a strict NO for AWS, they are major security lapses

AWS SDK HttpClient version conflict (spark.files.userClassPathFirst not working)

2015-03-12 Thread Adam Lewandowski
I'm trying to use the AWS SDK (v1.9.23) to connect to DynamoDB from within a Spark application. Spark 1.2.1 is assembled with HttpClient 4.2.6, but the AWS SDK is depending on HttpClient 4.3.4 for it's communication with DynamoDB. The end result is an error when the app tries to connect
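Besides the spark.files.userClassPathFirst route named in the subject, a common workaround for this kind of dependency clash (an assumption here, not something this message spells out) is relocating the conflicting package when building the application jar, e.g. with the Maven Shade plugin:

```xml
<!-- pom.xml sketch: relocate Apache HttpClient inside the app jar so
     the AWS SDK uses the shaded 4.3.4 copy while Spark keeps its
     bundled 4.2.6 on the original package name -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <relocation>
        <pattern>org.apache.http</pattern>
        <shadedPattern>myapp.shaded.org.apache.http</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```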

Re: AWS SDK HttpClient version conflict (spark.files.userClassPathFirst not working)

2015-03-12 Thread 浦野 裕也
Hi Adam, Could you try building spark with profile -Pkinesis-asl. mvn -Pkinesis-asl -DskipTests clean package refers to 'Running the Example' section. https://spark.apache.org/docs/latest/streaming-kinesis-integration.html In fact, I've seen same issue and have been able to use the AWS SDK

Pushing data from AWS Kinesis - Spark Streaming - AWS Redshift

2015-03-01 Thread Mike Trienis
Hi All, I am looking at integrating a data stream from AWS Kinesis to AWS Redshift and since I am already ingesting the data through Spark Streaming, it seems convenient to also push that data to AWS Redshift at the same time. I have taken a look at the AWS kinesis connector although I am

Re: Pushing data from AWS Kinesis - Spark Streaming - AWS Redshift

2015-03-01 Thread Chris Fregly
Hey Mike- Great to see you're using the AWS stack to its fullest! I've already created the Kinesis-Spark Streaming connector with examples, documentation, test, and everything. You'll need to build Spark from source with the -Pkinesis-asl profile, otherwise they won't be included in the build

Re: How to create spark AMI in AWS

2015-02-09 Thread Guodong Wang
Linux AMI. Yes, the create_image.sh script is what is used to generate the current Spark AMI. Nick On Mon Feb 09 2015 at 3:27:13 AM Franc Carter franc.car...@rozettatech.com wrote: Hi, I'm very new to Spark, but experienced with AWS - so take that in to account with my suggestions

Re: How to create spark AMI in AWS

2015-02-09 Thread Nicholas Chammas
very new to Spark, but experienced with AWS - so take that into account with my suggestions. I started with an AWS base image and then added the pre-built Spark-1.2. I then made a 'Master' version and a 'Worker' version and then made AMIs for them. The Master comes up with a static

Re: How to create spark AMI in AWS

2015-02-09 Thread Nicholas Chammas
know is that it is an Amazon Linux AMI. Yes, the create_image.sh script is what is used to generate the current Spark AMI. Nick On Mon Feb 09 2015 at 3:27:13 AM Franc Carter franc.car...@rozettatech.com wrote: Hi, I'm very new to Spark, but experienced with AWS - so take that in to account

How to create spark AMI in AWS

2015-02-09 Thread Guodong Wang
Hi guys, I want to launch spark cluster in AWS. And I know there is a spark_ec2.py script. I am using the AWS service in China. But I can not find the AMI in the region of China. So, I have to build one. My question is 1. Where is the bootstrap script to create the Spark AMI? Is it here( https

Re: Analyzing data from non-standard data sources (e.g. AWS Redshift)

2015-01-25 Thread Denis Mikhalkin
Chammas nicholas.cham...@gmail.com To: Denis Mikhalkin deni...@yahoo.com; user@spark.apache.org user@spark.apache.org Sent: Sunday, 25 January 2015, 3:06 Subject: Re: Analyzing data from non-standard data sources (e.g. AWS Redshift) I believe databricks provides an rdd interface to redshift. Did

Re: Analyzing data from non-standard data sources (e.g. AWS Redshift)

2015-01-25 Thread Charles Feduke
*Subject:* Re: Analyzing data from non-standard data sources (e.g. AWS Redshift) I believe databricks provides an rdd interface to redshift. Did you check spark-packages.org? On Sat, 24 Jan 2015 at 6:45 AM Denis Mikhalkin deni...@yahoo.com.invalid wrote: Hello, we've got some analytics data

Re: Analyzing data from non-standard data sources (e.g. AWS Redshift)

2015-01-25 Thread Charles Feduke
, 25 January 2015, 3:06 *Subject:* Re: Analyzing data from non-standard data sources (e.g. AWS Redshift) I believe databricks provides an rdd interface to redshift. Did you check spark-packages.org? On Sat, 24 Jan 2015 at 6:45 AM Denis Mikhalkin deni...@yahoo.com.invalid wrote: Hello

Re: Analyzing data from non-standard data sources (e.g. AWS Redshift)

2015-01-24 Thread Nicholas Chammas
I believe databricks provides an rdd interface to redshift. Did you check spark-packages.org? On Sat, 24 Jan 2015 at 6:45 AM Denis Mikhalkin deni...@yahoo.com.invalid wrote: Hello, we've got some analytics data in AWS Redshift. The data is being constantly updated. I'd like to be able

Analyzing data from non-standard data sources (e.g. AWS Redshift)

2015-01-24 Thread Denis Mikhalkin
Hello, we've got some analytics data in AWS Redshift. The data is being constantly updated. I'd like to be able to write a query against Redshift which would return a subset of data, and then run a Spark job (Pyspark) to do some analysis. I could not find an RDD which would let me do it OOB

com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

2015-01-19 Thread Hafiz Mujadid
Hi all! I am trying to use kinesis and spark streaming together. So when I execute program I get exception com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain Here is my piece of code val credentials = new BasicAWSCredentials
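For context, that message is raised when the SDK's DefaultAWSCredentialsProviderChain exhausts every source it knows about. A rough Python mimic of the chain's first two stops (the real chain also consults Java system properties, the shared credentials file, and EC2 instance metadata); the function name is invented for illustration:

```python
import os

def resolve_credentials(explicit=None, env=None):
    """Mimic the first two stops of the AWS default provider chain:
    explicitly supplied credentials, then the standard env variables."""
    if explicit is not None:
        return explicit
    env = os.environ if env is None else env
    key = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        return (key, secret)
    raise RuntimeError(
        "Unable to load AWS credentials from any provider in the chain")

# Hypothetical values for illustration only.
creds = resolve_credentials(
    env={"AWS_ACCESS_KEY_ID": "AKIA", "AWS_SECRET_ACCESS_KEY": "s3cr3t"})
print(creds)  # ('AKIA', 's3cr3t')
```

Passing a BasicAWSCredentials object explicitly, as the code above does, bypasses the chain; the exception therefore usually means some other part of the app (e.g. the Kinesis client) is still relying on the chain on a node where no credentials are visible.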

Re: com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain

2015-01-19 Thread Akhil Das
20, 2015 at 12:51 PM, Hafiz Mujadid hafizmujadi...@gmail.com wrote: Hi all! I am trying to use kinesis and spark streaming together. So when I execute program I get exception com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain Here is my piece

Re: PermGen issues on AWS

2015-01-09 Thread Joe Wass
generation, so this particular type of problem and tuning is not needed. You might consider running on Java 8. On Fri, Jan 9, 2015 at 10:38 AM, Joe Wass jw...@crossref.org wrote: I'm running on an AWS cluster of 10 x m1.large (64 bit, 7.5 GiB RAM). FWIW I'm using the Flambo Clojure wrapper which

Re: PermGen issues on AWS

2015-01-09 Thread Sean Owen
. Also, Java 8 no longer has a permanent generation, so this particular type of problem and tuning is not needed. You might consider running on Java 8. On Fri, Jan 9, 2015 at 10:38 AM, Joe Wass jw...@crossref.org wrote: I'm running on an AWS cluster of 10 x m1.large (64 bit, 7.5 GiB RAM). FWIW

PermGen issues on AWS

2015-01-09 Thread Joe Wass
I'm running on an AWS cluster of 10 x m1.large (64 bit, 7.5 GiB RAM). FWIW I'm using the Flambo Clojure wrapper which uses the Java API but I don't think that should make any difference. I'm running with the following command: spark/bin/spark-submit --class mything.core --name My Thing --conf
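On Java 7 the usual remedy is raising MaxPermSize on both the driver and the executors; a hedged sketch extending the command quoted above (the jar name is a placeholder, and on Java 8 the flag is moot since PermGen was removed):

```
spark/bin/spark-submit --class mything.core --name "My Thing" \
  --driver-java-options "-XX:MaxPermSize=256m" \
  --conf spark.executor.extraJavaOptions=-XX:MaxPermSize=256m \
  my-thing.jar
```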

Are failures normal / to be expected on an AWS cluster?

2014-12-20 Thread Joe Wass
, but CANNOT FIND ADDRESS for half of the executors. Are these numbers normal for AWS? Should a certain number of faults be expected? I know that AWS isn't meant to be perfect, but this doesn't seem right. Cheers Joe

spark_ec2.py for AWS region: cn-north-1, China

2014-11-04 Thread haitao .yao
Hi, Amazon aws started to provide service for China mainland, the region name is cn-north-1. But the script spark provides: spark_ec2.py will query ami id from https://github.com/mesos/spark-ec2/tree/v4/ami-list and there's no ami information for cn-north-1 region . Can anybody update

Re: spark_ec2.py for AWS region: cn-north-1, China

2014-11-04 Thread Nicholas Chammas
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html cn-north-1 is not a supported region for EC2, as far as I can tell. There may be other AWS services that can use that region, but spark-ec2 relies on EC2. Nick On Tue, Nov 4, 2014 at 8:09 PM, haitao .yao

Re: spark_ec2.py for AWS region: cn-north-1, China

2014-11-04 Thread haitao .yao
://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html cn-north-1 is not a supported region for EC2, as far as I can tell. There may be other AWS services that can use that region, but spark-ec2 relies on EC2. Nick On Tue, Nov 4, 2014 at 8:09 PM, haitao .yao yao.e

Re: spark_ec2.py for AWS region: cn-north-1, China

2014-11-04 Thread Nicholas Chammas
nicholas.cham...@gmail.com: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html cn-north-1 is not a supported region for EC2, as far as I can tell. There may be other AWS services that can use that region, but spark-ec2 relies on EC2. Nick On Tue, Nov

Re: spark_ec2.py for AWS region: cn-north-1, China

2014-11-04 Thread haitao .yao
is not a supported region for EC2, as far as I can tell. There may be other AWS services that can use that region, but spark-ec2 relies on EC2. Nick On Tue, Nov 4, 2014 at 8:09 PM, haitao .yao yao.e...@gmail.com wrote: Hi, Amazon aws started to provide service for China mainland, the region name

RE: Running Spark/YARN on AWS EMR - Issues finding file on hdfs?

2014-10-14 Thread neeraj
I'm trying to get some workaround for this issue. Thanks and Regards, Neeraj Garg From: H4ml3t [via Apache Spark User List] [mailto:ml-node+s1001560n16379...@n3.nabble.com] Sent: Tuesday, October 14, 2014 6:53 PM To: Neeraj Garg02 Subject: Re: Running Spark/YARN on AWS EMR - Issues finding file

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

2014-10-08 Thread mrm
They reverted to a previous version of the spark-ec2 script and things are working again! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-ec2-HDFS-doesn-t-start-on-AWS-EC2-cluster-tp15921p15945.html Sent from the Apache Spark User List mailing list

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

2014-10-08 Thread Nicholas Chammas
and things are working again! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-ec2-HDFS-doesn-t-start-on-AWS-EC2-cluster-tp15921p15945.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

2014-10-08 Thread Jan Warchoł
-start-on-AWS-EC2-cluster-tp15921p15945.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h

Re: spark-ec2 - HDFS doesn't start on AWS EC2 cluster

2014-10-08 Thread Akhil Das
: They reverted to a previous version of the spark-ec2 script and things are working again! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-ec2-HDFS-doesn-t-start-on-AWS-EC2-cluster-tp15921p15945.html Sent from the Apache Spark User List mailing list archive

Spark 1.1.0 (w/ hadoop 2.4) versus aws-java-sdk-1.7.2.jar

2014-09-19 Thread tian zhang
Hi, Spark experts, I have the following issue when using aws java sdk in my spark application. Here I narrowed down the following steps to reproduce the problem 1) I have Spark 1.1.0 with hadoop 2.4 installed on 3 nodes cluster 2) from the master node, I did the following steps. spark-shell

Got error "java.lang.IllegalAccessError" when using HiveContext in Spark shell on AWS

2014-08-07 Thread Zhun Shen
Hi, When I try to use HiveContext in Spark shell on AWS, I got the error java.lang.IllegalAccessError: tried to access method com.google.common.collect.MapMaker.makeComputingMap(Lcom/google/common/base/Function;)Ljava/util/concurrent/ConcurrentMap. I follow the steps below to compile and install

Re: Got error "java.lang.IllegalAccessError" when using HiveContext in Spark shell on AWS

2014-08-07 Thread Cheng Lian
Hey Zhun, Thanks for the detailed problem description. Please see my comments inlined below. On Thu, Aug 7, 2014 at 6:18 PM, Zhun Shen shenzhunal...@gmail.com wrote: Caused by: java.lang.IllegalAccessError: tried to access method

Re: Bad Digest error while doing aws s3 put

2014-08-07 Thread lmk
://apache-spark-user-list.1001560.n3.nabble.com/Bad-Digest-error-while-doing-aws-s3-put-tp10036p11642.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org

Re: Got error "java.lang.IllegalAccessError" when using HiveContext in Spark shell on AWS

2014-08-07 Thread Zhun Shen
Hi Cheng, I replaced Guava 15.0 with Guava 14.0.1 in my spark classpath, the problem was solved. So your method is correct. It proved that this issue was caused by AWS EMR (ami-version 3.1.0) libs which include Guava 15.0. Many thanks and see you in the first Spark User Beijing Meetup tomorrow

Re: Bad Digest error while doing aws s3 put

2014-08-05 Thread lmk
: http://apache-spark-user-list.1001560.n3.nabble.com/Bad-Digest-error-while-doing-aws-s3-put-tp10036p11421.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr

Re: Bad Digest error while doing aws s3 put

2014-08-04 Thread lmk
with this problem for the past couple of weeks. Thanks, lmk -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Bad-Digest-error-while-doing-aws-s3-put-tp10036p11345.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: Bad Digest error while doing aws s3 put

2014-07-28 Thread lmk
-while-doing-aws-s3-put-tp10036p10780.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Running Spark/YARN on AWS EMR - Issues finding file on hdfs?

2014-07-18 Thread _soumya_
between each other so this can't be a port issue. What am I missing? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-Spark-YARN-on-AWS-EMR-Issues-finding-file-on-hdfs-tp10214.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Bad Digest error while doing aws s3 put

2014-07-17 Thread lmk
to md5 checksum mismatch. But will this happen due to load? Regards, lmk -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Bad-Digest-error-while-doing-aws-s3-put-tp10036.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

AWS Credentials for private S3 reads

2014-07-02 Thread Brian Gawalt
dependency to sbt of: org.apache.spark % spark-core_2.10 % 1.0.0 Any tips appreciated! Thanks much, -Brian -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/AWS-Credentials-for-private-S3-reads-tp8687.html Sent from the Apache Spark User List mailing list archive

AWS Credentials for private S3 reads

2014-07-02 Thread Brian Gawalt
of: org.apache.spark % spark-core_2.10 % 1.0.0 Any tips appreciated! Thanks much, -Brian -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/AWS-Credentials-for-private-S3-reads-tp8689.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: AWS Credentials for private S3 reads

2014-07-02 Thread Matei Zaharia
a library dependency to sbt of: org.apache.spark % spark-core_2.10 % 1.0.0 Any tips appreciated! Thanks much, -Brian -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/AWS-Credentials-for-private-S3-reads-tp8687.html Sent from the Apache Spark User List

Debugging Spark AWS S3

2014-05-16 Thread Robert James
I have Spark code which runs beautifully when MASTER=local. When I run it with MASTER set to a spark ec2 cluster, the workers seem to run, but the results, which are supposed to be put to AWS S3, don't appear on S3. I'm at a loss for how to debug this. I don't see any S3 exceptions anywhere

Re: Debugging Spark AWS S3

2014-05-16 Thread Ian Ferreira
Did you check the executor stderr logs? On 5/16/14, 2:37 PM, Robert James srobertja...@gmail.com wrote: I have Spark code which runs beautifully when MASTER=local. When I run it with MASTER set to a spark ec2 cluster, the workers seem to run, but the results, which are supposed to be put to AWS

Re: AWS Spark-ec2 script with different user

2014-04-09 Thread Nicholas Chammas
execute it from the 'ec2' directory of spark, as usual. The AMI used is the raw one from the AWS Quick Start section. It is the first option (an Amazon Linux paravirtual image). Any ideas or confirmation would be GREATLY appreciated. Please and thank you. #!/bin/sh export AWS_ACCESS_KEY_ID

Re: AWS Spark-ec2 script with different user

2014-04-09 Thread Nicholas Chammas
one from the AWS Quick Start section. It is the first option (an Amazon Linux paravirtual image). Any ideas or confirmation would be GREATLY appreciated. Please and thank you. #!/bin/sh export AWS_ACCESS_KEY_ID=MyCensoredKey export AWS_SECRET_ACCESS_KEY=MyCensoredKey AMI_ID=ami-2f726546

Re: AWS Spark-ec2 script with different user

2014-04-09 Thread Marco Costantini
is the script I am running. It is a simple shell script which calls spark-ec2 wrapper script. I execute it from the 'ec2' directory of spark, as usual. The AMI used is the raw one from the AWS Quick Start section. It is the first option (an Amazon Linux paravirtual image). Any ideas

Re: AWS Spark-ec2 script with different user

2014-04-09 Thread Shivaram Venkataraman
' directory of spark, as usual. The AMI used is the raw one from the AWS Quick Start section. It is the first option (an Amazon Linux paravirtual image). Any ideas or confirmation would be GREATLY appreciated. Please and thank you. #!/bin/sh export AWS_ACCESS_KEY_ID=MyCensoredKey export

Re: AWS Spark-ec2 script with different user

2014-04-08 Thread Marco Costantini
calls spark-ec2 wrapper script. I execute it from the 'ec2' directory of spark, as usual. The AMI used is the raw one from the AWS Quick Start section. It is the first option (an Amazon Linux paravirtual image). Any ideas or confirmation would be GREATLY appreciated. Please and thank you

Re: AWS Spark-ec2 script with different user

2014-04-08 Thread Marco Costantini
script. I execute it from the 'ec2' directory of spark, as usual. The AMI used is the raw one from the AWS Quick Start section. It is the first option (an Amazon Linux paravirtual image). Any ideas or confirmation would be GREATLY appreciated. Please and thank you. #!/bin/sh export

AWS Spark-ec2 script with different user

2014-04-07 Thread Marco Costantini
Hi all, On the old Amazon Linux EC2 images, the user 'root' was enabled for ssh. Also, it is the default user for the Spark-EC2 script. Currently, the Amazon Linux images have an 'ec2-user' set up for ssh instead of 'root'. I can see that the Spark-EC2 script allows you to specify which user to

Re: AWS Spark-ec2 script with different user

2014-04-07 Thread Marco Costantini
Hi Shivaram, OK so let's assume the script CANNOT take a different user and that it must be 'root'. The typical workaround is as you said, allow the ssh with the root user. Now, don't laugh, but, this worked last Friday, but today (Monday) it no longer works. :D Why? ... ...It seems that NOW,

Re: AWS Spark-ec2 script with different user

2014-04-07 Thread Shivaram Venkataraman
Hmm -- That is strange. Can you paste the command you are using to launch the instances ? The typical workflow is to use the spark-ec2 wrapper script using the guidelines at http://spark.apache.org/docs/latest/ec2-scripts.html Shivaram On Mon, Apr 7, 2014 at 1:53 PM, Marco Costantini
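For what it's worth, spark-ec2 does take a -u/--user option (default root); a sketch of a launch invocation switching to ec2-user, with placeholder keypair names and the AMI id quoted elsewhere in this thread:

```
./spark-ec2 -k my-keypair -i my-keypair.pem \
  --user=ec2-user --ami=ami-2f726546 \
  launch my-cluster
```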
