from:"Ian Ferreira"

Unsubscribe

2014-10-27 Thread Ian Ferreira

unsubscribe

Is Hadoop MR now comparable with Spark?

2014-06-02 Thread Ian Ferreira

http://hortonworks.com/blog/ddm/#.U4yn3gJgfts.twitter

RE: Announcing Spark 1.0.0

2014-05-30 Thread Ian Ferreira

Congrats Sent from my Windows Phone From: Dean Wampler deanwamp...@gmail.com Sent: 5/30/2014 6:53 AM To: user@spark.apache.org Subject: Re: Announcing Spark 1.0.0 Congratulations!! On Fri, May 30, 2014 at 5:12 AM, Patrick

Re: Debugging Spark AWS S3

2014-05-16 Thread Ian Ferreira

Did you check the executor stderr logs? On 5/16/14, 2:37 PM, Robert James srobertja...@gmail.com wrote: I have Spark code which runs beautifully when MASTER=local. When I run it with MASTER set to a spark ec2 cluster, the workers seem to run, but the results, which are supposed to be put to AWS

Re: Easy one

2014-05-07 Thread Ian Ferreira

in spark-env.sh on the workers as export SPARK_WORKER_MEMORY=4g On Tue, May 6, 2014 at 5:29 PM, Ian Ferreira ianferre...@hotmail.com wrote: Hi there, Why can't I seem to kick the executor memory higher? See below from EC2 deployment using m1.large And in the spark-env.sh export

Easy one

2014-05-06 Thread Ian Ferreira

Hi there, Why can't I seem to kick the executor memory higher? See below from EC2 deployment using m1.large And in the spark-env.sh export SPARK_MEM=6154m And in the spark context sconf.setExecutorEnv("spark.executor.memory", "4g") Cheers - Ian

Re: Can't be built on MAC

2014-05-01 Thread Ian Ferreira

Hi Zhige, I had the same issue and reverted to using JDK 1.7.055 From: Zhige Xin xinzhi...@gmail.com Reply-To: user@spark.apache.org Date: Thursday, May 1, 2014 at 12:32 PM To: user@spark.apache.org Subject: Can't be built on MAC Hi dear all, When I tried to build Spark 0.9.1 on my Mac OS X

Setting the Scala version in the EC2 script?

2014-05-01 Thread Ian Ferreira

Is this possible? It is very annoying to have such a great script but still have to manually update stuff afterwards.

Getting the following error using EC2 deployment

2014-05-01 Thread Ian Ferreira

I have a custom app that was compiled with Scala 2.10.3, which I believe is what the latest spark-ec2 script installs. However, running it on the master yields this cryptic error, which according to the web implies incompatible jar versions. Exception in thread "main" java.lang.NoClassDefFoundError:

Running parallel jobs in the same driver with Futures?

2014-04-28 Thread Ian Ferreira

I recall asking about this, and I think Matei suggested it was, but is the scheduler thread safe? I am running mllib libraries as futures in the same driver using the same dataset as input and I get this error 14/04/28 08:29:48 ERROR TaskSchedulerImpl: Exception in statusUpdate

Failed to run count?

2014-04-23 Thread Ian Ferreira

I am getting this cryptic error running LinearRegressionWithSGD Data sample LabeledPoint(39.0, [144.0, 1521.0, 20736.0, 59319.0, 2985984.0]) 14/04/23 15:15:34 INFO SparkContext: Starting job: first at GeneralizedLinearAlgorithm.scala:121 14/04/23 15:15:34 INFO DAGScheduler: Got job 2 (first at

Adding to an RDD

2014-04-21 Thread Ian Ferreira

Feels like a silly question, but what if I wanted to apply a map to each element in an RDD, but instead of replacing it, I wanted to add new columns of the manipulated value, i.e. res0: Array[String] = Array(1 2, 1 3, 1 4, 2 1, 3 1, 4 1) Becomes res0: Array[String] = Array(1 2 2 4, 1 3 1 6,
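
The question above can be illustrated with a small sketch. The target output in the message is truncated, so the derived columns used here (sum and product) are hypothetical, and the function name `append_derived` is an assumption made for illustration. The idea is that the mapped function returns the original row plus the new values instead of replacing it. A plain Python list stands in for the RDD so the snippet runs without a cluster.

```python
def append_derived(line):
    """Return the input row with two extra columns appended.

    The derived columns (sum and product of the two inputs) are
    hypothetical; substitute whatever calculation is actually needed.
    """
    a, b = (int(tok) for tok in line.split())
    # Keep the original text and append the new values to it.
    return f"{line} {a + b} {a * b}"


rows = ["1 2", "1 3", "1 4", "2 1", "3 1", "4 1"]

# With an RDD of strings the same function would be passed to rdd.map;
# a list comprehension stands in for that call here.
widened = [append_derived(r) for r in rows]
print(widened)
```

Because the function carries the original fields through, no join back to the source data is needed afterwards.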

Combining RDD's columns

2014-04-18 Thread Ian Ferreira

This may seem contrived but, suppose I wanted to create a collection of single-column RDDs that contain calculated values, so I want to cache these to avoid re-calc. i.e. rdd1 = {Names} rdd2 = {Star Sign} rdd3 = {Age} Then I want to create a new virtual RDD that is a collection of these
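
One way to picture the combination described above is an element-wise zip of the columns. The sketch below uses plain Python lists as stand-ins for the three RDDs; the sample values and the helper name `combine_columns` are assumptions made for illustration. Spark's `RDD.zip` pairs two RDDs in a similar way, but it requires both to have the same number of partitions and the same number of elements in each partition, which the length check here only loosely mirrors.

```python
# Hypothetical sample data standing in for the cached single-column RDDs.
names = ["Ann", "Bob", "Cy"]
star_signs = ["Leo", "Aries", "Virgo"]
ages = [31, 45, 27]


def combine_columns(*columns):
    """Zip equally sized columns into a list of row tuples."""
    lengths = {len(c) for c in columns}
    if len(lengths) > 1:
        # Rows can only be aligned when every column has the same length.
        raise ValueError("all columns must have the same length")
    return list(zip(*columns))


table = combine_columns(names, star_signs, ages)
print(table)
```

Each column stays a separate object, so a cached column can be reused in several combined views without being recalculated.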

Re: Scala vs Python performance differences

2014-04-15 Thread Ian Ferreira

This would be super useful. Thanks. On 4/15/14, 1:30 AM, Jeremy Freeman freeman.jer...@gmail.com wrote: Hi Andrew, I'm putting together some benchmarks for PySpark vs Scala. I'm focusing on ML algorithms, as I'm particularly curious about the relative performance of MLlib in Scala vs the Python

Multi-tenant?

2014-04-15 Thread Ian Ferreira

What is the support for multi-tenancy in Spark? I assume more than one driver can share the same cluster, but can a driver run two jobs in parallel?

RE: Multi-tenant?

2014-04-15 Thread Ian Ferreira

://spark.apache.org/docs/latest/job-scheduling.html, which includes scheduling concurrent jobs within the same driver. Matei On Apr 15, 2014, at 4:08 PM, Ian Ferreira ianferre...@hotmail.com wrote: What is the support for multi-tenancy in Spark. I assume more than one driver can share the same cluster
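
As a rough sketch of running two jobs concurrently within the same driver, the snippet below submits two jobs from separate threads. The function `run_job` is a hypothetical stand-in for a Spark action such as a count; it is plain Python so the example runs without a cluster, and the job names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(name, data):
    # Hypothetical stand-in for a Spark action; in a real driver this
    # would trigger a job on the shared context.
    return name, sum(data)


data = list(range(10))

# Each submit call runs on its own worker thread, so the two jobs
# can be in flight at the same time.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(run_job, n, data) for n in ("job-a", "job-b")]
    results = dict(f.result() for f in futures)

print(results)
```

Whether the jobs actually share cluster resources fairly depends on the scheduling mode configured for the application.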

Re: Spark resilience

2014-04-14 Thread Ian Ferreira

resources, but does not affect currently-running jobs. Workers can fail and will simply cause jobs to lose their current Executors. New Workers can be added at any point. On Mon, Apr 14, 2014 at 11:00 AM, Ian Ferreira ianferre...@hotmail.com wrote: Folks, I was wondering what the failure support

Pyspark with Cython

2014-04-14 Thread Ian Ferreira

Has anyone used Cython closures with Spark? We have a large investment in Python code that we don't want to port to Scala. Curious about any performance issues with the interop between the Scala engine and the Cython closures. I believe it is sockets on the driver and pipe on the executors?

Re: Error when run Spark on mesos

2014-04-02 Thread Ian Ferreira

I think this is related to a known issue (regression) in 0.9.0. Try using an explicit IP other than loopback. Sent from a mobile device On Apr 2, 2014, at 8:53 PM, panfei cnwe...@gmail.com wrote: any advice ? 2014-04-03 11:35 GMT+08:00 felix cnwe...@gmail.com: I deployed mesos and test

19 matches
