[POWERED BY] Please add our organization

2015-09-23 Thread barmaley
Name: Frontline Systems Inc. URL: www.solver.com Description: We built an interface between Microsoft Excel and Apache Spark, bringing Big Data from the clusters to Excel and enabling tools ranging from simple charts and Power View dashboards to add-ins for machine learning and predictive

Akka failures: Driver Disassociated

2015-06-25 Thread barmaley
I'm running Spark 1.3.1 on AWS... I have a long-running application (Spark context) that accepts and completes jobs fine. However, it crashes at seemingly random times (anywhere from 1 hour up to 6 days)... In the latter case, the context ran and finished hundreds of jobs without an issue and then

takeSample() results in two stages

2015-06-11 Thread barmaley
I've observed an interesting behavior in Spark 1.3.1, the reason for which is not clear. Doing something as simple as sc.textFile(...).takeSample(...) always results in two stages (screenshot: http://apache-spark-user-list.1001560.n3.nabble.com/file/n23280/Capture.jpg)

Can't access Ganglia on EC2 Spark cluster

2015-06-10 Thread barmaley
Launching with the spark-ec2 script results in:
Setting up ganglia
RSYNC'ing /etc/ganglia to slaves...
...
Shutting down GANGLIA gmond: [FAILED]
Starting GANGLIA gmond: [ OK ]
Shutting down GANGLIA gmond:

Re: Adding new Spark workers on AWS EC2 - access error

2015-06-04 Thread barmaley
The issue was that the SSH key generated on the Spark master was not transferred to the new slave. The spark-ec2 script's `start` command omits this step. The solution is to use the `launch` command with the `--resume` option. The SSH key is then transferred to the new slave and everything goes smoothly.
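For anyone scripting this, here is a minimal sketch of how the `launch --resume` invocation could be assembled. The cluster name, key pair name, identity file path, and the helper function `resume_launch_command` are all hypothetical placeholders, not values from the original thread; the `-k` and `-i` options are the usual spark-ec2 key pair and identity file flags.

```python
import shlex


def resume_launch_command(cluster_name, key_pair, identity_file,
                          script="./spark-ec2"):
    """Build the argv for re-running `launch` with `--resume`.

    All argument values are placeholders supplied by the caller; this
    helper only assembles the command and does not contact AWS.
    """
    return [
        script,
        "-k", key_pair,        # name of the EC2 key pair
        "-i", identity_file,   # path to the matching private key file
        "launch",
        "--resume",            # re-run setup on the already-launched cluster
        cluster_name,
    ]


# Hypothetical values, for illustration only.
cmd = resume_launch_command("my-cluster", "my-keypair",
                            "/path/to/my-keypair.pem")

# Print a shell-safe version of the command for review before running it.
print(" ".join(shlex.quote(part) for part in cmd))
```

The sketch only prints the command so it can be checked first. If the cluster lives outside the default region, the original launch options (for example `--region`) would presumably need to be repeated as well.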

Re: Required settings for permanent HDFS Spark on EC2

2015-06-04 Thread barmaley
Hi - I'm having a similar problem switching from ephemeral to persistent HDFS: it always looks for port 9000, regardless of the options I set for the persistent HDFS on port 9010. Have you figured out a solution? Thanks

Adding new Spark workers on AWS EC2 - access error

2015-06-03 Thread barmaley
I have an existing, operating Spark cluster that was launched with the spark-ec2 script. I'm trying to add a new slave by following these instructions: (1) stop the cluster; (2) on the AWS console, use "Launch more like this" on one of the slaves; (3) start the cluster. Although the new instance is added to the same security

Spark SQL: STDDEV working in Spark Shell but not in a standalone app

2015-05-08 Thread barmaley
Given a table registered from a DataFrame, I'm able to execute queries like sqlContext.sql("SELECT STDDEV(col1) FROM table") from the Spark shell just fine. However, when I run exactly the same code in a standalone app on a cluster, it throws an exception: java.util.NoSuchElementException: key not found: