Re: Java Recipes for Spark

2016-07-29 Thread Renato Perini
Not only very useful, but finally some Java love :-) Thank you. On 29/07/2016 22:30, Jean Georges Perrin wrote: Sorry if this looks like shameless self-promotion, but some of you asked me to say when I'd have my Java recipes for Apache Spark updated. It's done here:

Free memory while launching jobs.

2016-05-03 Thread Renato Perini
I have a machine with 8GB of total memory, on which other applications are installed. The Spark application must run one driver and two jobs at a time. I have configured 8 cores in total. Without Spark, the machine has 4GB of free RAM (the other half is used by other applications).

Re: Spark on AWS

2016-04-28 Thread Renato Perini
month in total. Renato Perini. On 28/04/2016 23:39, Fatma Ozcan wrote: What is your experience using Spark on AWS? Are you setting up your own Spark cluster, and using HDFS? Or are you using Spark as a service from AWS? In the latter case, what is your experience of using S3 directly, without

Apache Spark Docker Images?

2015-11-23 Thread Renato Perini
Hello, is there any planned support for official Docker images? It would be great to have some images using the cluster manager of choice (Standalone, YARN, Mesos) with the latest Apache Spark distribution (ideally on CentOS 7.x) for clusterable containers. Regards, Renato Perini

spark-shell (1.5.1) not starting cleanly on Windows.

2015-10-20 Thread Renato Perini
I'm using Spark on Windows 10 (64-bit) with jdk8u60. When I start spark-shell, I get many warnings and exceptions. The system complains about already registered datanucleus plugins, emits exceptions, and so on: C:\test>spark-shell log4j:WARN No appenders could be found for logger

Spark 1.5.1 ClassNotFoundException in cluster mode.

2015-10-14 Thread Renato Perini
Hello. I have developed a Spark job that uses a Jersey client (version 1.9, included with Spark) to make some service calls during data computations. Data is read from and written to an Apache Cassandra 2.2.1 database. When I run the job in local mode, everything works nicely. But when I execute my job in

Architecture for a Spark batch job.

2015-10-08 Thread Renato Perini
I have started a project using Spark 1.5.1 consisting of several jobs, which I launch manually using shell scripts against a small Spark standalone cluster. Those jobs generally read a Cassandra table (using an RDD of type JavaRDD or using plain DataFrames), compute results on that data and

Re: spark-ec2 config files.

2015-10-05 Thread Renato Perini
is referring to the 1.5 branch. Is this what you are looking for? Jeff On Mon, Oct 5, 2015 at 8:56 AM, Renato Perini <renato.per...@gmail.com> wrote: Can someone provide the relevant config files generated by Spark EC2 script? I'm configu

spark-ec2 config files.

2015-10-04 Thread Renato Perini
Can someone provide the relevant config files generated by the Spark EC2 script? I'm configuring a Spark cluster on EC2 manually, and I would like to compare my config files (spark-defaults.conf, spark-env.sh) with those generated by the spark-ec2 script. Of course, hide your sensitive

Spark Web UI + NGINX

2015-09-17 Thread Renato Perini
Hello! I'm trying to set up a reverse proxy (using nginx) for the Spark Web UI. I have 2 machines: 1) Machine A, with a public IP. This machine will be used to access the Spark Web UI on Machine B through its private IP address. 2) Machine B, where Spark is installed (standalone master