Experience with centralised logging for Spark?

2015-07-03 Thread Edward Sargisson
Hi all, I'm wondering if anybody has any experience with centralised logging for Spark - or even has felt that there was a need for this given the WebUI. At my organization we use Log4j2 and Flume as the front end of our centralised logging system. I was looking into modifying Spark to use that

Application on standalone cluster never changes state to be stopped

2015-05-22 Thread Edward Sargisson
Hi, Environment: Spark standalone cluster running with a master and a worker on a small Vagrant VM. The Jetty Webapp on the same node calls the spark-submit script to start the job. From the contents of the stdout I can see that it's running successfully. However, the spark-submit process never

Fwd: Re: spark 1.3.1 jars in repo1.maven.org

2015-05-20 Thread Edward Sargisson
libraries are available in the classloader from Spark and don't clash with existing libraries we have. More anon, Cheers, Edward Original Message Subject: Re: spark 1.3.1 jars in repo1.maven.org Date: 2015-05-20 00:38 From: Sean Owen so...@cloudera.com To: Edward Sargisson esa

spark 1.3.1 jars in repo1.maven.org

2015-05-19 Thread Edward Sargisson
Hi, I'd like to confirm an observation I've just made. Specifically, that Spark is only available in repo1.maven.org for one Hadoop variant. The Spark source can be compiled against a number of different Hadoops using profiles. Yay. However, the Spark jars in repo1.maven.org appear to be compiled

How do you use the thrift-server to get data from a Spark program?

2014-10-26 Thread Edward Sargisson
Hi all, This feels like a dumb question but bespeaks my lack of understanding: what is the Spark thrift-server for? Especially if there's an existing Hive installation. Background: We want to use Spark to do some processing starting from files (probably in MapRFS). We want to be able to read the