Word Count on Mesos Cluster

2014-03-05 Thread juanpedromoreno
Hi there, I tried the SimpleApp WordCount example and it works perfectly in a local environment. My code: object SimpleApp { def main(args: Array[String]) { val logFile = "README.md" val conf = new SparkConf() .setMaster("zk://172.31.0.11:2181/mesos") .setAppName("Simple App")

Problem with HBase external table on freshly created EMR cluster

2014-03-05 Thread Philip Limbeck
Hi! I created an EMR cluster with Spark and HBase according to http://aws.amazon.com/articles/4926593393724923 with --hbase flag to include HBase. Although spark and shark both work nicely with the provided S3 examples, there is a problem with external tables pointing to the HBase instance. We

Re: Explain About Logs NetworkWordcount.scala

2014-03-05 Thread eduardocalfaia
Hi TD, I have seen in the web UI the stage number that result has been zero and in the field GC Times there is nothing. http://apache-spark-user-list.1001560.n3.nabble.com/file/n2306/CaptureStage.png

Re: Unable to redirect Spark logs to slf4j

2014-03-05 Thread Sergey Parhomenko
Hi Sean, We're not using log4j actually, we're trying to redirect all logging to slf4j which then uses logback as the logging implementation. The fix you mentioned - am I right to assume it is not part of the latest released Spark version (0.9.0)? If so, are there any workarounds or advice on

Re: Spark Worker crashing and Master not seeing recovered worker

2014-03-05 Thread Ognen Duzlevski
Rob, I have seen this too. I have 16 nodes in my spark cluster and for some reason (after app failures) one of the workers will go offline. I will ssh to the machine in question and find that the java process is running but for some reason the master is not noticing this. I have not had the

Re: pyspark and Python virtual enviroments

2014-03-05 Thread Bryn Keller
Hi Christian, The PYSPARK_PYTHON environment variable specifies the python executable to use for pyspark. You can put the path to a virtualenv's python executable and it will work fine. Remember you have to have the same installation at the same path on each of your cluster nodes for pyspark to
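A minimal sketch of the setting described above, assuming a hypothetical virtualenv at /opt/venvs/myproject (the path and the helper name pyspark_env are illustrative and not from the thread). It only builds an environment in which PYSPARK_PYTHON points at the virtualenv's interpreter; the same path would have to exist on every cluster node.

```python
import os

# Hypothetical virtualenv interpreter; assumed to exist at this same path on every node.
VENV_PYTHON = "/opt/venvs/myproject/bin/python"


def pyspark_env(venv_python, base_env=None):
    """Return a copy of the environment with PYSPARK_PYTHON set to venv_python."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = venv_python
    return env


# Build the environment from a small example base instead of the real os.environ.
env = pyspark_env(VENV_PYTHON, base_env={"PATH": "/usr/bin"})
print(env["PYSPARK_PYTHON"])
```

The resulting dictionary could be passed as the env argument when launching the pyspark script from Python, or the variable could simply be exported in the shell before starting pyspark.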

Re: disconnected from cluster; reconnecting gives java.net.BindException

2014-03-05 Thread Nicholas Chammas
Whoopdeedoo, after just waiting for like an hour (well, I was doing other stuff) the process holding that address seems to have died automatically and now I can start up pyspark without any warnings. Would there be a faster way to go through this than just wait around for the orphaned process to

Problem with HBase external table on freshly created EMR cluster

2014-03-05 Thread phil3k
Hi! I created an EMR cluster with Spark and HBase according to http://aws.amazon.com/articles/4926593393724923 with --hbase flag to include HBase. Although spark and shark both work nicely with the provided S3 examples, there is a problem with external tables pointing to the HBase instance. We

Re: pyspark and Python virtual enviroments

2014-03-05 Thread Christian
Thanks Bryn. On Wed, Mar 5, 2014 at 9:00 PM, Bryn Keller xol...@xoltar.org wrote: Hi Christian, The PYSPARK_PYTHON environment variable specifies the python executable to use for pyspark. You can put the path to a virtualenv's python executable and it will work fine. Remember you have to

Re: Unable to redirect Spark logs to slf4j

2014-03-05 Thread Sergey Parhomenko
Hi Patrick, Thanks for the patch. I tried building a patched version of spark-core_2.10-0.9.0-incubating.jar but the Maven build fails: *[ERROR] /home/das/Work/thx/incubator-spark/core/src/main/scala/org/apache/spark/Logging.scala:22: object impl is not a member of package org.slf4j* *[ERROR]

Re: PIG to SPARK

2014-03-05 Thread Mayur Rustagi
The real question is why you want to run a Pig script using Spark. Are you planning to use Spark as the underlying processing engine for Pig? That's not simple. If you are planning to feed Pig data to Spark for further processing, then you can write it to HDFS and trigger your Spark script. rdd.pipe is

Re: trying to understand job cancellation

2014-03-05 Thread Koert Kuipers
i also noticed that jobs (with a new JobGroupId) which i run after this and which use the same RDDs get very confused. i see lots of cancelled stages and retries that go on forever. On Tue, Mar 4, 2014 at 5:02 PM, Koert Kuipers ko...@tresata.com wrote: i have a running job that i cancel while

Re: trying to understand job cancellation

2014-03-05 Thread Mayur Rustagi
How do you cancel the job? Which API do you use? Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur_rustagi https://twitter.com/mayur_rustagi On Wed, Mar 5, 2014 at 2:29 PM, Koert Kuipers ko...@tresata.com wrote: i also noticed that jobs (with a new JobGroupId) which

Running spark 0.9 on mesos 0.15

2014-03-05 Thread elyast
Hi, Quick question: do I need to compile Spark against exactly the same version of the Mesos library? Currently Spark depends on 0.13. The problem I am facing is the following: I am running the MLlib example with SVM and it works nicely when I use coarse-grained mode, however when running fine-grained mode on

Re: trying to understand job cancellation

2014-03-05 Thread Mayur Rustagi
One issue is that job cancellation is posted on the event loop. So it's possible that subsequent jobs submitted to the job queue may beat the job cancellation event, hence the job cancellation event may end up closing them too. So there's definitely a race condition you are risking even if not running into.
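The ordering problem described above can be illustrated with a toy event loop. This is only a sketch under stated assumptions: the event names, the run_event_loop helper and the "cancel everything currently running" semantics are hypothetical and do not reproduce Spark's actual scheduler. It shows how a cancellation event that is handled after a later submission can take that later job down as well.

```python
from collections import deque


def run_event_loop(events):
    """Process events in arrival order and return the final state of each job."""
    queue = deque(events)
    jobs = {}  # job name -> "running" or "cancelled"
    while queue:
        kind, name = queue.popleft()
        if kind == "submit":
            jobs[name] = "running"
        elif kind == "cancel_running":
            # Toy semantics: cancel every job that is running when the event is handled.
            for job in jobs:
                if jobs[job] == "running":
                    jobs[job] = "cancelled"
    return jobs


# Intended order: the cancellation is handled before the next job is submitted.
intended = run_event_loop([
    ("submit", "job-1"),
    ("cancel_running", None),
    ("submit", "job-2"),
])

# Raced order: job-2 reaches the queue before the cancellation event is handled.
raced = run_event_loop([
    ("submit", "job-1"),
    ("submit", "job-2"),
    ("cancel_running", None),
])

print(intended)
print(raced)
```

In the intended ordering only job-1 is cancelled; in the raced ordering job-2 is cancelled too, which is the kind of outcome the message warns about.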

Re: trying to understand job cancellation

2014-03-05 Thread Koert Kuipers
got it. seems like i better stay away from this feature for now.. On Wed, Mar 5, 2014 at 5:55 PM, Mayur Rustagi mayur.rust...@gmail.com wrote: One issue is that job cancellation is posted on eventloop. So its possible that subsequent jobs submitted to job queue may beat the job cancellation

Re: Unable to redirect Spark logs to slf4j

2014-03-05 Thread Patrick Wendell
Hey, Maybe I don't understand the slf4j model completely, but I think you need to add a concrete implementation of a logger. So in your case you'd use the logback-classic binding in place of the log4j binding at compile time: http://mvnrepository.com/artifact/ch.qos.logback/logback-classic/1.1.1 -