Spark metrics source

2015-04-20 Thread Udit Mehta
Hi, I am running Spark 1.3 on YARN and am trying to publish some metrics from my app. I see that we need to use the Codahale library to create a source and then specify the source in metrics.properties. Does somebody have a sample metrics source which I can use in my app to forward the
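A minimal sketch of such a source, assuming the Dropwizard/Codahale Metrics API and Spark 1.3; MyAppSource and the recordsProcessed counter are hypothetical names. Note that org.apache.spark.metrics.source.Source is package-private in Spark 1.x, so a custom source typically has to be declared under an org.apache.spark package:

package org.apache.spark.metrics.source

import java.util.concurrent.atomic.AtomicLong
import com.codahale.metrics.{Gauge, MetricRegistry}

// Hypothetical app-level counter, updated from application code.
object MyAppMetrics {
  val recordsProcessed = new AtomicLong(0L)
}

class MyAppSource extends Source {
  override val sourceName: String = "myapp"
  override val metricRegistry: MetricRegistry = new MetricRegistry()

  // Expose the counter as a gauge named "myapp.recordsProcessed".
  metricRegistry.register(
    MetricRegistry.name("recordsProcessed"),
    new Gauge[Long] {
      override def getValue: Long = MyAppMetrics.recordsProcessed.get()
    })
}

Registration details vary by version; in the 1.x line the metrics system itself is also package-private, which is another reason the class above lives under org.apache.spark. Sinks are then wired up in metrics.properties as usual.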

Re: Configuring logging properties for executor

2015-04-20 Thread Michael Ryabtsev
Oh, you are right, thanks. On Mon, Apr 20, 2015 at 6:31 PM, Lan Jiang ljia...@gmail.com wrote: Each application gets its own executor processes, so there should be no problem running them in parallel. Lan On Apr 20, 2015, at 10:25 AM, Michael Ryabtsev michael...@gmail.com wrote: Hi

sparksql - HiveConf not found during task deserialization

2015-04-20 Thread Manku Timma
I am using spark-1.3 built with the hadoop-provided, hive-provided, and hive-0.13.1 profiles. I am running a simple Spark job on a YARN cluster by adding all hadoop2 and hive13 jars to the Spark classpaths. If I remove hive-provided while building Spark, I don't face any issue. But with hive-provided

How to run spark programs in eclipse like mapreduce

2015-04-20 Thread sandeep vura
Hi Sparkers, I have written code in Python in Eclipse; now that code should execute on a Spark cluster, like MapReduce jobs run on a Hadoop cluster. Can anyone please help me with instructions. Regards, Sandeep.v

Re: How to run spark programs in eclipse like mapreduce

2015-04-20 Thread ๏̯͡๏
I just do Run As Application/Debug As Application on the main program. On Mon, Apr 20, 2015 at 12:14 PM, sandeep vura sandeepv...@gmail.com wrote: Hi Sparkers, I have written code in Python in Eclipse; now that code should execute on a Spark cluster like MapReduce jobs in a Hadoop cluster. Can

Re: How to run spark programs in eclipse like mapreduce

2015-04-20 Thread Akhil Das
Why not build the project and submit the built jar with spark-submit? If you want to run it within Eclipse, then all you have to do is create a SparkContext pointing to your cluster, do a sc.addJar(/path/to/your/project/jar), and then you can hit the run button to run the job (note that network
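A sketch of that approach in Scala; the master URL and jar path are placeholders, and since the question itself concerns a Python program, the same idea would apply through the PySpark API:

import org.apache.spark.{SparkConf, SparkContext}

// Point the SparkContext at the standalone cluster and ship the project jar.
val conf = new SparkConf()
  .setAppName("run-from-eclipse")
  .setMaster("spark://master-host:7077")   // placeholder master URL
val sc = new SparkContext(conf)
sc.addJar("/path/to/your/project.jar")     // placeholder path

// ... job code follows; it can now be launched with Eclipse's Run button.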

Re: sparksql - HiveConf not found during task deserialization

2015-04-20 Thread Akhil Das
Looks like a missing jar; try to print the classpath and make sure the Hive jar is present. Thanks Best Regards On Mon, Apr 20, 2015 at 11:52 AM, Manku Timma manku.tim...@gmail.com wrote: I am using spark-1.3 with hadoop-provided and hive-provided and hive-0.13.1 profiles. I am running a
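A quick way to check this, as a sketch: print the JVM classpath on the driver and on an executor and look for the hive jars.

// Driver-side classpath.
println(System.getProperty("java.class.path"))

// Executor-side classpath: run a one-partition job and collect the property.
val executorClasspath = sc.parallelize(Seq(1), 1)
  .map(_ => System.getProperty("java.class.path"))
  .collect()
  .head
println(executorClasspath)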

Re: SparkStreaming onStart not being invoked on CustomReceiver attached to master with multiple workers

2015-04-20 Thread Akhil Das
Would be good if you can paste your custom receiver code and the code that you used to invoke it. Thanks Best Regards On Mon, Apr 20, 2015 at 9:43 AM, Ankit Patel patel7...@hotmail.com wrote: I am experiencing a problem with Spark Streaming (Spark 1.2.0): the onStart method is never called on
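For reference, a minimal custom receiver sketch against the Spark 1.2 receiver API (DummyReceiver is a hypothetical name); a common cause of onStart never firing is that no executor cores are free to host the receiver:

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

class DummyReceiver extends Receiver[String](StorageLevel.MEMORY_ONLY) {
  override def onStart(): Unit = {
    // onStart must not block: start a background thread and return.
    new Thread("dummy-receiver") {
      override def run(): Unit = {
        while (!isStopped()) {
          store("tick")      // push one record into Spark
          Thread.sleep(1000)
        }
      }
    }.start()
  }

  override def onStop(): Unit = {
    // The isStopped() flag above terminates the worker thread.
  }
}

// Usage: val stream = ssc.receiverStream(new DummyReceiver)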

Running spark over HDFS

2015-04-20 Thread madhvi
Hi All, I am new to Spark and have installed a Spark cluster on my system, which also has a Hadoop cluster. I want to process data stored in HDFS through Spark. When I am running code in Eclipse it is giving the following warning repeatedly: scheduler.TaskSchedulerImpl: Initial job has not accepted any

Re: MLlib -Collaborative Filtering

2015-04-20 Thread Nick Pentreath
You will have to get the two user factor vectors from the ALS model and compute the cosine similarity between them. You can do this using Breeze vectors: import breeze.linalg._ val user1 = new DenseVector[Double](userFactors.lookup(user1).head) val user2 = new
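A completed sketch of that computation, assuming model is an MLlib MatrixFactorizationModel and user1Id/user2Id are hypothetical user IDs:

import breeze.linalg._

val userFactors = model.userFeatures   // RDD[(Int, Array[Double])]
val user1 = new DenseVector[Double](userFactors.lookup(user1Id).head)
val user2 = new DenseVector[Double](userFactors.lookup(user2Id).head)

// Cosine similarity: dot product over the product of the L2 norms.
val cosineSimilarity = (user1 dot user2) / (norm(user1) * norm(user2))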

Re: sparksql - HiveConf not found during task deserialization

2015-04-20 Thread Manku Timma
Akhil, but the first case, creating HiveConf on the executor, works fine (the map case). Only the second case fails. I was suspecting some foul play with classloaders. On 20 April 2015 at 12:20, Akhil Das ak...@sigmoidanalytics.com wrote: Looks like a missing jar, try to print the classpath and

Re: shuffle.FetchFailedException in spark on YARN job

2015-04-20 Thread Akhil Das
Which version of Spark are you using? Did you try using spark.shuffle.blockTransferService=nio? Thanks Best Regards On Sat, Apr 18, 2015 at 11:14 PM, roy rp...@njit.edu wrote: Hi, My Spark job is failing with the following error message: org.apache.spark.shuffle.FetchFailedException:
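A sketch of setting that option in code (it can equally go in spark-defaults.conf); the app name is a placeholder, and note this option only exists in the Spark 1.x line, where the default transfer service is netty:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("shuffle-nio-test")                     // placeholder
  .set("spark.shuffle.blockTransferService", "nio")   // default is "netty"
val sc = new SparkContext(conf)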

NEWBIE/not able to connect to postgresql using jdbc

2015-04-20 Thread shashanksoni
I am using a Spark 1.3 standalone cluster on my local Windows machine and trying to load data from one of our servers. Below is my code - import os os.environ['SPARK_CLASSPATH'] = C:\Users\ACERNEW3\Desktop\Spark\spark-1.3.0-bin-hadoop2.4\postgresql-9.2-1002.jdbc3.jar from pyspark import SparkContext,
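For comparison, a sketch of the Spark 1.3 JDBC data source in Scala (the thread itself uses PySpark); the connection URL and table name are placeholders, and the PostgreSQL driver jar must be on both the driver and executor classpaths:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// Spark 1.3 API; superseded by sqlContext.read.format("jdbc") in 1.4+.
val df = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:postgresql://dbhost:5432/mydb?user=me&password=secret",  // placeholder
  "dbtable" -> "public.mytable"))                                         // placeholder

df.show()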

Re: Running spark over HDFS

2015-04-20 Thread Akhil Das
In Eclipse, when you create your SparkContext, set the master URI as shown in the web UI's top-left corner, like spark://someIPorHost:7077, and it should be fine. Thanks Best Regards On Mon, Apr 20, 2015 at 12:22 PM, madhvi madhvi.gu...@orkash.com wrote: Hi All, I am new to spark and
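A sketch of that setup, with placeholder host names; the HDFS URI assumes the default namenode port:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("hdfs-from-eclipse")         // placeholder
  .setMaster("spark://master-host:7077")   // URL from the master web UI
val sc = new SparkContext(conf)

// Read from HDFS; host, port, and path are placeholders.
val lines = sc.textFile("hdfs://namenode-host:8020/path/to/input")
println(lines.count())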
