Re: Compile SimpleApp.scala encountered error, please can any one help?

2014-04-12 Thread prabeesh k
Ensure there is only one SimpleApp object in your project, and check whether there is any other copy of SimpleApp.scala. Normally the file SimpleApp.scala is in src/main/scala or in the project root folder. On Sat, Apr 12, 2014 at 11:07 AM, jni2000 james...@federatedwireless.com wrote: Hi I am a new Spark user

cannot exec. job: TaskSchedulerImpl: Initial job has not accepted any resources

2014-04-12 Thread Gerd Koenig
Hi, I'm starting to use Spark and have installed it within CDH5 using ClouderaManager. I set up one master (hadoop-pg-5) and 3 workers (hadoop-pg-7[-8,-9]). The master WebUI looks good, and all workers seem to be registered. If I open spark-shell and try to execute the wordcount example, the execution

cannot exec. job: TaskSchedulerImpl: Initial job has not accepted any resources

2014-04-12 Thread ge ko
Hi, I'm starting to use Spark and have installed it within CDH5 using ClouderaManager. I set up one master (hadoop-pg-5) and 3 workers (hadoop-pg-7[-8,-9]). The master WebUI looks good, and all workers seem to be registered. If I open spark-shell and try to execute the wordcount example, the execution

Re: Huge matrix

2014-04-12 Thread Xiaoli Li
Hi Reza, Thank you for your information. I will try it. On Fri, Apr 11, 2014 at 11:21 PM, Reza Zadeh r...@databricks.com wrote: Hi Xiaoli, There is a PR currently in progress to allow this, via the sampling scheme described in this paper: stanford.edu/~rezab/papers/dimsum.pdf The PR is

Re: Huge matrix

2014-04-12 Thread Guillaume Pitel
Hi, I'm doing this here for multiple tens of millions of elements (and the goal is to reach multiple billions), on a relatively small cluster (7 nodes, 4 cores, 32GB RAM). We use multiprobe KLSH. All you have to do is run k-means on your data, then compute the

Re: Compile SimpleApp.scala encountered error, please can any one help?

2014-04-12 Thread jni2000
Thanks, Prabeesh. I figured it out. The Java file did conflict with the Scala file. Thanks for the hint. James -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Compile-SimpleApp-scala-encountered-error-please-can-any-one-help-tp4160p4168.html Sent from the

Re: Compile SimpleApp.scala encountered error, please can any one help?

2014-04-12 Thread jni2000
Prabeesh, thanks for the reply. By one copy of SimpleApp.scala, do you mean one copy of this .scala file? I only have one, in a newly created test project. I do have one copy of SimpleApp.java, but in a different directory (src/main/java); the .scala file is in the src/main/scala directory. Will java and

Re: Changing number of workers for benchmarking purposes

2014-04-12 Thread Kalpit Shah
In Spark release 0.7.1, I added support for running multiple worker processes on a single slave machine. I built it for performance-testing multiple workers on a single machine in standalone mode. Set the following in conf/spark-env.sh and bounce your cluster: export SPARK_WORKER_INSTANCES=3

Re: Huge matrix

2014-04-12 Thread Xiaoli Li
Hi Guillaume, This sounds like a good idea to me. I am a newbie here. Could you further explain how you will determine which clusters to keep? According to the distance between each element and each cluster center? Will you keep several clusters for each element when searching for nearest neighbours?

Re: Huge matrix

2014-04-12 Thread Tom V
The last writer is suggesting using the triangle inequality to cut down the search space. If c is the centroid of cluster C, then the closest any point in C can be to x is ||x-c|| - r(C), where r(C) is the (precomputed) radius of the cluster, that is, the distance from c to the farthest point in C. Whether
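
The triangle-inequality bound described in this message can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the poster's implementation and not Spark code: the function names (dist, build_clusters, nearest_neighbor), the dictionary layout, and the use of Euclidean distance are all hypothetical choices made for the example. A cluster is skipped whenever its lower bound ||x-c|| - r(C) is already no better than the best distance found so far.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length point tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def build_clusters(points, assignments, centroids):
    """Group points by cluster index and precompute each radius r(C)."""
    clusters = []
    for k, c in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == k]
        # r(C): distance from the centroid to its farthest member
        radius = max((dist(p, c) for p in members), default=0.0)
        clusters.append({"centroid": c, "radius": radius, "members": members})
    return clusters


def nearest_neighbor(x, clusters):
    """Exact nearest neighbour of x, pruning clusters via the lower bound.

    Returns (point, distance, number_of_clusters_scanned).
    """
    # Lower bound on the distance from x to any point in each cluster:
    # max(0, ||x - c|| - r(C)). Visit clusters in increasing bound order.
    bounds = sorted(
        (max(0.0, dist(x, cl["centroid"]) - cl["radius"]), i)
        for i, cl in enumerate(clusters)
    )
    best, best_d, scanned = None, math.inf, 0
    for lower, i in bounds:
        if lower >= best_d:
            break  # bounds are sorted, so no later cluster can do better
        scanned += 1
        for p in clusters[i]["members"]:
            d = dist(x, p)
            if d < best_d:
                best, best_d = p, d
    return best, best_d, scanned
```

With two well-separated clusters and a query point close to one of them, the far cluster's bound exceeds the best distance found in the near cluster, so its members are never compared against the query.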

Re: Master registers itself at startup?

2014-04-12 Thread Mark Baker
On Sat, Apr 12, 2014 at 9:19 AM, ge ko koenig@gmail.com wrote: Hi, I'm wondering why the master is registering itself at startup, exactly 3 times (same number as the number of workers). Log excerpt: 2014-04-11 21:08:15,363 INFO akka.event.slf4j.Slf4jLogger: Slf4jLogger started

Re: Executing spark jobs with predefined Hadoop user

2014-04-12 Thread Asaf Lahav
Thank you all very much for your responses. We are going to test these recommendations. Adnan, with regard to the HDFS URI, this is actually the manner in which we are accessing the file system already. It was simply removed from the post. Thank you, Asaf On Thu, Apr 10, 2014 at 5:33 PM,