due to the frame size being too small, you’re fine. Having a bigger
frame size will result in wasted space and unneeded memory allocation for
buffers. It doesn’t make the communication more efficient.
Matei
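For reference, the setting being discussed is presumably spark.akka.frameSize, which is given in MB. A minimal PySpark sketch of setting it, assuming a Spark version that has SparkConf (0.9 or later; in 0.8 the same property would be passed through the JVM options instead); the app name and the value are only illustrative:

from pyspark import SparkConf, SparkContext

# spark.akka.frameSize is specified in MB; the value 10 here is only
# illustrative, not a recommendation from the thread.
conf = (SparkConf()
        .setAppName("frame-size-demo")
        .set("spark.akka.frameSize", "10"))
sc = SparkContext(conf=conf)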
On Dec 8, 2013, at 12:57 PM, Shangyu Luo lsy...@gmail.com wrote:
I would like
walrusthe...@gmail.com
Shangyu,
Thanks for the tip re: the flag! Maybe the broadcast variable is only for
complex data structures?
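Broadcast variables are not limited to complex structures; they work for simple values as well. A minimal PySpark sketch of the usual pattern, with made-up data:

from pyspark import SparkContext

sc = SparkContext("local", "broadcast-demo")

# Broadcast a small lookup table once to every executor instead of
# shipping it inside each task's closure.
lookup = sc.broadcast({"a": 1, "b": 2})

print(sc.parallelize(["a", "b", "a"]).map(lambda k: lookup.value[k]).sum())  # 4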
On Sun, Nov 3, 2013 at 7:58 PM, Shangyu Luo lsy...@gmail.com wrote:
I met the problem of 'Too many open files' before. One solution is
adding 'ulimit -n 10
}.reduce(_ + _)
w -= gradient
}
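The truncated loop above looks like the standard gradient-descent pattern from Spark's logistic regression example: each task computes a partial gradient, reduce sums them, and the driver updates w. A rough PySpark sketch of that pattern, with toy data and a made-up iteration count:

import numpy as np
from pyspark import SparkContext

sc = SparkContext("local", "lr-demo")

# Toy (features, label) pairs; the data and iteration count are made up
# purely for illustration.
points = sc.parallelize([(np.array([1.0, 2.0]), 1.0),
                         (np.array([2.0, 0.5]), -1.0)]).cache()
w = np.zeros(2)

def gradient(p):
    x, y = p
    # Logistic-regression-style partial gradient for a single point.
    return x * (1.0 / (1.0 + np.exp(-y * x.dot(w))) - 1.0) * y

for _ in range(10):
    # Each task computes partial gradients; reduce sums them across the
    # cluster, mirroring the ".reduce(_ + _)" / "w -= gradient" fragment.
    w -= points.map(gradient).reduce(lambda a, b: a + b)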
On Sun, Nov 3, 2013 at 10:47 AM, Shangyu Luo lsy...@gmail.com wrote:
Hi Walrus,
Thank you for sharing your solution to your problem. I think I have met a
similar problem before (i.e., one machine is working while the others are
idle) and I
The error is from a worker node -- did you check that /data2 is set up
properly on the worker nodes too? In general that should be the only
directory used.
Matei
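Presumably /data2 here is the value of spark.local.dir; that is an assumption, since the original setting is not quoted in the thread. A minimal sketch of pointing Spark at that directory, assuming a PySpark version with SparkConf:

from pyspark import SparkConf, SparkContext

# spark.local.dir is where shuffle and spill files are written; the
# directory must exist and be writable on every worker node as well.
conf = SparkConf().set("spark.local.dir", "/data2")
sc = SparkContext(conf=conf)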
On Oct 28, 2013, at 6:52 PM, Shangyu Luo lsy...@gmail.com wrote:
Hello,
I have some questions about the files that Spark will create
Hello,
I have some questions about the files that Spark creates and uses while it
is running.
(1) I am running a Python program on a Spark cluster on EC2. The data
comes from HDFS. I have encountered the following error in the console
of the master node:
java.io.FileNotFoundException:
set up SCALA_HOME and SPARK_HOME in .bashrc and they worked well
for the Spark 0.8.0 source version (I downloaded and compiled the source
version before, but I have deleted it now).
So what's going wrong here? Any advice will be appreciated.
Thanks!
--
--
Shangyu, Luo
Department of Computer Science
Rice
OK. I think I have solved it.
I do not need to build the 0.8.0 CDH4 version because it has been prebuilt.
The pi example can run now.
2013/10/10 Shangyu Luo lsy...@gmail.com
Hello,
I downloaded the Spark 0.8.0 CDH4 version and built it
using SPARK_HADOOP_VERSION=2.0.0-cdh4.4.0
is the actual PySpark worker process, and
is launched by the Spark worker when running Python jobs. So, when
using PySpark, the real computation is handled by a Python process
(via daemon.py), not a Java process.
Hope that helps,
-Jey
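A quick way to see this is to map each element to the PID of the process that handled it; the resulting PIDs belong to the Python workers forked by daemon.py, not to the JVM. A small illustrative sketch:

import os

from pyspark import SparkContext

sc = SparkContext("local[2]", "worker-pid-demo")

# Map each element to the PID of the Python process that handled it; the
# PIDs printed are those of the Python worker processes, not the JVM.
print(sc.parallelize(range(4), 2).map(lambda _: os.getpid()).distinct().collect())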
On Mon, Oct 7, 2013 at 9:50 PM, Shangyu Luo lsy...@gmail.com wrote:
Also, I found that 'daemon.py' will continue running on one worker node
even after I terminate the Spark job at the master node. That seems a
little strange to me.
2013/10/8 Shangyu Luo lsy...@gmail.com
Hello Jey,
Thank you for answering. I have found that there are about 6 or 7
'daemon.py' processes
job the daemon.py will work on? Is it normal for it
to consume a lot of CPU and memory?
Thanks!
Best,
Shangyu Luo
--
--
Shangyu, Luo
Department of Computer Science
Rice University
is the default
unless you give it another value. You can view the exact number of tasks on
the job monitoring UI in Spark 0.8 (
http://spark.incubator.apache.org/docs/latest/monitoring.html).
Matei
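A sketch of overriding the default by passing numSlices explicitly; note that getNumPartitions() comes from a later PySpark release than 0.8, where the monitoring UI mentioned above is the way to see the task count:

from pyspark import SparkContext

sc = SparkContext("local[4]", "partitions-demo")

# Passing numSlices explicitly instead of relying on the default.
rdd = sc.parallelize(range(10000), 16)

# One task is run per partition for a simple map over this RDD;
# getNumPartitions() is a later addition to PySpark.
print(rdd.getNumPartitions())  # 16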
Any help will be appreciated.
Thanks!
--
--
Shangyu, Luo
Department of Computer Science
and the time he shows up? -Randy Pausch
--
--
Shangyu, Luo
Department of Computer Science
Rice University
--
Not Just Think About It, But Do It!
--
Success is never final.
--
Losers always whine about their best
, took 0.172441 s
[625, 625, 625, 625, 625, 625, 625, 625, 625, 625, 625, 625, 625, 625,
625, 625]
--
Reynold Xin, AMPLab, UC Berkeley
http://rxin.org
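The sixteen 625s quoted above look like per-partition element counts for 10000 elements split into 16 slices. The exact code that produced them is not shown in the thread, but a sketch along these lines would give the same output:

from pyspark import SparkContext

sc = SparkContext("local[4]", "slice-demo")

# 10000 elements split into 16 slices -> 625 elements per partition,
# matching the list of sixteen 625s above.
print(sc.parallelize(range(10000), 16).glom().map(len).collect())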
On Thu, Sep 26, 2013 at 10:08 PM, Shangyu Luo lsy...@gmail.com wrote:
I can see the test for ParallelCollectionRDD.slice().
But how
be counted as one task? For example, with
sc.parallelize([0,1,2,3]).map(lambda x: x), will there be four tasks?
Any help will be appreciated.
Thanks!
--
--
Shangyu, Luo
Department of Computer Science
Rice University
wrong with my code?
Thanks!
--
--
Shangyu, Luo
Department of Computer Science
Rice University