Hello! I am using Spark 0.7.3 with the Python API. Recently, when I ran some Spark programs on a cluster, I found that processes spawned by spark-0.7.3/python/pyspark/daemon.py were occupying the CPU for long periods and consuming a lot of memory (e.g., 5 GB per process). It seemed that the Java process, which was invoked by java -cp :/usr/lib/spark-0.7.3/conf:/usr/lib/spark-0.7.3/core/target/scala-2.9.3/classes ... , was 'competing' with daemon.py for CPU resources. From my understanding, the Java process should be responsible for the 'real' computation in Spark. So I am wondering: what work does daemon.py do? Is it normal for it to consume so much CPU and memory? Thanks!
Best,
Shangyu Luo

--
Shangyu Luo
Department of Computer Science
Rice University