A question about accumulator

2015-11-10 Thread Tan Tim
Hi all, there is a discussion about the accumulator on Stack Overflow: http://stackoverflow.com/questions/27357440/spark-accumalator-value-is-different-when-inside-rdd-and-outside-rdd I commented on this question (from user Tim). Based on the output I got when I tried it, I have two questions: 1. Why the

how to track the jobs status without the webUI

2014-09-18 Thread Tan Tim
Hi all, I can see from the web UI that the job failed, but when I run ps on the client (the machine from which I submitted the job), I find that the process still exists: user_tt 5971 2.6 2.2 15030180 3029840 ? Sl 11:41 4:37 java -cp

Re: Spark run slow after unexpected repartition

2014-09-18 Thread Tan Tim
I also encountered a similar problem: after some stages, all the tasks are assigned to one machine, and stage execution gets slower and slower. *[the spark conf setting]* val conf = new SparkConf().setMaster(sparkMaster).setAppName(ModelTraining

Re: why a machine learning application run slowly on the spark cluster

2014-07-30 Thread Tan Tim
? -Xiangrui On Tue, Jul 29, 2014 at 10:46 PM, Tan Tim unname...@gmail.com wrote: The application is Logistic Regression (OWLQN); we developed a sparse vector version. The feature dimension is 1M+, but it's very sparse. This application can run on another Spark cluster, and every stage is about 50