Re: Driver hung and ran out of memory while writing to console progress bar

2017-02-09 Thread John Fang
The Spark version is 2.1.0. -- From: 方孝健(玄弟) Sent: Friday, February 10, 2017, 12:35 To: spark-dev; spark-user Subject: Driver hung and ran out of memory while writing to

Driver hung and ran out of memory while writing to console progress bar

2017-02-09 Thread John Fang
[Stage 172:==> (10328 + 93) / 16144] [Stage 172:==> (10329 + 93) / 16144] [Stage 172:==> (10330 + 93) / 16144] [Stage 172:==>

Spark main thread quits, but the driver doesn't crash on a standalone cluster

2017-01-17 Thread John Fang
My Spark main thread creates some daemon threads, which may be timer threads. Then the Spark application throws some exceptions, and the main thread quits. But the driver JVM doesn't crash on a standalone cluster. Of course, the problem doesn't happen on a YARN cluster. Because the application

Spark main thread quits, but the driver JVM doesn't crash

2017-01-17 Thread John Fang
My Spark main thread creates some daemon threads. Then the Spark application throws some exceptions, and the main thread quits. But the driver JVM doesn't crash, so what can I do? For example: val sparkConf = new SparkConf().setAppName("NetworkWordCount")
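A minimal sketch of one possible workaround, assuming the driver JVM is being kept alive by some non-daemon thread after the main thread dies: catch the failure in `main`, stop any threads the application owns, and call `sys.exit` explicitly. The object name `ExitOnFailure`, the helper `exitCodeFor`, and the timer are hypothetical, and the simulated failure stands in for the real Spark job; this is not taken from the original thread.

```scala
import java.util.{Timer, TimerTask}

object ExitOnFailure {
  // Runs the job and maps success or failure to a process exit code.
  def exitCodeFor(job: () => Unit): Int =
    try { job(); 0 }
    catch { case e: Throwable => e.printStackTrace(); 1 }

  def main(args: Array[String]): Unit = {
    // Hypothetical background timer. isDaemon = false means this thread
    // alone would keep the JVM alive after the main thread ends.
    val timer = new Timer("background-timer", false)
    timer.schedule(new TimerTask { def run(): Unit = () }, 0L, 1000L)

    // Stand-in for the Spark job; a real driver would build its
    // SparkConf and run the application here.
    val code = exitCodeFor(() => throw new RuntimeException("simulated job failure"))

    timer.cancel() // stop the thread this application owns
    sys.exit(code) // terminate the JVM even if other non-daemon threads remain
  }
}
```

Exiting with a non-zero code also lets whatever launched the driver see that the run failed, rather than leaving a process that looks alive but does no work.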

How can I get the application that belongs to a driver?

2016-12-26 Thread John Fang
I hope to get the application by the driverId, but I can't find a REST API for this in Spark. How can I get the application that belongs to a given driver?

Can we unify the UIs of different standalone clusters?

2016-12-14 Thread John Fang
As we know, each standalone cluster has its own UI, so we end up with more than one UI if we have many standalone clusters. How can I have a single UI that can access the different standalone clusters?

How can I set the log configuration file for the Spark history server?

2016-12-08 Thread John Fang
./start-history-server.sh starting org.apache.spark.deploy.history.HistoryServer, logging to /home/admin/koala/data/versions/0/SPARK/2.0.2/spark-2.0.2-bin-hadoop2.6/logs/spark-admin-org.apache.spark.deploy.history.HistoryServer-1-v069166214.sqa.zmf.out Then the history server will print all logs to the

Question about the DirectKafkaInputDStream

2016-12-08 Thread John Fang
The source is DirectKafkaInputDStream, which can ensure exactly-once semantics on the consumer side. But I have a question based on the following code. As we know, "graph.generateJobs(time)" will create RDDs and generate jobs. The source RDD is a KafkaRDD, which contains the offsetRange. The jobs are

Can Spark support exactly-once based on Kafka, given the following questions?

2016-12-04 Thread John Fang
1. If a task completes its operation, it notifies the driver. The driver may not receive the message due to the network, and may think the task is still running. Then the child stage won't be scheduled? 2. How does Spark guarantee that the downstream task receives the shuffle data completely? In fact, I