Re: Re: spark job paused(active stages finished)
Thank you for your reply. But sometimes the job succeeds when I rerun it, and it processes the same data with the same code.

From: Margusja
Date: 2017-11-09 14:25
To: bing...@iflytek.com
CC: user
Subject: Re: spark job paused (active stages finished)

You have to deal with failed jobs, for example with a try/catch in your code.
Br, Margus Roo

On 9 Nov 2017, at 05:37, bing...@iflytek.com wrote:
Dear All,
I have a simple spark job, as below. All tasks in stage 2 (something failed and was retried) have already finished, but the next stage never runs.
Driver thread dump: attachment (thread.dump)
Driver last log: the driver did not receive the reports for the 16 retried tasks.
Thank you for any ideas.
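A minimal sketch of the try/catch approach Margus suggests, assuming a hypothetical job (the input path and the reaction to failure are illustrative, not from the original thread):

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkException}

object RetryingJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("RetryingJob"))
    try {
      // Actions such as count() throw SparkException when a stage
      // exhausts its task retries; catch it and decide how to react.
      val n = sc.textFile("hdfs:///input/data").count()
      println(s"count = $n")
    } catch {
      case e: SparkException =>
        // Log the failure, then alert, rerun, or fail the app explicitly.
        System.err.println(s"Job failed: ${e.getMessage}")
    } finally {
      sc.stop()
    }
  }
}
```

This does not explain why a retried stage would hang, but it keeps a failed action from silently stalling the application.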
Does spark restart the executors if its nodemanager crashes?
Hi, guys. We have set up dynamic resource allocation on Spark-on-YARN; we currently use Spark 1.5. One executor tries to fetch data from another NodeManager's shuffle service, and that NodeManager crashes, which leaves the executor stuck in that state until the crashed NodeManager is launched again. I just want to know whether Spark will resubmit the completed tasks if later tasks cannot find their output. Thanks for any explanation. -- Bing Jiang
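For reference, a dynamic-allocation setup like the one described requires the external shuffle service on every NodeManager; a typical configuration looks like this (the executor counts are illustrative values, not from the original message):

```properties
# spark-defaults.conf
spark.dynamicAllocation.enabled        true
spark.shuffle.service.enabled          true
spark.dynamicAllocation.minExecutors   2
spark.dynamicAllocation.maxExecutors   20
```

On the YARN side, `spark_shuffle` must be registered as an auxiliary service in yarn-site.xml (`yarn.nodemanager.aux-services`), which is why shuffle output becomes unreachable while a NodeManager is down: the service that serves those blocks dies with it.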
Package Release Announcement: Spark SQL on HBase "Astro"
We are happy to announce the availability of the Spark SQL on HBase 1.0.0 release: http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase

The main features in this package, dubbed Astro, include:
* Systematic and powerful handling of data pruning and intelligent scans, based on a partial-evaluation technique
* HBase pushdown capabilities, such as custom filters and coprocessors, to support ultra-low-latency processing
* SQL and Data Frame support
* More SQL capabilities made possible (secondary index, bloom filter, primary key, bulk load, update)
* Joins with data from other sources
* Python/Java/Scala support
* Support for the latest Spark 1.4.0 release

The tests by the Huawei team and community contributors covered these areas: bulk load; projection pruning; partition pruning; partial evaluation; code generation; coprocessors; custom filtering; DML; complex filtering on keys and non-keys; join/union with non-HBase data; Data Frames; and multi-column-family tests. We will post the test results, including performance tests, by the middle of August.

You are very welcome to try out or deploy the package, and to help improve the integration tests with various combinations of settings, extensive Data Frame tests, complex join/union tests, and extensive performance tests. Please use the Issues and Pull Requests links on the package homepage to report bugs, improvements, or feature requests.

Special thanks to project owner and technical leader Yan Zhou, the Huawei global team, community contributors, and Databricks. Databricks has provided great assistance from the design to the release.

Astro, the Spark SQL on HBase package, will be useful for ultra-low-latency queries and analytics over large-scale data sets in vertical enterprises. We will continue to work with the community to develop new features and improve the code base. Your comments and suggestions are greatly appreciated.

Yan Zhou / Bing Xiao
Huawei Big Data team
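Packages published on spark-packages.org are typically pulled onto the classpath with the `--packages` flag; the exact Maven coordinate below is an assumption inferred from the package page, so check the homepage before using it:

```shell
# Launch a shell with the Astro package resolved from spark-packages.org
# (coordinate assumed from the package URL, not confirmed by the announcement).
spark-shell --packages Huawei-Spark:Spark-SQL-on-HBase:1.0.0
```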
Fail to run LBFGS on 5 GB KDD data in Spark 1.0.1?
1. I don't use spark-submit to run my program; I create a SparkContext directly:

val conf = new SparkConf()
  .setMaster("spark://123d101suse11sp3:7077")
  .setAppName("LBFGS")
  .set("spark.executor.memory", "30g")
  .set("spark.akka.frameSize", "20")
val sc = new SparkContext(conf)

2. I use the KDD data; its size is about 5 GB.

3. After I execute LBFGS.runLBFGS, at stage 7 the problem occurs:

14/08/06 16:44:45 INFO DAGScheduler: Failed to run aggregate at LBFGS.scala:201
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 7.0:12 failed 4 times, most recent failure: TID 304 on host 123d103suse11sp3 failed for unknown reason
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1044)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1028)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1026)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1026)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:634)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:634)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:634)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1229)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
RE: Fail to run LBFGS on 5 GB KDD data in Spark 1.0.1?
I have tested it on spark-1.1.0-SNAPSHOT. It is OK now.

From: Xiangrui Meng [mailto:men...@gmail.com]
Sent: 2014-08-06 23:12
To: Lizhengbing (bing, BIPA)
Cc: user@spark.apache.org
Subject: Re: fail to run LBFGS on 5 GB KDD data in Spark 1.0.1?

Do you mind testing 1.1-SNAPSHOT and allocating more memory to the driver? I think the problem is with the feature dimension. The KDD data has more than 20M features, and in v1.0.1 the driver collects the partial gradients one by one, sums them up, does the update, and then sends the new weights back to the executors one by one. In 1.1-SNAPSHOT, we switched to multi-level tree aggregation and torrent broadcasting. For the driver memory, you can set it with spark-submit using `--driver-memory 30g`. It can be confirmed by visiting the storage tab in the WebUI. -Xiangrui
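Xiangrui's suggestion translates to a launch command along these lines (the class name and jar are placeholders; the master URL and memory sizes are taken from the thread):

```shell
# The driver must hold the full 20M-feature weight vector plus the
# aggregated gradients, hence the generous --driver-memory setting.
spark-submit \
  --master spark://123d101suse11sp3:7077 \
  --driver-memory 30g \
  --executor-memory 30g \
  --class LBFGSJob \
  lbfgs-job.jar
```

This also illustrates why creating a SparkContext directly (as in the original message) bypasses the fix: `spark.driver.memory` must be set before the driver JVM starts, which spark-submit does and an in-process SparkConf cannot.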
How can I integrate a Spark cluster into my own program without using spark-submit?
I want to use a Spark cluster through a Scala function, so I can integrate Spark into my program directly. For example: when I call the count function in my own program, my program will deploy the function to the cluster, so I can get the result directly.

def count(): Long = {
  val master = "spark://mache123:7077"
  val appName = "control_test"
  val sc = new SparkContext(master, appName)
  val rdd = sc.textFile("hdfs://123d101suse11sp3:9000/netflix/netflix.test")
  val count = rdd.count()
  System.out.println("rdd.count = " + count)
  sc.stop()
  count
}
RE: Spark RDD Disk Persistence
You might store your data in Tachyon.

From: Jahagirdar, Madhu [mailto:madhu.jahagir...@philips.com]
Sent: 2014-07-08 10:16
To: user@spark.apache.org
Subject: Spark RDD Disk Persistence

Should I use disk-based persistence for RDDs? If the machine goes down during program execution, would the data be intact and not lost the next time I rerun the program?

Regards,
Madhu Jahagirdar
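For context, disk persistence versus durable checkpointing can be sketched as below (the paths are placeholders). Blocks persisted with DISK_ONLY live on executor-local disk, so they do not survive an application restart; checkpointing to a reliable filesystem such as HDFS is the durable option the question is really asking about:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object DiskPersistExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("DiskPersistExample"))
    val rdd = sc.textFile("hdfs:///input/data")

    // DISK_ONLY spills partitions to executor-local disk for reuse
    // within this application; the data is lost when the app exits.
    rdd.persist(StorageLevel.DISK_ONLY)

    // Checkpointing writes the RDD to a fault-tolerant store and
    // truncates its lineage; the files remain after the app ends.
    sc.setCheckpointDir("hdfs:///checkpoints")
    rdd.checkpoint()

    println(rdd.count())
    sc.stop()
  }
}
```

Between the two, only the checkpoint directory answers "would the data be intact if the machine goes down": a rerun program can read those files back, whereas persisted blocks must be recomputed from the original source.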