Thanks, your advice was useful. It turns out I had submitted the job from my Spark client machine, which was set up with only a bare-bones configuration file, so it failed there; when I run the job on the service machine, everything is okay.
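In case it helps anyone who hits the same symptom: the fix on my side was pointing the client at the cluster's real Hadoop/YARN configuration. A minimal spark-env.sh sketch for the submitting client (the paths and sizes below are examples only, not my cluster's actual values; adjust them to your installation):

```shell
# spark-env.sh on the submitting client -- example values only
export HADOOP_CONF_DIR=/etc/hadoop/conf   # must contain the cluster's yarn-site.xml and core-site.xml
export SPARK_EXECUTOR_INSTANCES=2         # a streaming receiver occupies one core, so keep at least 2 workers
export SPARK_EXECUTOR_CORES=1
export SPARK_EXECUTOR_MEMORY=1G
export SPARK_DRIVER_MEMORY=512M
```

The "Initial job has not accepted any resources" warning in the log below is the usual sign that the executors never registered with the driver or did not get enough memory/cores.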
On Fri, Mar 06, 2015 at 02:10:04PM +0530, Akhil Das wrote:
> Looks like an issue with your yarn setup, could you try doing a simple
> example with spark-shell?
>
> Start the spark shell as:
>
>   MASTER=yarn-client bin/spark-shell
>   spark-shell> sc.parallelize(1 to 1000).collect
>
> If that doesn't work, then make sure your yarn services are up and running,
> and in your spark-env.sh you may set the corresponding configurations from
> the following:
>
>   # Options read in YARN client mode
>   # - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
>   # - SPARK_EXECUTOR_INSTANCES, Number of workers to start (Default: 2)
>   # - SPARK_EXECUTOR_CORES, Number of cores for the workers (Default: 1)
>   # - SPARK_EXECUTOR_MEMORY, Memory per Worker (e.g. 1000M, 2G) (Default: 1G)
>   # - SPARK_DRIVER_MEMORY, Memory for Master (e.g. 1000M, 2G) (Default: 512 Mb)
>
> Thanks
> Best Regards
>
> On Fri, Mar 6, 2015 at 1:09 PM, fenghaixiong <980548...@qq.com> wrote:
>
> > Hi all,
> >
> > I'm trying to write a Spark Streaming program, so I read the Spark
> > online documentation. Following the document, I wrote the program like this:
> >
> >   import org.apache.spark.SparkConf
> >   import org.apache.spark.streaming._
> >   import org.apache.spark.streaming.StreamingContext._
> >
> >   object SparkStreamTest {
> >     def main(args: Array[String]) {
> >       val conf = new SparkConf()
> >       val ssc = new StreamingContext(conf, Seconds(1))
> >       val lines = ssc.socketTextStream(args(0), args(1).toInt)
> >       val words = lines.flatMap(_.split(" "))
> >       val pairs = words.map(word => (word, 1))
> >       val wordCounts = pairs.reduceByKey(_ + _)
> >       wordCounts.print()
> >       ssc.start()            // Start the computation
> >       ssc.awaitTermination() // Wait for the computation to terminate
> >     }
> >   }
> >
> > To test it, I first start listening on a port with:
> >
> >   nc -lk 9999
> >
> > and then submit the job with:
> >
> >   spark-submit --master local[2] --class com.nd.hxf.SparkStreamTest \
> >     spark-test-tream-1.0-SNAPSHOT-job.jar localhost 9999
> >
> > Everything is okay.
> >
> > But when I run it on YARN like this:
> >
> >   spark-submit --master yarn-client --class com.nd.hxf.SparkStreamTest \
> >     spark-test-tream-1.0-SNAPSHOT-job.jar localhost 9999
> >
> > it waits for a long time and repeatedly prints the same messages. Part of
> > the output looks like this:
> >
> > 15/03/06 15:30:24 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
> > 15/03/06 15:30:24 INFO ReceiverTracker: ReceiverTracker started
> > 15/03/06 15:30:24 INFO ForEachDStream: metadataCleanupDelay = -1
> > 15/03/06 15:30:24 INFO ShuffledDStream: metadataCleanupDelay = -1
> > 15/03/06 15:30:24 INFO MappedDStream: metadataCleanupDelay = -1
> > 15/03/06 15:30:24 INFO FlatMappedDStream: metadataCleanupDelay = -1
> > 15/03/06 15:30:24 INFO SocketInputDStream: metadataCleanupDelay = -1
> > 15/03/06 15:30:24 INFO SocketInputDStream: Slide time = 1000 ms
> > 15/03/06 15:30:24 INFO SocketInputDStream: Storage level = StorageLevel(false, false, false, false, 1)
> > 15/03/06 15:30:24 INFO SocketInputDStream: Checkpoint interval = null
> > 15/03/06 15:30:24 INFO SocketInputDStream: Remember duration = 1000 ms
> > 15/03/06 15:30:24 INFO SocketInputDStream: Initialized and validated org.apache.spark.streaming.dstream.SocketInputDStream@b01c5f8
> > 15/03/06 15:30:24 INFO FlatMappedDStream: Slide time = 1000 ms
> > 15/03/06 15:30:24 INFO FlatMappedDStream: Storage level = StorageLevel(false, false, false, false, 1)
> > 15/03/06 15:30:24 INFO FlatMappedDStream: Checkpoint interval = null
> > 15/03/06 15:30:24 INFO FlatMappedDStream: Remember duration = 1000 ms
> > 15/03/06 15:30:24 INFO FlatMappedDStream: Initialized and validated org.apache.spark.streaming.dstream.FlatMappedDStream@6bd47453
> > 15/03/06 15:30:24 INFO MappedDStream: Slide time = 1000 ms
> > 15/03/06 15:30:24 INFO MappedDStream: Storage level = StorageLevel(false, false, false, false, 1)
> > 15/03/06 15:30:24 INFO MappedDStream: Checkpoint interval = null
> > 15/03/06 15:30:24 INFO MappedDStream: Remember duration = 1000 ms
> > 15/03/06 15:30:24 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@941451f
> > 15/03/06 15:30:24 INFO ShuffledDStream: Slide time = 1000 ms
> > 15/03/06 15:30:24 INFO ShuffledDStream: Storage level = StorageLevel(false, false, false, false, 1)
> > 15/03/06 15:30:24 INFO ShuffledDStream: Checkpoint interval = null
> > 15/03/06 15:30:24 INFO ShuffledDStream: Remember duration = 1000 ms
> > 15/03/06 15:30:24 INFO ShuffledDStream: Initialized and validated org.apache.spark.streaming.dstream.ShuffledDStream@42eba6ee
> > 15/03/06 15:30:24 INFO ForEachDStream: Slide time = 1000 ms
> > 15/03/06 15:30:24 INFO ForEachDStream: Storage level = StorageLevel(false, false, false, false, 1)
> > 15/03/06 15:30:24 INFO ForEachDStream: Checkpoint interval = null
> > 15/03/06 15:30:24 INFO ForEachDStream: Remember duration = 1000 ms
> > 15/03/06 15:30:24 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@48d166b5
> > 15/03/06 15:30:24 INFO SparkContext: Starting job: start at SparkStreamTest.scala:21
> > 15/03/06 15:30:24 INFO RecurringTimer: Started timer for JobGenerator at time 1425627025000
> > 15/03/06 15:30:24 INFO JobGenerator: Started JobGenerator at 1425627025000 ms
> > 15/03/06 15:30:24 INFO JobScheduler: Started JobScheduler
> > 15/03/06 15:30:24 INFO DAGScheduler: Registering RDD 2 (start at SparkStreamTest.scala:21)
> > 15/03/06 15:30:24 INFO DAGScheduler: Got job 0 (start at SparkStreamTest.scala:21) with 20 output partitions (allowLocal=false)
> > 15/03/06 15:30:24 INFO DAGScheduler: Final stage: Stage 0(start at SparkStreamTest.scala:21)
> > 15/03/06 15:30:24 INFO DAGScheduler: Parents of final stage: List(Stage 1)
> > 15/03/06 15:30:24 INFO DAGScheduler: Missing parents: List(Stage 1)
> > 15/03/06 15:30:24 INFO DAGScheduler: Submitting Stage 1 (MappedRDD[2] at start at SparkStreamTest.scala:21), which has no missing parents
> > 15/03/06 15:30:24 INFO MemoryStore: ensureFreeSpace(2720) called with curMem=0, maxMem=277842493
> > 15/03/06 15:30:24 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 2.7 KB, free 265.0 MB)
> > 15/03/06 15:30:24 INFO MemoryStore: ensureFreeSpace(1594) called with curMem=2720, maxMem=277842493
> > 15/03/06 15:30:24 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1594.0 B, free 265.0 MB)
> > 15/03/06 15:30:24 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.124.1:57216 (size: 1594.0 B, free: 265.0 MB)
> > 15/03/06 15:30:24 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
> > 15/03/06 15:30:24 INFO DAGScheduler: Submitting 50 missing tasks from Stage 1 (MappedRDD[2] at start at SparkStreamTest.scala:21)
> > 15/03/06 15:30:24 INFO YarnClientClusterScheduler: Adding task set 1.0 with 50 tasks
> > 15/03/06 15:30:25 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:25 INFO JobScheduler: Added jobs for time 1425627025000 ms
> > 15/03/06 15:30:25 INFO JobScheduler: Starting job streaming job 1425627025000 ms.0 from job set of time 1425627025000 ms
> > 15/03/06 15:30:25 INFO SparkContext: Starting job: getCallSite at DStream.scala:294
> > 15/03/06 15:30:25 INFO DAGScheduler: Registering RDD 6 (map at SparkStreamTest.scala:18)
> > 15/03/06 15:30:25 INFO DAGScheduler: Got job 1 (getCallSite at DStream.scala:294) with 1 output partitions (allowLocal=true)
> > 15/03/06 15:30:25 INFO DAGScheduler: Final stage: Stage 2(getCallSite at DStream.scala:294)
> > 15/03/06 15:30:25 INFO DAGScheduler: Parents of final stage: List(Stage 3)
> > 15/03/06 15:30:25 INFO DAGScheduler: Missing parents: List()
> > 15/03/06 15:30:25 INFO DAGScheduler: Submitting Stage 2 (ShuffledRDD[7] at reduceByKey at SparkStreamTest.scala:19), which has no missing parents
> > 15/03/06 15:30:25 INFO MemoryStore: ensureFreeSpace(2136) called with curMem=4314, maxMem=277842493
> > 15/03/06 15:30:25 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.1 KB, free 265.0 MB)
> > 15/03/06 15:30:25 INFO MemoryStore: ensureFreeSpace(1333) called with curMem=6450, maxMem=277842493
> > 15/03/06 15:30:25 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1333.0 B, free 265.0 MB)
> > 15/03/06 15:30:25 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.124.1:57216 (size: 1333.0 B, free: 265.0 MB)
> > 15/03/06 15:30:25 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
> > 15/03/06 15:30:25 INFO DAGScheduler: Submitting 1 missing tasks from Stage 2 (ShuffledRDD[7] at reduceByKey at SparkStreamTest.scala:19)
> > 15/03/06 15:30:25 INFO YarnClientClusterScheduler: Adding task set 2.0 with 1 tasks
> > 15/03/06 15:30:26 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:26 INFO JobScheduler: Added jobs for time 1425627026000 ms
> > 15/03/06 15:30:27 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:27 INFO JobScheduler: Added jobs for time 1425627027000 ms
> > 15/03/06 15:30:28 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:28 INFO JobScheduler: Added jobs for time 1425627028000 ms
> > 15/03/06 15:30:29 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:29 INFO JobScheduler: Added jobs for time 1425627029000 ms
> > 15/03/06 15:30:30 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:30 INFO JobScheduler: Added jobs for time 1425627030000 ms
> > 15/03/06 15:30:31 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:31 INFO JobScheduler: Added jobs for time 1425627031000 ms
> > 15/03/06 15:30:32 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:32 INFO JobScheduler: Added jobs for time 1425627032000 ms
> > 15/03/06 15:30:33 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:33 INFO JobScheduler: Added jobs for time 1425627033000 ms
> > 15/03/06 15:30:34 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:34 INFO JobScheduler: Added jobs for time 1425627034000 ms
> > 15/03/06 15:30:35 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:35 INFO JobScheduler: Added jobs for time 1425627035000 ms
> > 15/03/06 15:30:36 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:36 INFO JobScheduler: Added jobs for time 1425627036000 ms
> > 15/03/06 15:30:37 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:37 INFO JobScheduler: Added jobs for time 1425627037000 ms
> > 15/03/06 15:30:38 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:38 INFO JobScheduler: Added jobs for time 1425627038000 ms
> > 15/03/06 15:30:39 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:39 INFO JobScheduler: Added jobs for time 1425627039000 ms
> > 15/03/06 15:30:39 WARN YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
> > 15/03/06 15:30:40 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:40 INFO JobScheduler: Added jobs for time 1425627040000 ms
> > 15/03/06 15:30:41 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:41 INFO JobScheduler: Added jobs for time 1425627041000 ms
> > 15/03/06 15:30:42 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:42 INFO JobScheduler: Added jobs for time 1425627042000 ms
> > 15/03/06 15:30:43 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:43 INFO JobScheduler: Added jobs for time 1425627043000 ms
> > 15/03/06 15:30:44 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:44 INFO JobScheduler: Added jobs for time 1425627044000 ms
> > 15/03/06 15:30:45 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:45 INFO JobScheduler: Added jobs for time 1425627045000 ms
> > 15/03/06 15:30:46 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:46 INFO JobScheduler: Added jobs for time 1425627046000 ms
> > 15/03/06 15:30:47 INFO ReceiverTracker: Stream 0 received 0 blocks
> > 15/03/06 15:30:47 INFO JobScheduler: Added jobs for time 1425627047000 ms
> >
> > Thanks for any help.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> > For additional commands, e-mail: user-h...@spark.apache.org