Hi Deng,

I tried the same code again.

It seems that when I launch the application via yarn on a single node, 
JavaDStream.print() produces no output, although occasionally it does work.

If I launch the same application in local mode, it always works.


The code is as follows:

        import org.apache.spark.SparkConf;
        import org.apache.spark.streaming.Durations;
        import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
        import org.apache.spark.streaming.api.java.JavaStreamingContext;
        import org.apache.spark.streaming.mqtt.MQTTUtils;

        SparkConf conf = new SparkConf().setAppName("Monitor&Control");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));
        JavaReceiverInputDStream<String> inputDS =
                MQTTUtils.createStream(jssc, "tcp://114.55.145.185:1883", "Control");
        inputDS.print();
        jssc.start();
        jssc.awaitTermination();
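
As a sanity check that does not depend on print(), here is a minimal sketch of an 
alternative (my own addition for debugging, not part of the runs above) that logs 
the record count of each batch:

        // log how many records arrived in each micro-batch
        inputDS.foreachRDD(rdd -> System.out.println("records in batch: " + rdd.count()));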


Command for launching via yarn (did not work):

spark-submit --master yarn --deploy-mode cluster --driver-memory 4g \
    --executor-memory 2g target/CollAna-1.0-SNAPSHOT.jar

Command for launching in local mode (works):

spark-submit --master local[4] --driver-memory 4g --executor-memory 2g \
    --num-executors 4 target/CollAna-1.0-SNAPSHOT.jar
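
One note: since the yarn submission uses --deploy-mode cluster, the driver runs 
inside a YARN container, so any print() output should end up in the driver 
container's stdout rather than on the submitting console. It can be pulled with 
something like (the application ID is a placeholder):

yarn logs -applicationId <application_id>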



Any advice?


Thanks,

Jared


________________________________
From: Yu Wei <yu20...@hotmail.com>
Sent: Tuesday, July 5, 2016 4:41 PM
To: Deng Ching-Mallete
Cc: user@spark.apache.org
Subject: Re: Is that possible to launch spark streaming application on yarn 
with only one machine?


Hi Deng,


Thanks for the help. I do need to pay more attention to memory usage.

I found the root cause of my problem. It seems to lie in the Spark Streaming 
MQTTUtils module.

When I use "localhost" in the brokerUrl, it doesn't work.

After changing it to "127.0.0.1", it works now.
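
For anyone hitting the same thing, a minimal sketch of the working call (the 
broker port and topic name are from my setup and may differ for you):

// "localhost" was not resolved reliably here; the loopback IP works
JavaReceiverInputDStream<String> inputDS =
        MQTTUtils.createStream(jssc, "tcp://127.0.0.1:1883", "Control");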


Thanks again,

Jared



________________________________
From: odeach...@gmail.com <odeach...@gmail.com> on behalf of Deng Ching-Mallete 
<och...@apache.org>
Sent: Tuesday, July 5, 2016 4:03:28 PM
To: Yu Wei
Cc: user@spark.apache.org
Subject: Re: Is that possible to launch spark streaming application on yarn 
with only one machine?

Hi Jared,

You can launch a Spark application even with just a single node in YARN, 
provided that the node has enough resources to run the job.

It might also be good to note that when YARN calculates the memory allocation 
for the driver and the executors, an additional memory overhead is added for 
each of them, and the result is then rounded up to the nearest GB, IIRC. So the 
4G of driver memory plus 4 x 2G of executor memory does not necessarily 
translate to a total allocation of 12G. It would be more than that, so the node 
would need more than 12G of memory for the job to execute in YARN. If that is 
the case, you should see something like "No resources available in cluster.." 
in the application master logs in YARN.
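
To make that concrete, a rough calculation assuming the defaults (a per-container 
overhead of max(384 MB, 10% of the requested memory), and YARN rounding each 
container up to a multiple of yarn.scheduler.minimum-allocation-mb, typically 1 GB):

driver:    4096 MB + max(384, 410) MB = 4506 MB -> rounded up to 5 GB
executors: 2048 MB + max(384, 205) MB = 2432 MB -> rounded up to 3 GB each
total:     5 GB + 4 x 3 GB = 17 GB for one driver plus four executors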

HTH,
Deng

On Tue, Jul 5, 2016 at 4:31 PM, Yu Wei <yu20...@hotmail.com> wrote:

Hi guys,

I set up a pseudo-distributed Hadoop/YARN cluster on my laptop.

I wrote a simple Spark Streaming program, shown below, to receive messages with 
MQTTUtils.

SparkConf conf = new SparkConf().setAppName("Monitor&Control");
JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));
JavaReceiverInputDStream<String> inputDS =
        MQTTUtils.createStream(jssc, brokerUrl, topic);

inputDS.print();
jssc.start();
jssc.awaitTermination();


If I submit the app with "--master local[4]", it works well:

spark-submit --master local[4] --driver-memory 4g --executor-memory 2g \
    --num-executors 4 target/CollAna-1.0-SNAPSHOT.jar

If I submit it with "--master yarn", there is no output from "inputDS.print()":

spark-submit --master yarn --deploy-mode cluster --driver-memory 4g \
    --executor-memory 2g --num-executors 4 target/CollAna-1.0-SNAPSHOT.jar

Is it possible to launch a Spark application on YARN with only a single node?


Thanks for your advice.


Jared

