Thank you very much for your reply. I have a cluster of 8 nodes: m1, m2, m3, ... m8. m1 is configured as the Spark master node, and the rest of the nodes are all worker nodes. I also configured m3 as the History Server, but the history server fails to start. I ran FlumeEventCount on m1 using the correct hostname and a port that is not used by any application. Here is the script I used to run FlumeEventCount:
    #!/bin/bash
    spark-submit --verbose \
        --class org.apache.spark.examples.streaming.FlumeEventCount \
        --deploy-mode client \
        --master yarn-client \
        --jars lib/spark-streaming-flume_2.10-1.1.0-cdh5.2.2-20141112.193826-1.jar \
        /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/spark/examples/lib/spark-examples-1.1.0-cdh5.2.0-hadoop2.5.0-cdh5.2.0.jar \
        m1.ptang.aerohive.com 1234

The same issue is observed after adding "spark.ui.port=4321" to /etc/spark/conf/spark-defaults.conf. The following exceptions are from the job run:

    14/12/01 11:51:45 INFO JobScheduler: Finished job streaming job 1417463504000 ms.0 from job set of time 1417463504000 ms
    14/12/01 11:51:45 INFO JobScheduler: Total delay: 1.465 s for time 1417463504000 ms (execution: 1.415 s)
    14/12/01 11:51:45 ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - org.jboss.netty.channel.ChannelException: Failed to bind to: m1.ptang.aerohive.com/192.168.10.22:1234
            at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
            at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:106)
            at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:119)
            at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:74)
            at org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:68)
            at org.apache.spark.streaming.flume.FlumeReceiver.initServer(FlumeInputDStream.scala:164)
            at org.apache.spark.streaming.flume.FlumeReceiver.onStart(FlumeInputDStream.scala:171)
            at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
            at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
            at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:264)
            at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
            at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
            at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
            at org.apache.spark.scheduler.Task.run(Task.scala:54)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:180)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:745)
    Caused by: java.net.BindException: Cannot assign requested address
            at sun.nio.ch.Net.bind0(Native Method)
            at sun.nio.ch.Net.bind(Net.java:444)
            at sun.nio.ch.Net.bind(Net.java:436)
            at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
            at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
            at org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193)
            at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
            at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
            at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
            ... 3 more
    14/12/01 11:51:45 WARN TaskSetManager: Lost task 0.0 in stage 4.0 (TID 72, m8.ptang.aerohive.com): org.jboss.netty.channel.ChannelException: Failed to bind to: m1.ptang.aerohive.com/192.168.10.22:1234
            org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
            org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:106)
            org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:119)
            org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:74)
            org.apache.avro.ipc.NettyServer.<init>(NettyServer.java:68)
            org.apache.spark.streaming.flume.FlumeReceiver.initServer(FlumeInputDStream.scala:164)
            org.apache.spark.streaming.flume.FlumeReceiver.onStart(FlumeInputDStream.scala:171)
            org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
            org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
            org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:264)
            org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
            org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
            org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
            org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
            org.apache.spark.scheduler.Task.run(Task.scala:54)
            org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:180)
            java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            java.lang.Thread.run(Thread.java:745)

Thanks,
Ping

From: Prannoy <pran...@sigmoidanalytics.com>
Date: Friday, November 28, 2014 at 12:56 AM
To: "u...@spark.incubator.apache.org" <u...@spark.incubator.apache.org>
Subject: Re: How to use FlumeInputDStream in spark
cluster?

Hi,

A BindException occurs when two processes are using the same port. In your Spark configuration, just set ("spark.ui.port", "xxxxx") to some other port; xxxxx can be any number, say 12345. The BindException will not break your job in either case; to fix it, just change the port number.

Thanks.

On Fri, Nov 28, 2014 at 1:30 PM, pamtang [via Apache Spark User List] wrote:

I'm seeing the same issue on CDH 5.2 with Spark 1.1. FlumeEventCount works fine on a standalone cluster but throws a BindException in YARN mode. Is there a solution to this problem, or will FlumeInputDStream not work in a cluster environment?
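[Editor's note] The traces above suggest this is not a UI-port clash: the WARN line shows the receiver task ran on executor m8.ptang.aerohive.com while trying to bind m1's address (192.168.10.22), and a process can only bind addresses configured on its own machine, hence "Cannot assign requested address". The push-based Flume receiver opens its Avro server socket on whichever executor it is scheduled on, so the host passed to FlumeEventCount must be local to that executor, not the driver node. The sketch below reproduces the bind behavior with only the JDK; 192.0.2.1 is a documentation-only (TEST-NET-1) address assumed not to be configured on the machine running it.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindCheck {
    // Attempt to bind a server socket to host:port, the same operation the
    // Avro NettyServer performs inside FlumeReceiver.initServer. The bind
    // succeeds only when the address belongs to a network interface of the
    // machine running this code.
    static boolean canBind(String host, int port) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(host, port));
            return true;
        } catch (IOException e) {
            // java.net.BindException: Cannot assign requested address
            return false;
        }
    }

    public static void main(String[] args) {
        // Loopback is local on every host; port 0 asks for any free port.
        System.out.println(canBind("127.0.0.1", 0));
        // A non-local address fails exactly like m1's IP does on executor m8.
        System.out.println(canBind("192.0.2.1", 0));
    }
}
```

Run on any of the worker nodes with m1's hostname as the argument to canBind, this would print false, matching the executor-side failure in the trace; pinning the receiver to a known worker (or pointing Flume at that worker's address) is the usual workaround.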