I see, thanks for the clarification, TD. On 24 Feb 2015 09:56, "Tathagata Das" <t...@databricks.com> wrote:
> Akhil, that is incorrect.
>
> Spark will listen on the given port for Flume to push data into it.
> When in local mode, it will listen on localhost:9999.
> When in some kind of cluster, instead of localhost you will have to give
> the hostname of the cluster node where you want Flume to forward the data.
> Spark will launch the Flume receiver on that node (assuming the hostname
> matching is correct) and listen on port 9999 for data from Flume.
> So only the configured machine will listen on port 9999.
>
> I suggest trying the other stream, FlumeUtils.createPollingStream. More
> details here:
> http://spark.apache.org/docs/latest/streaming-flume-integration.html
>
> On Sat, Feb 21, 2015 at 12:17 AM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> Spark won't listen on 9999, mate. It basically means you have a Flume
>> source running at port 9999 of your localhost, and when you submit your
>> application in standalone mode, workers will consume data from that port.
>>
>> Thanks
>> Best Regards
>>
>> On Sat, Feb 21, 2015 at 9:22 AM, bit1...@163.com <bit1...@163.com> wrote:
>>
>>> Hi,
>>> In the Spark Streaming application, I write the code
>>> FlumeUtils.createStream(ssc, "localhost", 9999), which
>>> means Spark will listen on port 9999 and wait for the Flume sink to
>>> write to it.
>>> My question is: when I submit the application to the Spark standalone
>>> cluster, will port 9999 be opened only on the driver machine, or will
>>> all the workers also open port 9999 and wait for the Flume data?
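
For reference, a minimal sketch of the two approaches discussed in the thread, based on the Flume integration guide linked above. The hostname "spark-worker-1", the port numbers, and the 10-second batch interval are placeholders for illustration; it assumes the spark-streaming-flume artifact is on the classpath and a Flume agent is configured accordingly:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

object FlumeStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FlumeStreamSketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Push-based approach: Spark launches the Flume receiver on the node
    // whose hostname is given here and listens on port 9999; Flume's Avro
    // sink must be configured to send events to that same host:port.
    // "spark-worker-1" is a placeholder for one of your worker hostnames.
    val pushStream = FlumeUtils.createStream(ssc, "spark-worker-1", 9999)

    // Pull-based approach (the one TD suggests): Flume runs the custom
    // SparkSink on its own host, and Spark polls events from it. Here
    // "flume-host" and 9988 are placeholders for the Flume agent's address.
    // val pullStream = FlumeUtils.createPollingStream(ssc, "flume-host", 9988)

    // Decode each Flume event body as a string and print a sample per batch.
    pushStream.map(event => new String(event.event.getBody.array())).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Since only the receiver node binds the port in the push model, the pull model is generally more robust: the data sits in Flume's SparkSink buffer until Spark fetches it, so a receiver restart does not drop events.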