Hi Vinti,

That example is (in my opinion) more of a tutorial and not necessarily the
way you'd want to set it up for a "real world" application. I'd recommend
using something like Apache Kafka, which will allow the various hosts to
publish messages to a queue. Your Spark Streaming application then
receives messages from the queue and performs whatever processing you'd
like.

http://kafka.apache.org/documentation.html#introduction
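
For example, here's a rough sketch of what the consuming side could look
like with the direct Kafka stream API from the spark-streaming-kafka
artifact (the object name, broker list, and the "metrics" topic below are
just placeholders I made up; substitute your own):

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object KafkaHBaseStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HBaseStream")
    val ssc = new StreamingContext(conf, Seconds(2))

    // Kafka brokers to pull from and the topic the remote hosts publish to
    // (placeholder values).
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
    val topics = Set("metrics")

    // The direct stream yields (key, value) pairs; keeping just the value
    // gives you lines of text, analogous to socketTextStream.
    val inputStream = KafkaUtils
      .createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)
      .map(_._2)

    // ... your existing processing on inputStream ...

    ssc.start()
    ssc.awaitTermination()
  }
}

On each remote host you'd then replace the nc call with something like the
console producer that ships with Kafka, e.g.:

echo 'data_str' | kafka-console-producer.sh --broker-list broker1:9092 --topic metrics

so all of the hosts write to the same topic and your single streaming job
reads them all.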

Thanks,
Kevin

On Tue, Feb 23, 2016 at 3:13 PM, Vinti Maheshwari <vinti.u...@gmail.com>
wrote:

> Hi All
>
> I wrote a Spark Streaming program in Scala. In my program, I passed
> 'remote-host' and 'remote-port' to socketTextStream.
>
> And on the remote machine, I have a Perl script that calls this system
> command:
>
> echo 'data_str' | nc <remote_host> 9999
>
> That way, my Spark program is able to get data, but it seems a little bit
> confusing, as I have multiple remote machines which need to send data to
> the Spark machine. I wanted to know the right way of doing it. In fact,
> how will I deal with data coming from multiple hosts?
>
> For Reference, My current program:
>
> def main(args: Array[String]): Unit = {
>     val conf = new SparkConf().setAppName("HBaseStream")
>     val sc = new SparkContext(conf)
>
>     val ssc = new StreamingContext(sc, Seconds(2))
>
>     val inputStream = ssc.socketTextStream(<remote-host>, 9999)
>     -------------------
>     -------------------
>
>     ssc.start()
>     // Wait for the computation to terminate
>     ssc.awaitTermination()
>
>   }}
>
> Thanks in advance.
>
> Regards,
> ~Vinti
>
