Hi guys,

In my recent blog post (http://mandubian.com/2014/03/08/zpark-ml-nio-1/), I
needed to have an InputDStream helper looking like NetworkInputDStream to
be able to push my data into DStream in an async way. But I didn't want the
remoting aspect as my data source runs locally and nowhere else. I didn't
want my InputDStream to be considered as a NetworkInputDStream as they have
a special management in DStream scheduler to be potentially remoted.

So I've implemented this LocalInputDStream that provides simple push with
an receiver based on an actor, creating BlockRDD but ensures it won't be
remoted:

https://github.com/mandubian/zpark-ztream/blob/master/src/main/scala/LocalInputDStream.scala

(the code is just a hack of NetworkInputDStream)

and a instance of it:
https://github.com/mandubian/zpark-ztream/blob/master/src/main/scala/ZparkInputDStream.scala

Is it something useful for the spark-streaming project that I could
contribute to the project (in a PR) or have I totally missed something that
would do the same in current project code?

Best regards
Pascal

Reply via email to