Re: Is this a BUG?: Why is the Spark Flume Streaming job not deploying the Receiver to the specified host?

2015-08-18 Thread Tathagata Das
Are you using the Flume polling stream or the older stream? Such binding problems used to occur in the older push-based approach, which is why we built the polling stream (pull-based). On Tue, Aug 18, 2015 at 4:45 AM, diplomatic Guru diplomaticg...@gmail.com wrote: I'm testing the Flume + Spark
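The pull-based approach suggested above can be sketched roughly as follows. This is a minimal sketch assuming Spark 1.x with the `spark-streaming-flume` artifact on the classpath; the host name and port are hypothetical, and the Flume agent must be configured with the Spark sink (`org.apache.spark.streaming.flume.sink.SparkSink`) for polling to work:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

object FlumePollingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FlumePollingSketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Pull events from the Flume agent's SparkSink. Because Spark connects
    // out to Flume (rather than a receiver binding a port on a worker),
    // the host-binding problem of the push model does not arise.
    val stream = FlumeUtils.createPollingStream(ssc, "flume-agent-host", 9988)
    stream.map(event => new String(event.event.getBody.array())).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With the push model, Spark must run a receiver on the exact host/port Flume pushes to; with the polling model above, any executor can fetch from the agent, so placement stops mattering.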

Re: Is this a BUG?: Why is the Spark Flume Streaming job not deploying the Receiver to the specified host?

2015-08-18 Thread diplomatic Guru
Thank you, Tathagata, for your response. Yes, I'm using the push model on Spark 1.2. For my scenario I do prefer the push model. Is this also the case on the later version, 1.4? I think I can find a workaround for this issue, but only if I know how to obtain the worker (executor) ID. I can get the detail

Re: Is this a BUG?: Why is the Spark Flume Streaming job not deploying the Receiver to the specified host?

2015-08-18 Thread Tathagata Das
I don't think there is a super clean way of doing this. Here is an idea. Run a dummy job with a large number of partitions/tasks, which will access SparkEnv.get.blockManager.blockManagerId.host and return it. sc.makeRDD(1 to 100, 100).map { _ =
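The truncated snippet above might be completed along these lines. A sketch, not a definitive implementation: `sc` is assumed to be an existing `SparkContext`, and the partition count of 100 is simply meant to exceed the number of executors so that, with high probability, every executor runs at least one task and reports its host:

```scala
import org.apache.spark.{SparkContext, SparkEnv}

// Run a dummy job with many small tasks; each task reports the host of the
// executor it ran on, and collecting the distinct values yields the set of
// worker hosts currently running executors for this application.
def executorHosts(sc: SparkContext): Set[String] =
  sc.makeRDD(1 to 100, 100)
    .map { _ => SparkEnv.get.blockManager.blockManagerId.host }
    .collect()
    .toSet
```

Note this is probabilistic: the scheduler is not obliged to spread 100 tasks over every executor, so a busy or late-joining executor could be missed.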