Are you using the Flume polling stream or the older stream?
Such binding problems used to occur with the older push-based approach,
which is why we built the pull-based polling stream.
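For reference, a minimal sketch of wiring up the pull-based receiver via FlumeUtils.createPollingStream (from the Spark Streaming Flume integration). The hostname "flume-host", port 9988, batch interval, and app name are all placeholders; it assumes the Flume agent is configured with the SparkSink from the spark-streaming-flume-sink artifact:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Placeholder app name and batch interval -- adjust for your job.
val conf = new SparkConf().setAppName("FlumePollingExample")
val ssc = new StreamingContext(conf, Seconds(10))

// Pull events from the Flume agent's SparkSink instead of having
// Flume push them to a fixed receiver address. "flume-host" and 9988
// are assumed values for where the SparkSink is listening.
val stream = FlumeUtils.createPollingStream(ssc, "flume-host", 9988)
stream.map(e => new String(e.event.getBody.array())).print()

ssc.start()
ssc.awaitTermination()
```

Because Spark polls the sink rather than Flume pushing to a receiver, the receiver's host binding no longer matters in the same way.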
On Tue, Aug 18, 2015 at 4:45 AM, diplomatic Guru diplomaticg...@gmail.com
wrote:
I'm testing the Flume + Spark integration.
Thank you Tathagata for your response. Yes, I'm using the push model on
Spark 1.2. For my scenario I do prefer the push model. Is this still the
case in the later version, 1.4?
I think I can find a workaround for this issue, but only if I know how to
obtain the worker (executor) ID. I can get the detail
I don't think there is a super clean way of doing this. Here is an idea:
run a dummy job with a large number of partitions/tasks, where each task
accesses SparkEnv.get.blockManager.blockManagerId.host and returns it.
sc.makeRDD(1 to 100, 100).map { _ => SparkEnv.get.blockManager.blockManagerId.host }.collect().toSet