Thank you, Tathagata, for your response. Yes, I'm using the push model on Spark 1.2, and for my scenario I do prefer the push model. Does the same binding issue still apply in the later version, 1.4?
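For context, this is roughly how I'm wiring up the push-based stream. It's a minimal sketch rather than my exact job, and the host and port are placeholders:

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

val sparkConf = new SparkConf().setAppName("FlumeEventCount")
val ssc = new StreamingContext(sparkConf, Seconds(10))

// Push model: the receiver binds to this host/port, and the Flume agent's
// Avro sink must be pointed at the same address. "worker-host" and 9999
// are placeholders for my actual worker node and port.
val stream = FlumeUtils.createStream(ssc, "worker-host", 9999,
  StorageLevel.MEMORY_AND_DISK_SER_2)

stream.count().map(cnt => "Received " + cnt + " flume events.").print()

ssc.start()
ssc.awaitTermination()

If I do end up switching, I understand the pull-based alternative is FlumeUtils.createPollingStream with the Flume agent running the Spark sink, but as mentioned I'd rather keep the push model for now.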
I think I can find a workaround for this issue, but only if I know how to obtain the worker (executor) ID. I can get the driver's details like this:

ss.ssc().env().blockManager().blockManagerId().host()

But I'm not sure how I could get the executor ID from the driver.

When the job is submitted, I can see the BlockManagers being registered with the driver and executor IP addresses:

15/08/18 23:31:40 INFO YarnClientSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@05151113997207:41630/user/Executor#1210147506] with ID 1
15/08/18 23:31:40 INFO RackResolver: Resolved 05151113997207 to /0513_R-0050/RJ05
15/08/18 23:31:41 INFO BlockManagerMasterActor: Registering block manager 05151113997207:56921 with 530.3 MB RAM, BlockManagerId(1, 05151113997207, 56921)

The BlockManagerMasterActor appears to be doing the registering. Is there any way I can access this from the SparkContext?

Thanks.

On 18 August 2015 at 22:40, Tathagata Das <t...@databricks.com> wrote:

> Are you using the Flume polling stream or the older stream?
>
> Such problems of binding used to occur in the older push-based approach,
> hence we built the polling stream (pull-based).
>
> On Tue, Aug 18, 2015 at 4:45 AM, diplomatic Guru <diplomaticg...@gmail.com>
> wrote:
>
>> I'm testing the Flume + Spark integration example (flume count).
>>
>> I'm deploying the job using yarn-cluster mode.
>>
>> I first logged into the YARN cluster, then submitted the job, passing
>> in a specific worker node's IP for the receiver. But when I checked the
>> web UI, the job had failed to bind to the specified IP because the
>> receiver was deployed to a different host, not the one I asked for.
>> Do you know why?
>>
>> For your information, I've also tried passing the IP address used by the
>> resource manager, but no joy. However, when I set the host to 'localhost'
>> and deploy to the cluster, it binds to a worker node selected by the
>> resource manager.
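P.S. In case it clarifies what I'm after, here is an untested sketch (against the Spark 1.x API, in Scala) of how I was hoping to enumerate the registered executors from the driver:

// Untested sketch: list the block managers known to the driver.
// getExecutorStorageStatus includes the driver itself, which is
// identified by the special executor ID "<driver>", so filter it out.
val sc = ssc.sparkContext  // ssc is the StreamingContext from the job

val executorIds = sc.getExecutorStorageStatus
  .map(_.blockManagerId)
  .filter(_.executorId != "<driver>")

executorIds.foreach { id =>
  println("executor " + id.executorId + " at " + id.host + ":" + id.port)
}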