> On March 6, 2018, 2:44 a.m., Zhitao Li wrote:
> > src/master/http.cpp
> > Lines 845-857 (original), 845-861 (patched)
> > <https://reviews.apache.org/r/61262/diff/8/?file=1800242#file1800242line845>
> >
> >     I believe this is a behavior change.
> >     
> >     Previous, `master->subscribe(...)` call will not trigger additional 
> > event being sent, so `SUBSCRIBED` is still guaranteed to be the first event.
> >     
> >     However, we start the `heartbeater` process immediately without wait 
> > for `SUBSCRIBED` to be sent, so this could mean that subscriber can receive 
> > the `SUBSCRIBED` event after heartbeater sends something.
> >     
> >     Should reorder the call so we only call `master->subscribe(http);` 
> > after the `http.send` on `SUBSCRIBED`?
> 
> Greg Mann wrote:
>     Reordering is an option, but I wonder if this will really be an issue in 
> practice? We delay the first heartbeat event sent by the Heartbeater by 
> DEFAULT_HEARTBEAT_INTERVAL: 
> https://github.com/apache/mesos/blob/0d247c3887ea08b6273992218cd5899010d89fed/src/master/master.hpp#L1986
>     so the Heartbeater sends its first heartbeat event after that interval 
> elapses. So, the heartbeat event could only arrive first in the presence of 
> some extreme CPU contention, it doesn't seem likely/possible to me? WDYT?

I think in large cluster, the first SUBSCRIBED message could take longer than 
`DEFAULT_HEARTBEAT_INTERVAL` under extreme cases. In practice this has hit our 
cluster maybe less than 10% of master failovers.

Still, I would argue that we should ensure `SUBSCRIBED` message is sent before 
any `HEARTBEAT`, also because subscriber cannot event process the `HEARTBEAT` 
without the `heartbeat_interval_seconds` value from `SUBSCRIBED` message.


- Zhitao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61262/#review198678
-----------------------------------------------------------


On Aug. 18, 2017, 6:54 p.m., Quinn Leng wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61262/
> -----------------------------------------------------------
> 
> (Updated Aug. 18, 2017, 6:54 p.m.)
> 
> 
> Review request for mesos, Anand Mazumdar and Greg Mann.
> 
> 
> Bugs: MESOS-7695
>     https://issues.apache.org/jira/browse/MESOS-7695
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added the 'HEARTBEAT' event for the operator API, modified other
> related test cases to accept heartbeats.
> 
> 
> Diffs
> -----
> 
>   include/mesos/master/master.proto fc5bd894ce55fe8e946d4c5b4b33d3c0505f3c2b 
>   include/mesos/v1/master/master.proto 
> c3fb31de2509adcdec8204f8bbe46b46a31540e8 
>   src/master/http.cpp 959091c8ec03b6ac7bcb5d21b04d2f7d5aff7d54 
>   src/master/master.hpp b802fd153a10f6012cea381f153c28cc78cae995 
>   src/tests/api_tests.cpp 3ab4740bcac29ecb89585da6adb1f563d6fc1f5f 
> 
> 
> Diff: https://reviews.apache.org/r/61262/diff/8/
> 
> 
> Testing
> -------
> 
> make check -j48
> 
> 
> Thanks,
> 
> Quinn Leng
> 
>

Reply via email to