Here is a MWE: https://github.com/zmanji/mesos-mwe
Follow the instructions in the README to reproduce. On Sun, Aug 14, 2016 at 9:04 PM, Dario Rexin <dre...@apple.com> wrote: > Can you post the code somewhere? > > > On Aug 14, 2016, at 8:54 PM, Zameer Manji <zma...@apache.org> wrote: > > Dario, > > The logs show that no disconnections occur until after the second POST > request. I would expect a log entry indicating a disconnect between the two > POST requests if the stream id changed. > > On Sun, Aug 14, 2016 at 8:28 PM, Dario Rexin <dre...@apple.com> wrote: > >> You’re absolutely right. I just tried the exact same steps and it worked >> fine for me. I also don’t see the log message. Do you have any reconnection >> logic in place? Is it possible, that your framework reconnected before you >> send the call? The Stream Id would change in that case. >> >> >> On Aug 14, 2016, at 6:48 PM, Zameer Manji <zma...@apache.org> wrote: >> >> Dario, >> >> I do not think the case sensitivity matters here. If the master was >> expecting a header that was exactly 'Mesos-Stream-Id' and did not see it, I >> would expect to get the error response: `All non-subscribe calls should >> include the 'Mesos-Stream-Id' header`. That is the error response that you >> get when you do not set the header. >> >> Possibly related, I expected to see the stream id in the mesos logs. I >> see this log message >> <https://github.com/apache/mesos/blob/c9b70582e9fccab8f6863b0bd3a812b5969a8c24/src/master/master.cpp#L7473-L7474> >> in >> the code, but I do not see it in the logs. >> >> >> On Sun, Aug 14, 2016 at 6:12 PM, Dario Rexin <dre...@apple.com> wrote: >> >>> Oh, sorry, I didn't see you actually set the header (wall of text ;) ). >>> That's an interesting issue, do you set the header case sensitive? I know >>> headers shouldn't be case sensitive, but maybe there's a bug in the Mesos >>> code. I have not seen this issue before. >>> >>> On Aug 14, 2016, at 5:58 PM, Zameer Manji <zma...@apache.org> wrote: >>> >>> Hey, >>> >>> I'm using the Mesos HTTP API for the first time. I am currently >>> encountering an issue where after a successful SUBSCRIBE call and receiving >>> a SUBSCRIBED and HEARTBEAT event, a subsequent TEARDOWN call fails with >>> HTTP 400 with a message of "The stream ID included in this request didn't >>> match the stream ID currently associated with framework ID". >>> >>> Here is a detailed breakdown of what happens with logs: >>> >>> A new framework sends an SUBSCRIBE call with the following body: >>> >>> ```` >>> framework_id { >>> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >>> } >>> type: SUBSCRIBE >>> subscribe { >>> framework_info { >>> user: "user" >>> name: "name" >>> id { >>> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >>> } >>> } >>> } >>> ```` >>> >>> It then receives a 200 OK response with the following headers: >>> `{content-type=[application/x-protobuf], date=[Sat, 13 Aug 2016 >>> 02:42:48 GMT], transfer-encoding=[chunked], mesos-stream-id=[71a0294f-e9c4 >>> -4efe-b237-fb120836aaf8]}` >>> >>> Over this connection it receives a successful subscribed event: >>> ```` >>> type: SUBSCRIBED >>> subscribed { >>> framework_id { >>> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >>> } >>> heartbeat_interval_seconds: 15.0 >>> } >>> ```` >>> >>> It also receives a single heart beat event. >>> >>> Then it tries to send the following request: >>> ```` >>> Sending: framework_id { >>> value: "0dffbee9-a514-4ffa-87e1-2850dd4dcf00" >>> } >>> type: TEARDOWN >>> ```` >>> with the following headers: >>> `{accept=[application/x-protobuf], accept-encoding=[gzip], >>> mesos-stream-id=[71a0294f-e9c4-4efe-b237-fb120836aaf8]}` >>> >>> The response is a 400 with the body: `The stream ID included in this >>> request didn't match the stream ID currently associated with framework ID >>> '0dffbee9-a514-4ffa-87e1-2850dd4dcf00'`. >>> >>> >>> The master logs contains: >>> ```` >>> I0813 02:42:48.376819 13934 http.cpp:381] HTTP POST for >>> /master/api/v1/scheduler from 192.168.33.1:60780 with >>> User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)' >>> I0813 02:42:48.376998 13934 master.cpp:2146] Received subscription >>> request for HTTP framework 'name' >>> I0813 02:42:48.377104 13934 master.cpp:2244] Subscribing framework >>> 'name' with checkpointing disabled and capabilities [ ] >>> I0813 02:42:48.377378 13934 hierarchical.cpp:271] Added framework >>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >>> I0813 02:42:49.475163 13929 http.cpp:381] HTTP POST for >>> /master/api/v1/scheduler from 192.168.33.1:60782 with >>> User-Agent='Google-HTTP-Java-Client/1.20.0 (gzip)' >>> I0813 02:42:51.133513 13930 master.cpp:1284] Framework >>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) disconnected >>> I0813 02:42:51.133597 13930 master.cpp:2725] Disconnecting framework >>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >>> I0813 02:42:51.133618 13930 master.cpp:2749] Deactivating framework >>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >>> I0813 02:42:51.133644 13930 master.cpp:1297] Giving framework >>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) 0ns to failover >>> I0813 02:42:51.133692 13932 hierarchical.cpp:382] Deactivated framework >>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >>> I0813 02:42:51.137265 13931 master.cpp:5561] Framework failover timeout, >>> removing framework 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >>> I0813 02:42:51.137339 13931 master.cpp:6296] Removing framework >>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 (name) >>> I0813 02:42:51.137464 13931 hierarchical.cpp:333] Removed framework >>> 0dffbee9-a514-4ffa-87e1-2850dd4dcf00 >>> ```` >>> Note the immediate disconnection after the second POST is intentional. >>> >>> This is with Mesos 1.0.0 on Ubuntu Trusty. >>> >>> What can I do to debug this issue? The logs do not provide a lot of >>> information to act on. The stream id generated by mesos is not in the logs, >>> nor anything indicating that an HTTP 400 was sent. >>> >>> -- >>> Zameer Manji >>> >>> >> >> > >