[ 
https://issues.apache.org/jira/browse/MESOS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15672101#comment-15672101
 ] 

Joseph Wu commented on MESOS-6596:
----------------------------------

The actual error message is coming from this:
https://github.com/apache/mesos/blob/1.0.x/src/common/resources.cpp#L972-L995

The 409 you're seeing is likely explained by this little comment:
https://github.com/apache/mesos/blob/1.0.x/src/master/allocator/mesos/hierarchical.cpp#L695-L712

Basically, while the master is processing your request, the allocator may 
offer the same resources to another framework, so they are no longer 
available to reserve.  This is more likely if your cluster has high offer 
churn (offers being accepted/declined at high rates).  To mitigate this race, 
we'd have to synchronize the master and allocator processes more tightly.
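
Until that happens, a client-side workaround is to retry the request when it 
returns 409.  The following is only a sketch of that idea, not something Mesos 
provides: the function names, the attempt limit and the backoff delays are all 
assumptions made for illustration.  Retries are bounded because a 409 can also 
mean the request is genuinely invalid, in which case retrying will never help.

```python
import time
import urllib.error
import urllib.request


def post_reserve(url, body, headers):
    """POST the reservation payload once and return the HTTP status code.

    Hypothetical helper: url, body (bytes) and headers are supplied by the
    caller; nothing here is specific to Mesos beyond the status codes.
    """
    request = urllib.request.Request(url, data=body, headers=headers,
                                     method="POST")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; we only need the status code.
        return error.code


def reserve_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns something other than 409.

    send is any zero-argument callable returning an HTTP status code.
    max_attempts and base_delay are illustrative defaults, not tuned values.
    Returns the last status code seen.
    """
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status != 409:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff to give the allocator time to settle.
            sleep(base_delay * (2 ** attempt))
    return status
```

It would be called as 
reserve_with_retry(lambda: post_reserve(url, body, headers)), where url points 
at /master/reserve on your master and body and headers match what you already 
send with curl.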

> Dynamic reservation endpoint returns 409s
> -----------------------------------------
>
>                 Key: MESOS-6596
>                 URL: https://issues.apache.org/jira/browse/MESOS-6596
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Kunal Thakar
>
> The operation to dynamically reserve a host for a framework fails most of 
> the time and succeeds only occasionally.
> We are calling the /reserve endpoint on the master with the same payload and 
> it mostly returns 409, with the occasional success. Pasting the output of two 
> consecutive /reserve calls:
> {code}
> * About to connect() to computexxx-yyy port 5050 (#0)
> *   Trying 10.184.21.3... connected
> * Server auth using Basic with user 'cassandra'
> > POST /master/reserve HTTP/1.1
> > Authorization: Basic blah
> > User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.2j zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> > Host: computexxx-yyy:5050
> > Accept: */*
> > Content-Length: 1046
> > Content-Type: application/x-www-form-urlencoded
> > Expect: 100-continue
> >
> * Done waiting for 100-continue
> < HTTP/1.1 409 Conflict
> HTTP/1.1 409 Conflict
> < Date: Tue, 15 Nov 2016 23:07:10 GMT
> Date: Tue, 15 Nov 2016 23:07:10 GMT
> < Content-Type: text/plain; charset=utf-8
> Content-Type: text/plain; charset=utf-8
> < Content-Length: 58
> Content-Length: 58
> * HTTP error before end of send, stop sending
> <
> * Closing connection #0
> Invalid RESERVE Operation:  does not contain mem(*):120621
> {code}
> {code}
> * About to connect() to computexxx-yyy port 5050 (#0)
> *   Trying 10.184.21.3... connected
> * Server auth using Basic with user 'cassandra'
> > POST /master/reserve HTTP/1.1
> > Authorization: Basic blah
> > User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.2j zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> > Host: computexxx-yyy:5050
> > Accept: */*
> > Content-Length: 1046
> > Content-Type: application/x-www-form-urlencoded
> > Expect: 100-continue
> >
> * Done waiting for 100-continue
> < HTTP/1.1 202 Accepted
> HTTP/1.1 202 Accepted
> < Date: Tue, 15 Nov 2016 23:07:16 GMT
> Date: Tue, 15 Nov 2016 23:07:16 GMT
> < Content-Length: 0
> Content-Length: 0
> {code}



