[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15244311#comment-15244311 ]

haosdent commented on MESOS-1865:
---------------------------------

Rebased just now. May I have your review, [~adam-mesos]?

> Redirect to the leader master when current master is not a leader
> -----------------------------------------------------------------
>
> Key: MESOS-1865
> URL: https://issues.apache.org/jira/browse/MESOS-1865
> Project: Mesos
> Issue Type: Bug
> Components: json api
> Affects Versions: 0.20.1
> Reporter: Steven Schlansker
> Assignee: haosdent
>
> Some of the API endpoints, for example /master/tasks.json, will return bogus
> information if you query a non-leading master:
> {code}
> [steven@Anesthetize:~]% curl http://master1.mesos-vpcqa.otenv.com:5050/master/tasks.json | jq . | head -n 10
> {
>   "tasks": []
> }
> [steven@Anesthetize:~]% curl http://master2.mesos-vpcqa.otenv.com:5050/master/tasks.json | jq . | head -n 10
> {
>   "tasks": []
> }
> [steven@Anesthetize:~]% curl http://master3.mesos-vpcqa.otenv.com:5050/master/tasks.json | jq . | head -n 10
> {
>   "tasks": [
>     {
>       "executor_id": "",
>       "framework_id": "20140724-231003-419644938-5050-1707-",
>       "id": "pp.guestcenterwebhealthmonitor.606cd6ee-4b50-11e4-825b-5212e05f35db",
>       "name": "pp.guestcenterwebhealthmonitor.606cd6ee-4b50-11e4-825b-5212e05f35db",
>       "resources": {
>         "cpus": 0.25,
>         "disk": 0,
> {code}
> This is very hard for end-users to work around. For example if I query
> "which master is leading" followed by "leader: which tasks are running" it is
> possible that the leader fails over in between, leaving me with an incorrect
> answer and no way to know that this happened.
> In my opinion the API should return the correct response (by asking the
> current leader?) or an error (500 Not the leader?) but it's unacceptable to
> return a successful wrong answer.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242360#comment-15242360 ]

haosdent commented on MESOS-1865:
---------------------------------

Yes, the patch currently uses TEMPORARY_REDIRECT (307) as the status code. Let me rebase the patch.
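The behaviour described in this comment can be illustrated with a minimal Python sketch. It is not the Mesos patch itself: the function name `respond_or_redirect`, its parameters, and the 503 fallback when no leader is known are all assumptions made for illustration. Only the use of 307 for the redirect comes from the thread.

```python
from http import HTTPStatus
from typing import Dict, Optional, Tuple


def respond_or_redirect(
    is_leader: bool,
    leader_hostport: Optional[str],
    path: str,
) -> Tuple[int, Dict[str, str]]:
    """Return (status code, extra headers) for a request to `path`.

    Hypothetical helper: a leading master answers normally, a non-leading
    master points the client at the leader with 307 Temporary Redirect.
    """
    if is_leader:
        # The leader serves the request itself.
        return HTTPStatus.OK.value, {}
    if leader_hostport is None:
        # Assumption, not from the thread: with no known leader there is
        # nowhere to redirect to, so report the service as unavailable.
        return HTTPStatus.SERVICE_UNAVAILABLE.value, {}
    # 307 is temporary and keeps the request method unchanged.
    location = f"http://{leader_hostport}{path}"
    return HTTPStatus.TEMPORARY_REDIRECT.value, {"Location": location}
```

For example, `respond_or_redirect(False, "master3:5050", "/master/tasks.json")` yields a 307 with a `Location` header pointing at the same path on the leader, instead of a 200 with an empty task list.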
[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241654#comment-15241654 ]

Cody Maloney commented on MESOS-1865:
-------------------------------------

Please, not 301 "permanent redirect". Browsers cache that for a _long_ time, so if that master becomes the leader again you'll be permanently redirected away from it. Use 302 or 307.

If we're concerned about breaking "dumb" / simple clients, then 307 would seem to make the most sense. The odds are better that simple clients wouldn't know about 307 since it's newer, and would just report it as an error, which a sysadmin would see in their monitoring tools and be able to fix.
[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241459#comment-15241459 ]

Neil Conway commented on MESOS-1865:
------------------------------------

I'm not sure -- I think you could make a case for either 301 or 307. Using 301 makes a certain amount of sense, in that the client should probably continue to use the new master address until further notice (i.e., until a new 301 redirect is seen).
[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241456#comment-15241456 ]

Steven Schlansker commented on MESOS-1865:
------------------------------------------

301 is supposed to be "permanent" -- whereas the leader will continue to move over time. Would 307 (Temporary Redirect) be more appropriate?
[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241321#comment-15241321 ]

Neil Conway commented on MESOS-1865:
------------------------------------

An HTTP redirect makes sense to me as well. Which of the 3xx status codes to use seems open to debate. 301 makes sense, although clients are not supposed to automatically follow the redirect for PUT/POST requests. 308 allows clients to follow the redirect automatically for all types of requests -- but I'm not sure how widely implemented 308 is just yet.
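The trade-off between the 3xx codes mentioned in this comment comes down to two properties: whether the redirect is permanent (and therefore cacheable for a long time) and whether the client must repeat the request with the same method. The following Python sketch tabulates those properties as defined by the HTTP specifications. The names `REDIRECTS` and `suits_leader_redirect` are hypothetical, and the selection rule (temporary and method-preserving) is one reading of this thread, not a decision recorded in it.

```python
# Properties of the redirect codes discussed here, per the HTTP specs:
# (permanence, whether the request method must be preserved).
REDIRECTS = {
    301: ("permanent", False),  # clients may rewrite POST to GET
    302: ("temporary", False),  # clients may rewrite POST to GET
    307: ("temporary", True),   # method and body are kept
    308: ("permanent", True),   # method and body are kept
}


def suits_leader_redirect(code: int) -> bool:
    """Hypothetical rule: leadership moves over time, so the redirect
    should be temporary, and it should not change the request method."""
    permanence, preserves_method = REDIRECTS[code]
    return permanence == "temporary" and preserves_method


candidates = [code for code in sorted(REDIRECTS) if suits_leader_redirect(code)]
print(candidates)
```

Under that rule the only remaining candidate is 307, which matches the objection to 301 raised elsewhere in the thread.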
[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240997#comment-15240997 ]

Adam B commented on MESOS-1865:
-------------------------------

I think a redirect makes the most sense, and as Cody says, a client can always just choose not to follow the redirect. Even the 301 MovedPermanently status code is clearer (it says "I'm not the leader anymore; use this URI from now on") than BadRequest/Unauthorized/Forbidden/NotFound/InternalServerError.

Although I agree that 200 with empty data is flat out WRONG, I cannot imagine a more appropriate status code than 301 Moved Permanently. Perhaps [~idownes] knows of a more appropriate status code?
[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14610770#comment-14610770 ]

Ian Downes commented on MESOS-1865:
-----------------------------------

Sorry for the drive-by comment, but my 2c is that if you're querying a *specific* master that is not leading then it should *not* return (through a redirect) information from the leading master. Yes, it should do something other than successfully returning empty data, but I don't think it should be a redirect. If you want automatic redirection to the leading master then that should be done at a higher level.
[jira] [Commented] (MESOS-1865) Redirect to the leader master when current master is not a leader
[ https://issues.apache.org/jira/browse/MESOS-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14611005#comment-14611005 ]

Cody Maloney commented on MESOS-1865:
-------------------------------------

Following a redirect is entirely a client's choice. Practically, in HTTP there isn't a better alternative I know of that keeps simple / dumb clients working well.

Right now a number of dumb client programs that want to pull master/state.json first ask a master which master is leading, then go to the leader directly and hope there isn't a race in between.

For systems that care to monitor only the exact master they are talking to, most HTTP libraries I have seen let you disable automatic redirect following.

Currently, these APIs sometimes returning incorrect / invalid / stale data has caused problems for things like proxy config generation scripts (they get the wrong master at just the wrong point in time and generate an empty config, leading to badness).
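As an illustration of the point about disabling automatic redirect following, here is a minimal sketch using Python's standard `urllib.request`. The class name `NoRedirect` and the function `fetch_from_exact_master` are hypothetical, and the URL passed in is whatever master the caller wants to monitor. Returning `None` from `redirect_request` makes `urllib` surface the 3xx response as an `HTTPError` instead of following it.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that never follows a redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to issue a follow-up request;
        # the 3xx response is then raised as urllib.error.HTTPError.
        return None


def fetch_from_exact_master(
    url: str, timeout: float = 5.0
) -> Tuple[int, Optional[str], bytes]:
    """Query exactly the master at `url`.

    Returns (status, redirect target or None, body). A redirect is
    reported to the caller rather than followed.
    """
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, None, resp.read()
    except urllib.error.HTTPError as err:
        if err.code in (301, 302, 303, 307, 308):
            # The master is telling us to go elsewhere; surface that.
            return err.code, err.headers.get("Location"), b""
        raise
```

A monitoring script could treat a redirect status from this function as "this master is not leading" and a 200 as data from the master it asked, which avoids silently reading another master's state.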