Hi all,

imagine the following situation: I am a framework with a failover timeout of 1 hour, and 59 minutes and 55 seconds after shutting down I want to register with the master again. If my registration attempt arrives at the master within the time limit, everything is fine and I even get back the old tasks for reconciliation. If it arrives slightly later, the framework id is permanently blocked by Mesos and I am not able to register. Instead, I receive an error() callback with the message "Framework has been removed".

Is there any way to reliably connect to the master while also reconciling old tasks if possible?

I looked at how other frameworks solve this, but it seems that Kafka doesn't handle this at all (https://dcosjira.atlassian.net/browse/KAFKA-4), and Marathon scans the error message for the string "Framework has been removed" and changes the framework id in this case. If the latter is the intended solution, are these strings considered part of the Mesos API? Is it guaranteed that they will not be changed after the 1.0 release?

Best regards,
Benno
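
P.S. To make the string-matching workaround concrete, here is a minimal sketch in Python of how I understand it. The helper name next_framework_id and the example id "fw-1" are hypothetical and not part of any Mesos API; only the error string is taken from the message quoted above. It is not Marathon's actual code, just an illustration of why the approach depends on the message text staying stable.

```python
# Hypothetical sketch: these names are illustrative, not Mesos API.
# Only the error string itself comes from the master's error() message.
FRAMEWORK_REMOVED = "Framework has been removed"


def next_framework_id(stored_id, error_message):
    """Return the framework id to use for the next registration attempt.

    Returning None means: register as a brand-new framework and let the
    master assign a fresh id (the old tasks can no longer be reconciled).
    """
    if error_message is not None and FRAMEWORK_REMOVED in error_message:
        # The master rejected stored_id for good; drop it.
        return None
    # No error, or an unrelated error: keep the id so that a retry
    # within the failover timeout can still reconcile old tasks.
    return stored_id


if __name__ == "__main__":
    print(next_framework_id("fw-1", None))
    print(next_framework_id("fw-1", "Framework has been removed"))
```

If the master ever rewords the message, the substring check silently stops matching and the framework keeps retrying with a blocked id, which is exactly why I am asking whether the string is a stable part of the API.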
