It's not configurable yet, but will be in the upcoming 0.23.0 release.

On Fri, Jul 17, 2015 at 3:46 PM, Nastooh Avessta (navesta) <
nave...@cisco.com> wrote:

>  Hi
>
> Trying to adjust the current failover time to below 10 seconds and don’t
> seem to be able to find the right set of parameters. Currently, it takes
> around minute and half for master to detect that a slave has gone offline,
> which seems to correspond to
> slave_ping_timeout=15*max_slave_ping_timeouts=5. However, I can’t find
> these parameters in mesos-master:
>
>
>
> # mesos-master --version
>
> mesos 0.22.1
>
> #mesos-master --help
>
> Usage: mesos-master [...]
>
>
>
> Supported options:
>
>   --acls=VALUE                             The value could be a JSON
> formatted string of ACLs
>
>                                            or a file path containing the
> JSON formatted ACLs used
>
>                                            for authorization. Path could
> be of the form 'file:///path/to/file'
>
>                                            or '/path/to/file'.
>
>
>
>                                            See the ACLs protobuf in
> mesos.proto for the expected format.
>
>
>
>                                            Example:
>
>                                            {
>
>                                              "register_frameworks": [
>
>                                                                   {
>
>
> "principals": { "type": "ANY" },
>
>
> "roles": { "values": ["a"] }
>
>                                                                   }
>
>                                                                 ],
>
>                                              "run_tasks": [
>
>                                                              {
>
>
>                                                        "principals": {
> "values": ["a", "b"] },
>
>                                                                 "users": {
> "values": ["c"] }
>
>                                                              }
>
>                                                            ],
>
>                                              "shutdown_frameworks": [
>
>                                                            {
>
>
> "principals": { "values": ["a", "b"] },
>
>
> "framework_principals": { "values": ["c"] }
>
>                                                            }
>
>                                                          ]
>
>                                            }
>
>   --allocation_interval=VALUE              Amount of time to wait between
> performing
>
>                                             (batch) allocations (e.g.,
> 500ms, 1sec, etc). (default: 1secs)
>
>   --[no-]authenticate                      If authenticate is 'true' only
> authenticated frameworks are allowed
>
>                                            to register. If 'false'
> unauthenticated frameworks are also
>
>                                            allowed to register. (default:
> false)
>
>   --[no-]authenticate_slaves               If 'true' only authenticated
> slaves are allowed to register.
>
>                                            If 'false' unauthenticated
> slaves are also allowed to register. (default: false)
>
>   --authenticators=VALUE                   Authenticator implementation to
> use when authenticating frameworks
>
>                                            and/or slaves. Use the default
> 'crammd5', or
>
>                                            load an alternate authenticator
> module using --modules. (default: crammd5)
>
>   --cluster=VALUE                          Human readable name for the
> cluster,
>
>                                            displayed in the webui.
>
>   --credentials=VALUE                      Either a path to a text file
> with a list of credentials,
>
>                                            each line containing
> 'principal' and 'secret' separated by whitespace,
>
>                                            or, a path to a JSON-formatted
> file containing credentials.
>
>                                            Path could be of the form
> 'file:///path/to/file' or '/path/to/file'.
>
>                                            JSON file Example:
>
>                                            {
>
>                                              "credentials": [
>
>                                                                {
>
>
> "principal": "sherman",
>
>
>              "secret": "kitesurf",
>
>                                                                }
>
>                                                               ]
>
>                                            }
>
>                                            Text file Example:
>
>                                            username secret
>
>
>
>   --external_log_file=VALUE                Specified the externally
> managed log file. This file will be
>
>                                            exposed in the webui and HTTP
> api. This is useful when using
>
>                                            stderr logging as the log file
> is otherwise unknown to Mesos.
>
>   --framework_sorter=VALUE                 Policy to use for allocating
> resources
>
>                                            between a given user's
> frameworks. Options
>
>                                            are the same as for
> user_allocator. (default: drf)
>
>   --[no-]help                              Prints this help message
> (default: false)
>
>   --hooks=VALUE                            A comma separated list of hook
> modules to be
>
>                                            installed inside master.
>
>   --hostname=VALUE                         The hostname the master should
> advertise in ZooKeeper.
>
>                                            If left unset, the hostname is
> resolved from the IP address
>
>                                            that the master binds to.
>
>   --[no-]initialize_driver_logging         Whether to automatically
> initialize google logging of scheduler
>
>                                            and/or executor drivers.
> (default: true)
>
>   --ip=VALUE                               IP address to listen on
>
>   --[no-]log_auto_initialize               Whether to automatically
> initialize the replicated log used for the
>
>                                            registry. If this is set to
> false, the log has to be manually
>
>                                            initialized when used for the
> very first time. (default: true)
>
>   --log_dir=VALUE                          Directory path to put log files
> (no default, nothing
>
>                                            is written to disk unless
> specified;
>
>                                            does not affect logging to
> stderr).
>
>                                            NOTE: 3rd party log messages
> (e.g. ZooKeeper) are
>
>                                            only written to stderr!
>
>
>
>   --logbufsecs=VALUE                       How many seconds to buffer log
> messages for (default: 0)
>
>   --logging_level=VALUE                    Log message at or above this
> level; possible values:
>
>                                            'INFO', 'WARNING', 'ERROR'; if
> quiet flag is used, this
>
>                                            will affect just the logs from
> log_dir (if specified) (default: INFO)
>
>   --modules=VALUE                          List of modules to be loaded
> and be available to the internal
>
>                                            subsystems.
>
>
>
>                                            Use --modules=filepath to
> specify the list of modules via a
>
>                                            file containing a JSON
> formatted string. 'filepath' can be
>
>                                            of the form
> 'file:///path/to/file' or '/path/to/file'.
>
>
>
>                                            Use --modules="{...}" to
> specify the list of modules inline.
>
>
>
>                                            Example:
>
>                                            {
>
>                                              "libraries": [
>
>                                                {
>
>                                                  "file":
> "/path/to/libfoo.so",
>
>                                                  "modules": [
>
>                                                    {
>
>                                                      "name":
> "org_apache_mesos_bar",
>
>                                                      "parameters": [
>
>                                                        {
>
>                                                          "key": "X",
>
>                                                          "value": "Y"
>
>                                                        }
>
>                                                      ]
>
>                                                    },
>
>                                                    {
>
>                                                      "name":
> "org_apache_mesos_baz"
>
>                                                    }
>
>                                                  ]
>
>                                                },
>
>                                                {
>
>                                                  "name": "qux",
>
>                                                  "modules": [
>
>                                                    {
>
>                                                      "name":
> "org_apache_mesos_norf"
>
>                                                    }
>
>                                                  ]
>
>                                                }
>
>                                              ]
>
>                                            }
>
>   --offer_timeout=VALUE                    Duration of time before an
> offer is rescinded from a framework.
>
>                                            This helps fairness when
> running frameworks that hold on to offers,
>
>                                            or frameworks that accidentally
> drop offers.
>
>   --port=VALUE                             Port to listen on (default:
> 5050)
>
>   --[no-]quiet                             Disable logging to stderr
> (default: false)
>
>   --quorum=VALUE                           The size of the quorum of
> replicas when using 'replicated_log' based
>
>                                            registry. It is imperative to
> set this value to be a majority of
>
>                                            masters i.e., quorum > (number
> of masters)/2.
>
>   --rate_limits=VALUE                      The value could be a JSON
> formatted string of rate limits
>
>                                            or a file path containing the
> JSON formatted rate limits used
>
>                                            for framework rate limiting.
>
>                                            Path could be of the form
> 'file:///path/to/file'
>
>                                            or '/path/to/file'.
>
>
>
>                                            See the RateLimits protobuf in
> mesos.proto for the expected format.
>
>
>
>                                            Example:
>
>                                            {
>
>                                              "limits": [
>
>                                                {
>
>                                                  "principal": "foo",
>
>                                                  "qps": 55.5
>
>                                                },
>
>                                                {
>
>                                                  "principal": "bar"
>
>                                                }
>
>                                              ],
>
>                                              "aggregate_default_qps": 33.3
>
>                                            }
>
>   --recovery_slave_removal_limit=VALUE     For failovers, limit on the
> percentage of slaves that can be removed
>
>                                            from the registry *and*
> shutdown after the re-registration timeout
>
>                                            elapses. If the limit is
> exceeded, the master will fail over rather
>
>                                            than remove the slaves.
>
>                                            This can be used to provide
> safety guarantees for production
>
>                                            environments. Production
> environments may expect that across Master
>
>                                            failovers, at most a certain
> percentage of slaves will fail
>
>                                            permanently (e.g. due to
> rack-level failures).
>
>                                            Setting this limit would ensure
> that a human needs to get
>
>                                            involved should an unexpected
> widespread failure of slaves occur
>
>                                            in the cluster.
>
>                                            Values: [0%-100%] (default:
> 100%)
>
>   --registry=VALUE                         Persistence strategy for the
> registry;
>
>                                            available options are
> 'replicated_log', 'in_memory' (for testing). (default: replicated_log)
>
>   --registry_fetch_timeout=VALUE           Duration of time to wait in
> order to fetch data from the registry
>
>                                            after which the operation is
> considered a failure. (default: 1mins)
>
>   --registry_store_timeout=VALUE           Duration of time to wait in
> order to store data in the registry
>
>                                            after which the operation is
> considered a failure. (default: 5secs)
>
>   --[no-]registry_strict                   Whether the Master will take
> actions based on the persistent
>
>                                            information stored in the
> Registry. Setting this to false means
>
>                                            that the Registrar will never
> reject the admission, readmission,
>
>                                            or removal of a slave.
> Consequently, 'false' can be used to
>
>                                            bootstrap the persistent state
> on a running cluster.
>
>                                            NOTE: This flag is
> *experimental* and should not be used in
>
>                                            production yet. (default: false)
>
>   --roles=VALUE                            A comma separated list of the
> allocation
>
>                                            roles that frameworks in this
> cluster may
>
>                                            belong to.
>
>   --[no-]root_submissions                  Can root submit frameworks?
> (default: true)
>
>   --slave_removal_rate_limit=VALUE         The maximum rate (e.g.,
> 1/10mins, 2/3hrs, etc) at which slaves will
>
>                                            be removed from the master when
> they fail health checks. By default
>
>                                            slaves will be removed as soon
> as they fail the health checks.
>
>                                            The value is of the form
> <Number of slaves>/<Duration>.
>
>   --slave_reregister_timeout=VALUE         The timeout within which all
> slaves are expected to re-register
>
>                                            when a new master is elected as
> the leader. Slaves that do not
>
>                                            re-register within the timeout
> will be removed from the registry
>
>                                            and will be shutdown if they
> attempt to communicate with master.
>
>                                            NOTE: This value has to be
> atleast 10mins. (default: 10mins)
>
>   --user_sorter=VALUE                      Policy to use for allocating
> resources
>
>                                            between users. May be one of:
>
>                                              dominant_resource_fairness
> (drf) (default: drf)
>
>   --[no-]version                           Show version and exit.
> (default: false)
>
>   --webui_dir=VALUE                        Directory path of the webui
> files/assets (default: /usr/share/mesos/webui)
>
>   --weights=VALUE                          A comma separated list of
> role/weight pairs
>
>                                            of the form
> 'role=weight,role=weight'. Weights
>
>                                            are used to indicate forms of
> priority.
>
>   --whitelist=VALUE                        Path to a file with a list of
> slaves
>
>                                            (one per line) to advertise
> offers for.
>
>                                            Path could be of the form
> 'file:///path/to/file' or '/path/to/file'.
>
>   --work_dir=VALUE                         Directory path to store the
> persistent information stored in the
>
>                                            Registry. (example:
> /var/lib/mesos/master)
>
>   --zk=VALUE                               ZooKeeper URL (used for leader
> election amongst masters)
>
>                                            May be one of:
>
>
> zk://host1:port1,host2:port2,.../path
>
>                                              zk://username:password@host1
> :port1,host2:port2,.../path
>
>                                              file:///path/to/file (where
> file contains one of the above)
>
>   --zk_session_timeout=VALUE               ZooKeeper session timeout.
> (default: 10secs)
>
>
>
> Furthermore, setting these parameter either in /etc/mesos-master/ or
> inline generates the following error:
>
> # /usr/sbin/mesos-master --zk=zk://10.40.50.228:2181/mesos --port=5050
> --log_dir=/var/log/mesos --hostname=10.40.50.228 --ip=10.40.50.228
> --quorum=1 --work
>
> _dir=/var/lib/mesos --max_slave_ping_timeouts=2
>
> Failed to load unknown flag 'max_slave_ping_timeouts'
>
> Usage: mesos-master [...]
>
>
>
> Supported options:
>
>   --acls=VALUE                             The valu
>
> …
>
>
>
> Any thoughts?
>
> Cheers,
>
> [image: http://www.cisco.com/web/europe/images/email/signature/logo05.jpg]
>
> *Nastooh Avessta*
> ENGINEER.SOFTWARE ENGINEERING
> nave...@cisco.com
> Phone: *+1 604 647 1527 <%2B1%20604%20647%201527>*
>
> *Cisco Systems Limited*
> 595 Burrard Street, Suite 2123 Three Bentall Centre, PO Box 49121
> VANCOUVER
> BRITISH COLUMBIA
> V7X 1J1
> CA
> Cisco.com <http://www.cisco.com/>
>
>
>
> [image: Think before you print.]Think before you print.
>
> This email may contain confidential and privileged material for the sole
> use of the intended recipient. Any review, use, distribution or disclosure
> by others is strictly prohibited. If you are not the intended recipient (or
> authorized to receive for the recipient), please contact the sender by
> reply email and delete all copies of this message.
>
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/index.html
>
> Cisco Systems Canada Co, 181 Bay St., Suite 3400, Toronto, ON, Canada, M5J
> 2T3. Phone: 416-306-7000; Fax: 416-306-7099. *Preferences
> <http://www.cisco.com/offer/subscribe/?sid=000478326> - Unsubscribe
> <http://www.cisco.com/offer/unsubscribe/?sid=000478327> – Privacy
> <http://www.cisco.com/web/siteassets/legal/privacy.html>*
>
>
>

Reply via email to