[ https://issues.apache.org/jira/browse/MESOS-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Park updated MESOS-7123:
--------------------------------
    Target Version/s: 1.4.0  (was: 1.3.0)

> Investigate splitting offer messages instead of sending a giant single
> resource offer message.
> ----------------------------------------------------------------------------------------------
>
>                 Key: MESOS-7123
>                 URL: https://issues.apache.org/jira/browse/MESOS-7123
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Anand Mazumdar
>            Priority: Critical
>              Labels: mesosphere
>
> Currently, the Mesos master batches all the resource offers into a single
> message and then sends it to the scheduler. However, for large clusters this
> can be problematic, as this message can exceed the default maximum
> protobuf message size (~64 MB). When such a message reaches the scheduler,
> it is dropped with a warning, followed by a failed invariant check.
> {noformat}
> [libprotobuf ERROR google/protobuf/io/coded_stream.cc:180] A protocol message
> was rejected because it was too big (more than 67108864 bytes). To increase
> the limit (or to disable these warnings), see
> CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
> F0213 21:33:57.658892 60996 sched.cpp:895] Check failed: offers.size() ==
> pids.size() (32664 vs. 0)
> *** Check failure stack trace: ***
> @ 0x7f8d1b4d69bd (unknown)
> @ 0x7f8d1b4d8750 (unknown)
> @ 0x7f8d1b4d6582 (unknown)
> @ 0x7f8d1b4d90e9 (unknown)
> @ 0x7f8d1aaa646c (unknown)
> @ 0x7f8d1aaa7df7 (unknown)
> @ 0x7f8d1aa8ee4a (unknown)
> @ 0x7f8d1aa9d109 (unknown)
> @ 0x7f8d1b46e4e4 (unknown)
> @ 0x7f8d1b46e827 (unknown)
> @ 0x7f8e319b0220 (unknown)
> @ 0x7f8e3355ddc5 start_thread
> @ 0x7f8e32c62ced __clone
> @ (nil) (unknown)
> {noformat}
> Possible solutions are to either batch the offers, e.g., 100 offers per
> message, or have an N:1 mapping, i.e., 1 offer per message, by the Mesos master.
> The batch size can be set via a master flag at startup with a reasonable
> default value.
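
For illustration only, the batching proposal in the description could be sketched roughly as below. This is not Mesos code: the function name `split_offers`, the constant `DEFAULT_OFFERS_PER_MESSAGE`, and the plain-list representation of offers and pids are all hypothetical. The default of 100 is taken from the "e.g., 100 offers per message" suggestion, and the offers and pids lists are cut at the same indices on the assumption that each message must keep them the same length, as the failed check in the log implies.

```python
from typing import Iterator, List, Sequence, Tuple, TypeVar

Offer = TypeVar("Offer")
Pid = TypeVar("Pid")

# Hypothetical default, mirroring the "100 offers per message" example in the
# issue. In the proposal this value would come from a master flag.
DEFAULT_OFFERS_PER_MESSAGE = 100


def split_offers(
    offers: Sequence[Offer],
    pids: Sequence[Pid],
    batch_size: int = DEFAULT_OFFERS_PER_MESSAGE,
) -> Iterator[Tuple[List[Offer], List[Pid]]]:
    """Yield (offers, pids) batches of at most batch_size entries each.

    Both sequences are sliced at the same indices, so every batch keeps
    len(offers) == len(pids).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if len(offers) != len(pids):
        raise ValueError("offers and pids must have the same length")

    for start in range(0, len(offers), batch_size):
        end = start + batch_size
        yield list(offers[start:end]), list(pids[start:end])


if __name__ == "__main__":
    # 32664 is the offer count reported in the failed check above.
    offers = list(range(32664))
    pids = ["pid-%d" % i for i in offers]
    batches = list(split_offers(offers, pids))
    print(len(batches), len(batches[-1][0]))
```

With `batch_size=1` the same function yields one offer per message, which covers the second option in the description. Whether a fixed count or a byte-size threshold is the better batching criterion is left open here.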
-- This message was sent by Atlassian JIRA (v6.3.15#6346)