----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/14686/ -----------------------------------------------------------
Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone. Repository: mesos-git Description ------- This is the second patch of a series of patches that implement a catch-up mechanism for replicated log. This patch fixed a race condition caused by passing process wrapper pointers (e.g. Network, Replica, etc.). In our code base, one commonly used routine is to use a wrapper class to wrap a libprocess so that we don't need to worry about whether we should directly call the function or should use a dispatch. For example: class FooProcess : public Process<FooProcess> { int foo(); }; class Foo { Future<int> foo() { return dispatch(process, &FooProcess::foo); } ... FooProcess* process; }; If multiple entities want to share a libprocess (e.g. FooProcess), currently, we pass a pointer of its wrapper (e.g. Foo*) to the users. However, this is problematic in some cases. The main reason is that we don't know when it is safe to destroy the libprocess (FooProcess) and delete the wrapper (Foo) if a pointer of this wrapper has been passed to some other entities in the system and we don't know whether they will use it or not later. There is a race condition in our first patch. Here is the buggy interleaving: T1 T2 appending.discard(); ExplicitPromiseProcess::initialize() { onDiscarded() { terminate ExplicitPromiseProcess } delete network; ... network->broadcast(protocol::promise, request) } (We added a new test CoordinatorTest.FillTimeout which can reproduce this bug and will segfault without this patch.) To fix this problem, we introduce a concept called ProcessHandle to generalize process wrappers (see process.hpp). It internally maintains a shared pointer to the underlying libprocess. It is copyable. Multiple copies of the same handle share the same underlying libprocess. When the last copy is deleted, we will terminate the underlying process and delete the libprocess. If a user wants to early terminate the libprocess, he can explicitly call the terminate() function. Now, creating a wrapper for a libprocess is simply as that: class Foo : public ProcessHandle<FooProcess> { ... }; This change won't break the existing code (i.e., if you use the old way of passing wrapper pointers, this change is like a noop). This is a joint work with Yan Xu. Diffs ----- 3rdparty/libprocess/include/process/process.hpp b502324 src/Makefile.am a2d8242 src/log/catchup.hpp PRE-CREATION src/log/catchup.cpp PRE-CREATION src/log/consensus.hpp PRE-CREATION src/log/consensus.cpp PRE-CREATION src/log/coordinator.hpp 3f6fb7c src/log/coordinator.cpp 6e6466f src/log/log.hpp 77edc7a src/log/network.hpp d34cf78 src/log/replica.hpp d1f5ead src/log/replica.cpp 59a6ff3 src/log/zookeeper.hpp PRE-CREATION src/log/zookeeper.cpp PRE-CREATION src/tests/log_tests.cpp ff5f86c Diff: https://reviews.apache.org/r/14686/diff/ Testing ------- bin/mesos-tests.sh --gtest_filter=CoordinatorTest*:ReplicaTest*:LogTest* --gtest_repeat=100 Thanks, Jie Yu