> On Jan. 27, 2014, 2:43 p.m., Ivan Kelly wrote: > > I don't like how this patch deals with the fundamental problem, which is > > that we're not keeping track of our state in our code. PCBC, BookieClient, > > LedgerHandle, BookKeeper, LedgerManager are all state machines. PCBC is > > complex, but most have the states (READY, CLOSING, CLOSED). > > > > Basically, while in the READY state they should accept new user requests. > > In the CLOSING state they should start shutting down, and then when all > > resources have been freed they should move the the CLOSED state. > > > > For example, with PCBC, when in the READY state (which is CONNECTED in the > > code), if it recieves a close request, it should move to the CLOSING state, > > error out all requests and close the channel, when all these ops are > > complete it should move to CLOSED. > > > > Similarly, in READY state, BookieClient should serve requests. A close > > request moves it to CLOSING during which it waits for all PCBC to close > > before moving into CLOSED. > > > > In the patch you did something similar for CleanupLedgerManager, except you > > used a boolean rather than a explicit state field. > > > > It's BookKeeper and LedgerHandle which are problematic here, since the > > state is scattered all over the place. I think if we cleanup BookieClient & > > PCBC as stated above and use the CleanupLedgerManager, we should be able to > > stop catching the runtimeexceptions. Actually the BookieClient changes go a > > lot of the way, but the problem is addEntry which reach into PCBC then > > reach back out around again to call into PCBC. > > > > This is a little rambly, but to summarize, I think the patch deals with the > > problem from the wrong direction in places. We shouldn't be catching > > RejectedExectionExceptions because we should be tracking our state enough > > not to need to. > > Sijie Guo wrote: > I don't know if you really understand what the patch is doing or if you > read the comments in the jira. it is fixing the fundamental root cause that > we shutdown client in wrong sequence, which cause the problematic in ledger > handle and bookkeeper. All the operations in ledger handle and bookkeeper are > either metadata operations, or data operations, or the callback chained with > metadata/data operations. if we keep tracked the state of ledger manager and > bookie client (PCBC) and fixing the wrong closing sequence issue, why it is a > wrong direction? > > For catching rejected exceptions, I think they are inherited from > previous patches (which is indeed dealing with the issue in wrong direction). > I think they are good to be kept there, so we would have logs that could help > us debugging if we are not dealing with the states correctly. > > As current design, the states in pcbc are channel states, not client > state. if a channel is in closed state, it just means the channel is broken > and not connected, it doesn't mean that the client is closed. the flag that I > added is used for the client state which we could reject any incoming request > if the client is to be closed permanently (by calling #close). distinguishing > a channel state from client state is good if we want to move support multiple > channels to a bookie. > > >>> I think if we cleanup BookieClient & PCBC as stated above and use the > CleanupLedgerManager, we should be able to stop catching the > runtimeexceptions. > > again, if you read the patch correctly, isn't the patch to make > BookieClient/PCBC handling close correctly and use CleanupLedgerManager? so I > don't know what is you real point? > > fpj wrote: > I like the idea of reasoning about this problem using state machines; > however, I'm not convinced that considering four separate state machines is > the right thing to do. We can think of an overall state machine for the > client (e.g., CONNECTING, READY, CLOSED) and state machines per component. > > Sijie Guo wrote: > is there any difference between using a flag than a state for the whole > client? or stepping back, do we really need so many state for the whole > client? > > will the state in ledger manager and the state in bookie client (pcbc) > not be enough to define the state of the whole client? > > fpj wrote: > It wouldn't be surprising if we can do with one state machine for the > whole client. My thinking is that it might be useful for one global client to > translate into multiple states for a component. For example, the state of the > overall client might be CLOSED, meaning that the app can't submit new > requests, but the state of BookieClient/PCBC might be CLOSING because we are > still erroring out pending requests. Making such a distinction might not be > strictly necessary, but it doesn't sound wrong to have it. > > Sijie Guo wrote: > the reason that I only did for BookieClient(PCBC) and LedgerManager is > for performance consideration. Since BookieClient(PCBC) and LedgerManager > already defined the state of BookKeeper (All data operations will be rejected > when BookieClient is closing, while all metadata operations will be rejected > when LedgerManager is closing), why we need additional layer for it? > > Ivan Kelly wrote: > re: flags vs state, flags are state. Once you go past one flag, you > should move to a more explicit state mechanism. > > You cannot do this with one state machine for the whole client. Each > individual channel has it's own state, so each need their own state machine. > In fact, pcbc already is a state machine. > > I think that state machines for bookie client and ledgermanager may be > enough for now. Changing it for the whole client is a pretty mammoth task.
We need a state machine to represent the states the user sees, like starting, ready, closed. You may claim that this is the state machine of bookkeeper, but I would say that this state machine we expose reflects the state of the overall client. You can surely have a different state machine per component and map the states we expose to applications to states of the different components. - fpj ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17352/#review32830 ----------------------------------------------------------- On Jan. 25, 2014, 7:59 a.m., Sijie Guo wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/17352/ > ----------------------------------------------------------- > > (Updated Jan. 25, 2014, 7:59 a.m.) > > > Review request for bookkeeper, fpj and Ivan Kelly. > > > Bugs: BOOKKEEPER-654 > https://issues.apache.org/jira/browse/BOOKKEEPER-654 > > > Repository: bookkeeper-git > > > Description > ------- > > the correct close sequence should be: > > 1) close the bookie client to error out all pending bookie requests, and > after bookie client is close, all following request would be rejected. > 2) close the ledger manager which erred out all pending all metadata > requests, and after ledger manager is close, all metadata request would be > rejected. > 3) close scheduler. > > attach a patch following this sequence. > > > Diffs > ----- > > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BKException.java > ace1409 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java > a91861c > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java > c5f5233 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookieWatcher.java > cfb6022 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerChecker.java > cfb9128 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerFragmentReplicator.java > 4a4eb49 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java > bf4bd97 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java > 5b8a703 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingAddOp.java > 8f1f18a > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java > 6cf6c1b > > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java > a077556 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/CleanupLedgerManager.java > PRE-CREATION > > bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieClient.java > 696bcc2 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java > d8ebaf6 > > bookkeeper-server/src/main/java/org/apache/bookkeeper/util/OrderedSafeExecutor.java > ac068c9 > > bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookKeeperCloseTest.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/17352/diff/ > > > Testing > ------- > > > Thanks, > > Sijie Guo > >
