> On Jan. 27, 2014, 2:43 p.m., Ivan Kelly wrote:
> > I don't like how this patch deals with the fundamental problem, which is 
> > that we're not keeping track of our state in our code. PCBC, BookieClient, 
> > LedgerHandle, BookKeeper, LedgerManager are all state machines. PCBC is 
> > complex, but most have the states (READY, CLOSING, CLOSED).
> > 
> > Basically, while in the READY state they should accept new user requests. 
> > In the CLOSING state they should start shutting down, and then when all 
> > resources have been freed they should move the the CLOSED state.
> > 
> > For example, with PCBC, when in the READY state (which is CONNECTED in the 
> > code), if it recieves a close request, it should move to the CLOSING state, 
> > error out all requests and close the channel, when all these ops are 
> > complete it should move to CLOSED.
> > 
> > Similarly, in READY state, BookieClient should serve requests. A close 
> > request moves it to CLOSING during which it waits for all PCBC to close 
> > before moving into CLOSED.
> > 
> > In the patch you did something similar for CleanupLedgerManager, except you 
> > used a boolean rather than a explicit state field.
> > 
> > It's BookKeeper and LedgerHandle which are problematic here, since the 
> > state is scattered all over the place. I think if we cleanup BookieClient & 
> > PCBC as stated above and use the CleanupLedgerManager, we should be able to 
> > stop catching the runtimeexceptions. Actually the BookieClient changes go a 
> > lot of the way, but the problem is addEntry which reach into PCBC then 
> > reach back out around again to call into PCBC.
> > 
> > This is a little rambly, but to summarize, I think the patch deals with the 
> > problem from the wrong direction in places. We shouldn't be catching 
> > RejectedExectionExceptions because we should be tracking our state enough 
> > not to need to.
> 
> Sijie Guo wrote:
>     I don't know if you really understand what the patch is doing or if you 
> read the comments in the jira. it is fixing the fundamental root cause that 
> we shutdown client in wrong sequence, which cause the problematic in ledger 
> handle and bookkeeper. All the operations in ledger handle and bookkeeper are 
> either metadata operations, or data operations, or the callback chained with 
> metadata/data operations. if we keep tracked the state of ledger manager and 
> bookie client (PCBC) and fixing the wrong closing sequence issue, why it is a 
> wrong direction?
>     
>     For catching rejected exceptions, I think they are inherited from 
> previous patches (which is indeed dealing with the issue in wrong direction). 
> I think they are good to be kept there, so we would have logs that could help 
> us debugging if we are not dealing with the states correctly.
>     
>     As current design, the states in pcbc are channel states, not client 
> state. if a channel is in closed state, it just means the channel is broken 
> and not connected, it doesn't mean that the client is closed. the flag that I 
> added is used for the client state which we could reject any incoming request 
> if the client is to be closed permanently (by calling #close). distinguishing 
> a channel state from client state is good if we want to move support multiple 
> channels to a bookie. 
>     
>     >>> I think if we cleanup BookieClient & PCBC as stated above and use the 
> CleanupLedgerManager, we should be able to stop catching the 
> runtimeexceptions.
>     
>     again, if you read the patch correctly, isn't the patch to make 
> BookieClient/PCBC handling close correctly and use CleanupLedgerManager? so I 
> don't know what is you real point?
> 
> fpj wrote:
>     I like the idea of reasoning about this problem using state machines; 
> however, I'm not convinced that considering four separate state machines is 
> the right thing to do. We can think of an overall state machine for the 
> client (e.g., CONNECTING, READY, CLOSED) and state machines per component.
> 
> Sijie Guo wrote:
>     is there any difference between using a flag than a state for the whole 
> client? or stepping back, do we really need so many state for the whole 
> client? 
>     
>     will the state in ledger manager and the state in bookie client (pcbc) 
> not be enough to define the state of the whole client?
> 
> fpj wrote:
>     It wouldn't be surprising if we can do with one state machine for the 
> whole client. My thinking is that it might be useful for one global client to 
> translate into multiple states for a component. For example, the state of the 
> overall client might be CLOSED, meaning that the app can't submit new 
> requests, but the state of BookieClient/PCBC might be CLOSING because we are 
> still erroring out pending requests. Making such a distinction might not be 
> strictly necessary, but it doesn't sound wrong to have it.
> 
> Sijie Guo wrote:
>     the reason that I only did for BookieClient(PCBC) and LedgerManager is 
> for performance consideration. Since BookieClient(PCBC) and LedgerManager 
> already defined the state of BookKeeper (All data operations will be rejected 
> when BookieClient is closing, while all metadata operations will be rejected 
> when LedgerManager is closing), why we need additional layer for it?
> 
> Ivan Kelly wrote:
>     re: flags vs state, flags are state. Once you go past one flag, you 
> should move to a more explicit state mechanism. 
>     
>     You cannot do this with one state machine for the whole client. Each 
> individual channel has it's own state, so each need their own state machine. 
> In fact, pcbc already is a state machine.
>     
>     I think that state machines for bookie client and ledgermanager may be 
> enough for now. Changing it for the whole client is a pretty mammoth task.

We need a state machine to represent the states the user sees, like starting, 
ready, closed. You may claim that this is the state machine of bookkeeper, but 
I would say that this state machine we expose reflects the state of the overall 
client. You can surely have a different state machine per component and map the 
states we expose to applications to states of the different components.


- fpj


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17352/#review32830
-----------------------------------------------------------


On Jan. 25, 2014, 7:59 a.m., Sijie Guo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17352/
> -----------------------------------------------------------
> 
> (Updated Jan. 25, 2014, 7:59 a.m.)
> 
> 
> Review request for bookkeeper, fpj and Ivan Kelly.
> 
> 
> Bugs: BOOKKEEPER-654
>     https://issues.apache.org/jira/browse/BOOKKEEPER-654
> 
> 
> Repository: bookkeeper-git
> 
> 
> Description
> -------
> 
> the correct close sequence should be:
> 
> 1) close the bookie client to error out all pending bookie requests, and 
> after bookie client is close, all following request would be rejected.
> 2) close the ledger manager which erred out all pending all metadata 
> requests, and after ledger manager is close, all metadata request would be 
> rejected.
> 3) close scheduler.
> 
> attach a patch following this sequence.
> 
> 
> Diffs
> -----
> 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BKException.java 
> ace1409 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 
> a91861c 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
>  c5f5233 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookieWatcher.java
>  cfb6022 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerChecker.java
>  cfb9128 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerFragmentReplicator.java
>  4a4eb49 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java
>  bf4bd97 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java
>  5b8a703 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingAddOp.java
>  8f1f18a 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java
>  6cf6c1b 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java
>  a077556 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/CleanupLedgerManager.java
>  PRE-CREATION 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieClient.java 
> 696bcc2 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java
>  d8ebaf6 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/util/OrderedSafeExecutor.java
>  ac068c9 
>   
> bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookKeeperCloseTest.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/17352/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sijie Guo
> 
>

Reply via email to