> On Jan. 27, 2014, 2:43 p.m., Ivan Kelly wrote:
> > I don't like how this patch deals with the fundamental problem, which is 
> > that we're not keeping track of our state in our code. PCBC, BookieClient, 
> > LedgerHandle, BookKeeper, LedgerManager are all state machines. PCBC is 
> > complex, but most have the states (READY, CLOSING, CLOSED).
> > 
> > Basically, while in the READY state they should accept new user requests. 
> > In the CLOSING state they should start shutting down, and then when all 
> > resources have been freed they should move the the CLOSED state.
> > 
> > For example, with PCBC, when in the READY state (which is CONNECTED in the 
> > code), if it recieves a close request, it should move to the CLOSING state, 
> > error out all requests and close the channel, when all these ops are 
> > complete it should move to CLOSED.
> > 
> > Similarly, in READY state, BookieClient should serve requests. A close 
> > request moves it to CLOSING during which it waits for all PCBC to close 
> > before moving into CLOSED.
> > 
> > In the patch you did something similar for CleanupLedgerManager, except you 
> > used a boolean rather than a explicit state field.
> > 
> > It's BookKeeper and LedgerHandle which are problematic here, since the 
> > state is scattered all over the place. I think if we cleanup BookieClient & 
> > PCBC as stated above and use the CleanupLedgerManager, we should be able to 
> > stop catching the runtimeexceptions. Actually the BookieClient changes go a 
> > lot of the way, but the problem is addEntry which reach into PCBC then 
> > reach back out around again to call into PCBC.
> > 
> > This is a little rambly, but to summarize, I think the patch deals with the 
> > problem from the wrong direction in places. We shouldn't be catching 
> > RejectedExectionExceptions because we should be tracking our state enough 
> > not to need to.
> 
> Sijie Guo wrote:
>     I don't know if you really understand what the patch is doing or if you 
> read the comments in the jira. it is fixing the fundamental root cause that 
> we shutdown client in wrong sequence, which cause the problematic in ledger 
> handle and bookkeeper. All the operations in ledger handle and bookkeeper are 
> either metadata operations, or data operations, or the callback chained with 
> metadata/data operations. if we keep tracked the state of ledger manager and 
> bookie client (PCBC) and fixing the wrong closing sequence issue, why it is a 
> wrong direction?
>     
>     For catching rejected exceptions, I think they are inherited from 
> previous patches (which is indeed dealing with the issue in wrong direction). 
> I think they are good to be kept there, so we would have logs that could help 
> us debugging if we are not dealing with the states correctly.
>     
>     As current design, the states in pcbc are channel states, not client 
> state. if a channel is in closed state, it just means the channel is broken 
> and not connected, it doesn't mean that the client is closed. the flag that I 
> added is used for the client state which we could reject any incoming request 
> if the client is to be closed permanently (by calling #close). distinguishing 
> a channel state from client state is good if we want to move support multiple 
> channels to a bookie. 
>     
>     >>> I think if we cleanup BookieClient & PCBC as stated above and use the 
> CleanupLedgerManager, we should be able to stop catching the 
> runtimeexceptions.
>     
>     again, if you read the patch correctly, isn't the patch to make 
> BookieClient/PCBC handling close correctly and use CleanupLedgerManager? so I 
> don't know what is you real point?
> 
> fpj wrote:
>     I like the idea of reasoning about this problem using state machines; 
> however, I'm not convinced that considering four separate state machines is 
> the right thing to do. We can think of an overall state machine for the 
> client (e.g., CONNECTING, READY, CLOSED) and state machines per component.
> 
> Sijie Guo wrote:
>     is there any difference between using a flag than a state for the whole 
> client? or stepping back, do we really need so many state for the whole 
> client? 
>     
>     will the state in ledger manager and the state in bookie client (pcbc) 
> not be enough to define the state of the whole client?
> 
> fpj wrote:
>     It wouldn't be surprising if we can do with one state machine for the 
> whole client. My thinking is that it might be useful for one global client to 
> translate into multiple states for a component. For example, the state of the 
> overall client might be CLOSED, meaning that the app can't submit new 
> requests, but the state of BookieClient/PCBC might be CLOSING because we are 
> still erroring out pending requests. Making such a distinction might not be 
> strictly necessary, but it doesn't sound wrong to have it.
> 
> Sijie Guo wrote:
>     the reason that I only did for BookieClient(PCBC) and LedgerManager is 
> for performance consideration. Since BookieClient(PCBC) and LedgerManager 
> already defined the state of BookKeeper (All data operations will be rejected 
> when BookieClient is closing, while all metadata operations will be rejected 
> when LedgerManager is closing), why we need additional layer for it?

re: flags vs state, flags are state. Once you go past one flag, you should move 
to a more explicit state mechanism. 

You cannot do this with one state machine for the whole client. Each individual 
channel has it's own state, so each need their own state machine. In fact, pcbc 
already is a state machine.

I think that state machines for bookie client and ledgermanager may be enough 
for now. Changing it for the whole client is a pretty mammoth task. 


- Ivan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17352/#review32830
-----------------------------------------------------------


On Jan. 25, 2014, 7:59 a.m., Sijie Guo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17352/
> -----------------------------------------------------------
> 
> (Updated Jan. 25, 2014, 7:59 a.m.)
> 
> 
> Review request for bookkeeper, fpj and Ivan Kelly.
> 
> 
> Bugs: BOOKKEEPER-654
>     https://issues.apache.org/jira/browse/BOOKKEEPER-654
> 
> 
> Repository: bookkeeper-git
> 
> 
> Description
> -------
> 
> the correct close sequence should be:
> 
> 1) close the bookie client to error out all pending bookie requests, and 
> after bookie client is close, all following request would be rejected.
> 2) close the ledger manager which erred out all pending all metadata 
> requests, and after ledger manager is close, all metadata request would be 
> rejected.
> 3) close scheduler.
> 
> attach a patch following this sequence.
> 
> 
> Diffs
> -----
> 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BKException.java 
> ace1409 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 
> a91861c 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
>  c5f5233 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookieWatcher.java
>  cfb6022 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerChecker.java
>  cfb9128 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerFragmentReplicator.java
>  4a4eb49 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java
>  bf4bd97 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java
>  5b8a703 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingAddOp.java
>  8f1f18a 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java
>  6cf6c1b 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java
>  a077556 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/CleanupLedgerManager.java
>  PRE-CREATION 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieClient.java 
> 696bcc2 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java
>  d8ebaf6 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/util/OrderedSafeExecutor.java
>  ac068c9 
>   
> bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookKeeperCloseTest.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/17352/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sijie Guo
> 
>

Reply via email to