Re: Review Request 17895: BOOKKEEPER-582: protobuf support for bookkeeper

Sijie Guo Tue, 27 May 2014 10:21:44 -0700


> On April 23, 2014, 10:22 a.m., Ivan Kelly wrote:
> > bookkeeper-server/src/main/proto/BookkeeperProtocol.proto, line 80
> > <https://reviews.apache.org/r/17895/diff/2/?file=563033#file563033line80>
> >
> >     This would be better served with a boolean. protobuf will keep it 
> > compact.
> 
> Sijie Guo wrote:
>     being Flag will make it easier to add other flags in future.
> 
> Ivan Kelly wrote:
>     No it doesn't. Lets say you do add another flag in the future. How do you 
> specify that it's both a FENCE_LEDGER and NEW_FLAG? You need the field to be 
> repeated, and then you need code to loop through the flags. Compared to 
> msg.isFenceLedger(), this seems much more cumbersome.
> 
> Sijie Guo wrote:
>     I don't say this field is bits-set-like of flags. as the long poll 
> changes we mentioned, we are going to add a new flag 'ENTRY_PIGGYBACK' for 
> read. so a read request is either a normal read (without Flag), a fence read 
> (flag = FENCE_LEDGER) or a piggyback read (flag = ENTRY_PIGGYBACK). Being 
> repeated will be kind of complicated, as you need to loop all possible 
> permutation.
> 
> Ivan Kelly wrote:
>     This is ugly and shortsighted. There may well be a case in the future 
> where two flags need to be specified. What do we do then? Add a flags2 field?
> 
> Sijie Guo wrote:
>     why this couldn't be a 3rd flag? FENCE_LEDGER_AND_PIGGYBACK?
>     
>     in your repeated way, let's say you have n flags, you would have n! ways 
> to combine them. it is hard to figure out what kind of combination works for 
> what situation. as some of your flags are exclusive with others. so renaming 
> the combination specifically would make it clear.
> 
> Ivan Kelly wrote:
>     as you say, with n flags specified, there are n! combinations. so if 
> another flag as added, you will have to add n values to the enum to cover the 
> different combinations. whether the flags are exclusive is for the 
> application to decide. it shouldnt be implicit in the protocol encoding.
> 
> Sijie Guo wrote:
>     same explanation for OperationType is applying here. if n flags mostly 
> are used individually, an explicit way will just need to deal with around n 
> possibilities. but you need to write lots of if..else branches for inferring 
> in an implicit way.
> 
> Ivan Kelly wrote:
>     Your assumption here is that all the flag handling code will be in one 
> place in a big switch. this isnt the case. I dont know how you have 
> implemented piggyback, but i would guess it would be something like.
>     
>     Response processReadRequest(Request req) {
>         if (isFencing(req)) {
>             fenceLedger();
>         }
>         byte[] data = fetchEntry(req.getLedgerId(), req.getEntryId());
>         Response resp = new Response(OK, data);
>         if (isPiggyback(req)) {
>             addPiggyback(resp);
>         }
>         return resp;
>     }
>     
>     Now consider how #isPiggyback and #isFencing would be implemented with 
> enums and with individual boolean flags? which is shorter? Which is easier to 
> maintain?
> 
> Sijie Guo wrote:
>     isFencing => flag == Flag.FENCE_LEDGER
>     isPiggyback => flag == ENTRY_PIGGYBACK
>     
>     what is the difference above for maintaining?
>     
>     using the flag, the semantic is super clear that your request is either a 
> fencing request or a piggyback request. it would not happen that you received 
> a request that probably have two flags enabled, then what does it mean if the 
> requests have two boolean flags enabled, don't consider the case (that is 
> always the point that I made for OperationType and Flag here)? then which is 
> easier to maintain?
> 
> Ivan Kelly wrote:
>     my comment from yesterday doesn't wasnt saved by review board >:|
>     
>     Anyhow, what i asked was, can you guarantee that there will never *ever* 
> be a case where two flags will need to be set on a single op in the future?
> 
> Sijie Guo wrote:
>     well, I already made a comment on your case: "why this couldn't be a 3rd 
> flag? FENCE_LEDGER_AND_PIGGYBACK?", no? and this is how hedwig handles 
> subscription mode, 'create', 'attach', 'create_and_attach', no?
>     
>     and again, as I said in previous comment, if you define flags in an 
> explicit way, you would have much clearer branches on what kind of operations 
> should be executed on what flags, like below: (take an example, A, B, C, 
> A_AND_B, B_AND_A)
>     
>     switch (flag) {
>     case A:
>       a();
>       break;
>     case B:
>       b();
>       break;
>     case C:
>       c();
>       break;
>     case A_AND_B:
>       a();
>       b();
>       break;
>     case B_AND_A:
>       b();
>       // other executions between b() and a(), which is different from 
> A_AND_B case.
>       a();
>       break;
>     }
>     
>     but if you define flags in an implicit way, you need to check all 
> possible negative cases to avoid the program failed in abnormal cases (e.g. 
> bad combinations, bad order of flags). isn't that introducing much harder 
> work for maintenance? 
>
> 
> Rakesh R wrote:
>     If we have a set of flags, also consider both AND/OR conjunctions the 
> combinations may grow and thus increases the complexity of the problem. I 
> feel it doesn't matter whether you are using enum type or bool flag, both 
> will have 'chain of if conditions' either at the client side or at the server 
> side. If I understood correctly the switch case you are talking will be added 
> at the server side. Now for setting 'A_AND_B' enum type in the request, 
> client side we need to have the 'chain of if conditions' no?.
>     
>     I could see few advantages of having 'if chain' at the client side, 
> server would not worry about the conditional evaluation which may be in a 
> critical section. Secondly it would be easy to define strategies against enum 
> type and pass it over different layers without worrying about how they formed.
>     
>     If we think the problem space is small and assume we have only few set of 
> flags I prefer enum. Before reaching to a conclusion I've one question about 
> the number 'n' of flags and how much value 'n' can grow ?. Only my worry is, 
> once we set the protocol to enum type, as we all know couldn't get another 
> chance to change it later as 'n' increases.


> Now for setting 'A_AND_B' enum type in the request, client side we need to 
> have the 'chain of if conditions' no?.

why client need to have the chain? client just need to set different flag value 
to indicate different type of single read. 

>  how much value 'n' can grow?

I could tell the future. but I don't see it would grow too much. A specific 
enum value would make semantic more clear.


> On April 23, 2014, 10:22 a.m., Ivan Kelly wrote:
> > bookkeeper-server/src/main/proto/BookkeeperProtocol.proto, line 33
> > <https://reviews.apache.org/r/17895/diff/2/?file=563033#file563033line33>
> >
> >     This should certainly not be an enum. Otherwise we need to bump the 
> > protocol version each time we add an error code.
> >     
> >     Imagine the scenario where both server and client are running 4.3.0. 
> > Then the server is upgraded with 4.3.1 which a new error EPRINTERONFIRE. It 
> > sends this to the client who throws a decode error.
> 
> Sijie Guo wrote:
>     how could being not enum help this? if it is integer, client still has no 
> idea how to interpret, so it is still invalid response from 4.3.0 client. I 
> thought we reached an agreement on enum on the ticket, no?
> 
> Ivan Kelly wrote:
>     So for version and operationtype, enum is ok. These originate at the 
> client, so if the servers are always upgraded at the client, there's no 
> interoperability issues. Status codes originate at the server though, so it 
> is possible for the server to send a statuscode that is unrecognised to a 
> client. The normal way to handle this would be a "else" or "default:" to pass 
> this up to the client as a BKException.UnexpectedConditionException. If it's 
> an enum, this will throw a decode exception in the netty decoder, which is 
> harder to handle.
>     
>     To resolve this on the server side, by checking the version and only 
> sending errors valid for that version, implies two things. Firstly, every 
> error code change will require the version to be bumped and secondly, that 
> there will need to be a list maintained for which errors are valid for each 
> version. This goes against the motivation for using protobuf in the first 
> place.
> 
> Sijie Guo wrote:
>     this is the application level agreement, no? it doesn't matter that u are 
> using a protobuf protocol or using current protocol, or it also doesn't 
> matter that u are using an integer or an enum. in any case, the best way is 
> as you described, you shouldn't send new status code back to an old client, 
> as the new status code is meaningless to the old client.
> 
> Ivan Kelly wrote:
>     but how do you know its an old client? Only by bumping the version number 
> each time you add an error code. In which case you end up with a whole lot of 
> junk like "if (client.version == X) { send A } else if (client.version == Y) 
> { send B } else if (client.version ..." which is exactly what protobuf was 
> designed to avoid (see "A bit of history" on 
> https://developers.google.com/protocol-buffers/docs/overview).
> 
> Sijie Guo wrote:
>     a else or default branch would make the behavior unpredictable as an old 
> client is treating a new status code as some kind of unknown. as you said, 
> you want to treat them as UnexpectedConditionException. But what does 
> UnexpectedConditionException means? doesn't it mean the sever already breaks 
> backward compatibility, since server couldn't satisfy the old client's 
> request.
>     
>     so still, if server wants to be backward compatibility to clients, in any 
> cases, it needs to know what version of protocol that the client is speaking 
> and handle them accordingly, not just let client to do their job in an 
> unexpected way.
>     
>     I don't see any elegant solutions without detecting protocol version. if 
> you have, please describe how not being enum would avoid this.
> 
> Ivan Kelly wrote:
>     the default behaviour for an unknown error code is something we already 
> use today.
>     
> https://github.com/apache/bookkeeper/blob/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java#L714
>     
>     The client only needs to know that the request failed. the point of the 
> different error codes is so that the client could take specific recovery 
> steps. the default behaviour is just to pass the error up.
> 
> Sijie Guo wrote:
>     the default behavior was there just for all already known status codes. 
> but it doesn't mean it is correct for any unknown status codes. and when u 
> are saying 'the client only needs to know that the request failed', you are 
> making an assumption that there is only one status code indicating OK, other 
> status code should be taken as failed. but it isn't true. say in an old 
> protocol, we supported range reads, it responded with OK, list of entry 
> response (0 = <data>, 1 = missing, 2 = missing, 3 = <data>). if we are going 
> to improve our protocol to make communication more efficient, we are going to 
> change the protocol to get rid of transferring missing entries: responding 
> with PARTIAL_OK, list of existing entries (0 = <data>, 3 = <data>). 
>     
>     in this case, if server doesn't distinguish the client's protocol, just 
> respond every range reads with PARTIAL_OK, would did break the compatibility 
> with old protocol, as old protocol treats it as failure by default behavior. 
> in order to maintain backward compatibility, server needs to detect the 
> client's protocol and responds accordingly. as what I said, for backward 
> compatibility, a protocol doesn't really help if it involves behavior in 
> application agreement. and a default behavior is not exact right.
>     
>     for me, using protocol. it means that it is easy for us to add new 
> requests type, add new fields in existing requests. besides that, we need to 
> keep the backward compatibility in our level.
> 
> Ivan Kelly wrote:
>     when i say that 'the client only needs to know that the request failed', 
> it means that the client only needs to recognise the statuses which indicate 
> success. for any request, the status codes which indicate success should 
> never change. For a read request or add request, only OK is SUCCESS. For 
> range read,  only PARTIAL_OK and OK are success. But the client will never 
> send a range read request unless it already knows about PARTIAL_OK.
> 
> Sijie Guo wrote:
>     > “But the client will never send a range read request unless it already 
> knows about PARTIAL_OK.” 
>     
>     the example that I made is that the range read introduced with OK, but if 
> it is going to evolved to have PARTIAL_OK. then that will be an problem.
> 
> Ivan Kelly wrote:
>     Partial_ok doesn't make sense if you're using the concept of packet 
> status and operation status (as mentioned below).
> 
> Sijie Guo wrote:
>     well, if u are talking about a specific case (e.g. range read), I might 
> agree with u. but if you read my comments, it is just the case that I 
> explained how the protocol might be evolved. And my point is and always will 
> be for backward compatibility, the server shouldn't send any unknown 
> codes/fields back to clients that speaks old protocol, this is part of 
> application-level agreement, no matter using enum or integer.
> 
> Rakesh R wrote:
>     Using enum is more readable and would be good for newcomers too. Assume a 
> case where we are introducing new EC for an existing request type at the side 
> server side, EC#NEW_ERR_CODE in 4.3.1 and client is old one 4.3.0. Ideally in 
> this case, bk client should say its unknown code.
>     
>     Does this create any parsing issues at the protobuf ?. If not, I'm OK 
> using enum for the StatusCode too.
> 
> Sijie Guo wrote:
>     if you read the conversation about this, my point is that for correct 
> backward compatibility, we should avoid sending unknown error codes back to 
> an old client.
> 
> Rakesh R wrote:
>     So server should have version specific checks and generates the error 
> codes, isn't it ?.
> 
> Sijie Guo wrote:
>     yes. exact as what I pointed. sending responses matching exact version of 
> protocol, isn't that exactly for application level backward compatible? why 
> this would be the concern?
> 
> Rakesh R wrote:
>     There is no big concern here. I'm trying to understand the current 
> approach and the future path to avoid any compatibility issues later.
> 
> Sijie Guo wrote:
>     for backward compatibility, it is comprised of wire compatibility and 
> application-semantic compatibility. using protobuf, we could easily achieve 
> wire compatibility by adding new optional fields. for application-semantic, 
> the server needs to detect what version of the client is talking, and send 
> correct version of responses according to its version. the version field in 
> the request/response structure helps achieving this.
> 
> Rakesh R wrote:
>     Maintaining version list has its own complexities, but again it depends 
> on the client side requirements. As I know protobuf parser would not get 
> decode exceptions if any unknown enum code comes from server, instead this 
> will return a null value. If this is the behavior of protobuf, then today its 
> not a blocker for this patch. Because in your patch, you have already handled 
> unknown error code in PerChannelBookieClient#statusCodeToExceptionCode() and 
> treats this code as an op failure.

version list is more for application level semantic backward compatibility. a 
clear specific version for backward compatibility would be more clear than any 
implicit solution. although it might introduce code, it is clear for 
maintenance.


- Sijie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17895/#review41130
-----------------------------------------------------------


On April 24, 2014, 7:43 a.m., Sijie Guo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17895/
> -----------------------------------------------------------
> 
> (Updated April 24, 2014, 7:43 a.m.)
> 
> 
> Review request for bookkeeper and Ivan Kelly.
> 
> 
> Bugs: BOOKKEEPER-582
>     https://issues.apache.org/jira/browse/BOOKKEEPER-582
> 
> 
> Repository: bookkeeper-git
> 
> 
> Description
> -------
> 
> - introducing protobuf support for bookkeeper
> - for server: introduce packet processor / EnDecoder for different protocol 
> supports
> - for client: change PCBC to use protobuf to send requests
> - misc changes for protobuf support
> 
> (bookie server is able for backward compatibility) 
> 
> 
> Diffs
> -----
> 
>   bookkeeper-server/pom.xml ebc1198 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/IndexInMemPageMgr.java
>  56487aa 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerChecker.java
>  28e23d6 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java
>  fb36b90 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/processor/RequestProcessor.java
>  241f369 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieProtoEncoding.java
>  1154047 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieRequestHandler.java
>  b922a82 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieRequestProcessor.java
>  8155b22 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookkeeperProtocol.java
>  PRE-CREATION 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PacketProcessorBase.java
>  PRE-CREATION 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PacketProcessorBaseV3.java
>  PRE-CREATION 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java
>  a10f7d5 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/ReadEntryProcessor.java
>  PRE-CREATION 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/ReadEntryProcessorV3.java
>  PRE-CREATION 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/WriteEntryProcessor.java
>  PRE-CREATION 
>   
> bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/WriteEntryProcessorV3.java
>  PRE-CREATION 
>   bookkeeper-server/src/main/proto/BookkeeperProtocol.proto PRE-CREATION 
>   bookkeeper-server/src/main/resources/findbugsExclude.xml 97a6156 
>   
> bookkeeper-server/src/test/java/org/apache/bookkeeper/proto/TestProtoVersions.java
>  5fcc445 
>   
> bookkeeper-server/src/test/java/org/apache/bookkeeper/replication/AuditorPeriodicCheckTest.java
>  3f8496f 
>   
> bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookieClientTest.java
>  bc05229 
>   
> bookkeeper-server/src/test/java/org/apache/bookkeeper/test/TestBackwardCompat.java
>  8376b46 
>   compat-deps/bookkeeper-server-compat-4.2.0/pom.xml PRE-CREATION 
>   compat-deps/hedwig-server-compat-4.2.0/pom.xml PRE-CREATION 
>   compat-deps/pom.xml f79582d 
>   hedwig-server/pom.xml 06cf01c 
>   
> hedwig-server/src/test/java/org/apache/hedwig/server/TestBackwardCompat.java 
> 8da109e 
> 
> Diff: https://reviews.apache.org/r/17895/diff/
> 
> 
> Testing
> -------
> 
> unit tests. backward tests.
> 
> 
> Thanks,
> 
> Sijie Guo
> 
>

Re: Review Request 17895: BOOKKEEPER-582: protobuf support for bookkeeper

Reply via email to