Hello Kai, Jerry and common-dev'ers,

I would like to put together a game plan for getting some of these larger security changes into branches that are manageable, reviewable and ultimately mergeable in a timely manner.
In order to even start this discussion, I think we need an inventory of the high level projects that are underway in parallel. We can then identify those that are at the point where patches can be used to seed a branch. This will give us some insight into how to break the work into phases.

Off the top of my head, I can think of the following high level efforts:

1. Pluggable Authentication and Token based SSO
2. CryptoFS for volume level encryption
3. Hive Table/Column Level Encryption (admittedly this is Hive work but it will leverage common work done in Hadoop)
4. Authorization

Now, #1 and #2 above have related Jiras and a number of patches available and are therefore early contenders for branching.

#1 has a draft for an initial iteration that was discussed in another thread, and I will attach a PDF version of the iteration-1 proposal to this mail. I propose that we converge on an initial plan based on further discussion of the attached iteration and file a Jira to represent that iteration. We can then break down the larger patches on existing Jiras to fit into the constrained scope of the agreed upon iteration and attach them to subtasks of the iteration Jira. We can then seed a Pluggable Authentication and Token based SSO branch with the related patches from H-9392, H-9534, H-9781.

Now, whether we introduce a whole central SSO service in that branch is up for discussion, but I personally think that it would violate the "keeping it small and manageable" goal. I am wondering whether a branch for security services would do well to decouple the consumers from a specific implementation that happens to be remote. Then within the Pluggable Authentication branch we can concentrate on the consumer level and local implementations.

I assume that the CryptoFS work is also intended to be done in branches, and we therefore have to consider how to leverage common code for things like key access for encryption/decryption and signing/verifying.
This sort of thing is being introduced by H-9534 as part of the Pluggable Authentication branch in support of JWT tokens. So, we will have to think through what branches are required for Crypto in the near term. Perhaps we can concentrate on those portions of crypto that will be of immediate benefit to iteration-1 and leave higher order CryptoFS stuff to another iteration?

I don't think that we want an explosion of branches at any given time. If we can limit it to specific areas, close down on the iteration and get it merged before creating a new set of branches, that would be best. Again, ease of review, test and merge is important for us.

I am curious how development across related branches like these would work though. If the service work needs to leverage work from the other branch, how do we do that easily? Can we branch a branch? Will that require both to be ready to merge at the same time? Perhaps low-level dependencies can be duplicated for some time and then consolidated later?

Anyway, specific questions:

Does the proposal to start with the attached iteration-1 draft to create an iteration Jira make sense to everyone?

Does anyone have specific suggestions regarding the best way to manage branches that should be decoupled but at the same time leverage common code?

Any other thoughts or insight?

thanks,

--larry