Devs,

We need to be doing a better job with implementing new features in a modular 
and plugable way. We have had discussions in the past but we haven’t been the 
best at sticking to it. Most recently we began work on a modified version of 
WAN that supports transactional batching. Rather than implement it in a 
plugable model we modified the existing implementation with a new property that 
changes the internal behavior of the implementation. This is a clear smell that 
what we have should be plugable and modularized. I am not suggesting that we 
run out and define clear public SPIs for everything or come up with some 
complicated plan to re-architect everything into modules but rather that we 
take small steps. I will argue that when we are adding functionality to a core 
service that is the point we should consider steps to break it up into clear 
module components. Think to yourself, what would it take for me to implement 
this new functionality as its own module, meaning its own jar and Gradle 
sub-project. As you begin to develop the solution you may find you need some 
clean interfaces for it to extend from the core, that might be the start of an 
internal SPI. You may find that some concrete classes you would have normally 
modified just need to be extended with a few protected methods to implement 
alternative logic.

I think we should focus efforts on extracting an interface to plug in different 
WAN gateway implementations so that existing implementations aren’t modified 
with new behavior. With proper interface extraction we can more easily unit 
test around WAN gateways. By keeping implementations small we can more easily 
test them in isolation. Making them all plugable allows distributions of Geode 
to deliver specific implementations they would like to support without 
impacting the existing implementations. It also frees Geode to release new 
experimental or beta implementations of WAN gateways without impacting the 
existing implementations rather than delaying releases waiting for modified WAN 
gateways to be production ready and fully tested. 

In looking at the geode-wan module one might notice that it was already 
designed to be plugable. Unfortunately it isn’t that easy. This SPI was 
originally there to provide a way for Geode to be donated initially without WAN 
support. It turns out that most of the core to WAN is actually still in 
geode-core and only some of the “secret sauce” was moved into geode-wan. The 
bulk of the geode-core code for WAN is actually in support of the region queues 
for WAN and AEQ, so moving it into geod-wan would have cut off AEQ. While it 
would be possible to utilize this SPI for providing alternative gateway 
implementations it is very high level,so much of the alternative 
implementations would be duplications, and it doesn’t allow for both 
implementations to sit side by side at runtime. I would actually suggest we 
eliminate this public SPI in favor of just the geode-wan core module that it is 
and eventually migrate the region queue code into its own module as well, but 
these are for another day.

Looking closer at the WAN gateways themselves there is mostly a pluggable 
interface already there with the existing interfaces. I spent a little time 
pulling apart an internal SPI and it was quite easy. With a small modification 
to gfsh and cache xml to specify that alternative implementation by name is all 
that needs to be done immediately to configure an alternative. Without 
extracting too many of the common implementation out into its own module just a 
few of the classes in geode-core can be modified to provide empty 
implementation of key overridable plug-in points for the TX batching 
implementation. The result is that the TX batching module can be added to the 
classpath and be configured as a gateway sender without injecting any of the TX 
batching specific code into geode-core or geode-wan. Deeper details can be 
found at the end of this discussion.

This solution isn’t perfect but it is a step in the right direction. The more 
we continue to extract and extend the closer we will get to a true pluggable 
architecture. The key to being successful will be to keep the changes to the 
core components small and the interfaces between modules internal. If we all 
make an effort towards this, both in our code and in our reviews of others 
code, we can keep the efforts small and manageable.

I would love feedback and thoughts on this suggestion.

-Jake


Details on the POC

Before you dig in at the links below to the whole deal you should probably read 
the narrative below. Since this was a POC the exploration into already modified 
code, it took a few different turns, which may make the diff hard to read as a 
whole. Nonetheless here are the links.
All the POC work can be found at: 
https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi
A massive diff can be viewed at: 
https://github.com/apache/geode/compare/develop...pivotal-jbarrett:wip/wan-spi

https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi/geode-wan-spi/src/main/java/org/apache/geode/cache/wan/internal/spi
The key component to all this is the internal SPI, which makes this thing 
discoverable and the gateway senders constructable. The main SPI is the 
GatwaySenderTypeFactory service interface. Implementations are all discovered 
using Java’s service loader pattern. The implementation is responsible for 
constructing the alternative GatewaySender implementations. The GatewaySender 
interface just extends the InternalGatewaySender interface right now. The goal 
would be to eventually flip that relationship, but remember this is baby steps 
and a POC.

https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi/geode-wan-default/src/main
As an example, the current default implementation is extracted out into 
geode-wan-default, which is discovered just the same as an alternative 
implementation, via 
MATA-INF/services/org.apache.geode.cache.wan.internal.spi.GatewaySenderTypeFactory.

https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi/geode-wan-txbatch/src
The TX batch, for lack of a better short name, is in geode-wan-txbatch. It 
largely extends the implementations of the geode-wan-default and geode-core, 
though those commonalities could be extracted into a common module later. To 
extend the defaults though a few protected methods were added to some of the 
classes to allow for extended behavior.

https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi/geode-wan-txbatch/src/main/java/org/apache/geode/cache/wan/internal/txbatch/parallel
Focusing on the TxBatchParallelGatewaySender, the implementation extends the 
default ParallelGatwaySenderImpl and overrides the new createEventProcessor 
method to inject its new 
TxBatchRemoteConcurrentParallelGatewaySenderEventProcessor. The 
TxBatchRemoteConcurrentParallelGatewaySenderEventProcessor extends the default 
RemoteConcurrentParallelGatewaySenderEventProcessor and overrides the new 
createProcessors method to inject its new 
TxBatchRemoteParallelGatewaySenderEventProcessor. The 
TxBatchRemoteParallelGatewaySenderEventProcessor extends the default 
RemoteParallelGatewaySenderEventProcessor and extends the new 
createParallelGatewaySenderQueue and createGatewaySenderEvent methods. The 
createGatewaySenderEvent method allows this implementation to add metadata 
specific to TX batching to the GatewaySenderEventImpl, like the id and last 
event flags. The createParallelGatewaySenderQueue method allows it to inject 
its new TxBatchParallelGatewaySenderQueue implementation. The 
TxBatchParallelGatewaySenderQueue extends the default 
ParallelGatewaySenderQueue but overrides the new postProcessBatch method. The 
postProcessBatch is used to group the batches by transaction, which was the 
core change made originally in geode-core’s ParallelGatewaySenderQueue class. 

The gfsh and cache xml configuration changes are largely the same. It 
deprecates the original implementations event batching flag, as well as the 
parallel flag, in favor of a new type property. The type can either be the 
fully qualified class name or a well known short name for the implementations 
discovered through the service loader. For example, where you would previously 
have configured a parallel gateway sender as “create gateway-sender … 
--parallel …” you would know execute “create gateway-sender … 
--type=ParallelGatewaySender …”. For TX batching in parallel you would specify 
“create gateway-sender … --type=TxBatchParallelGatewaySender …”. We could 
consider more friendly short names too, like simple “serial” and “parallel” for 
default implementations and “tx-batch-parallel” and “tx-batch-serial” for the 
TX batching versions. 

If anyone wants a more detailed live walkthrough I would be happy to arrange 
one or two. 

Reply via email to