Devs, We need to be doing a better job with implementing new features in a modular and plugable way. We have had discussions in the past but we haven’t been the best at sticking to it. Most recently we began work on a modified version of WAN that supports transactional batching. Rather than implement it in a plugable model we modified the existing implementation with a new property that changes the internal behavior of the implementation. This is a clear smell that what we have should be plugable and modularized. I am not suggesting that we run out and define clear public SPIs for everything or come up with some complicated plan to re-architect everything into modules but rather that we take small steps. I will argue that when we are adding functionality to a core service that is the point we should consider steps to break it up into clear module components. Think to yourself, what would it take for me to implement this new functionality as its own module, meaning its own jar and Gradle sub-project. As you begin to develop the solution you may find you need some clean interfaces for it to extend from the core, that might be the start of an internal SPI. You may find that some concrete classes you would have normally modified just need to be extended with a few protected methods to implement alternative logic.
I think we should focus efforts on extracting an interface to plug in different WAN gateway implementations so that existing implementations aren’t modified with new behavior. With proper interface extraction we can more easily unit test around WAN gateways. By keeping implementations small we can more easily test them in isolation. Making them all plugable allows distributions of Geode to deliver specific implementations they would like to support without impacting the existing implementations. It also frees Geode to release new experimental or beta implementations of WAN gateways without impacting the existing implementations rather than delaying releases waiting for modified WAN gateways to be production ready and fully tested. In looking at the geode-wan module one might notice that it was already designed to be plugable. Unfortunately it isn’t that easy. This SPI was originally there to provide a way for Geode to be donated initially without WAN support. It turns out that most of the core to WAN is actually still in geode-core and only some of the “secret sauce” was moved into geode-wan. The bulk of the geode-core code for WAN is actually in support of the region queues for WAN and AEQ, so moving it into geod-wan would have cut off AEQ. While it would be possible to utilize this SPI for providing alternative gateway implementations it is very high level,so much of the alternative implementations would be duplications, and it doesn’t allow for both implementations to sit side by side at runtime. I would actually suggest we eliminate this public SPI in favor of just the geode-wan core module that it is and eventually migrate the region queue code into its own module as well, but these are for another day. Looking closer at the WAN gateways themselves there is mostly a pluggable interface already there with the existing interfaces. I spent a little time pulling apart an internal SPI and it was quite easy. With a small modification to gfsh and cache xml to specify that alternative implementation by name is all that needs to be done immediately to configure an alternative. Without extracting too many of the common implementation out into its own module just a few of the classes in geode-core can be modified to provide empty implementation of key overridable plug-in points for the TX batching implementation. The result is that the TX batching module can be added to the classpath and be configured as a gateway sender without injecting any of the TX batching specific code into geode-core or geode-wan. Deeper details can be found at the end of this discussion. This solution isn’t perfect but it is a step in the right direction. The more we continue to extract and extend the closer we will get to a true pluggable architecture. The key to being successful will be to keep the changes to the core components small and the interfaces between modules internal. If we all make an effort towards this, both in our code and in our reviews of others code, we can keep the efforts small and manageable. I would love feedback and thoughts on this suggestion. -Jake Details on the POC Before you dig in at the links below to the whole deal you should probably read the narrative below. Since this was a POC the exploration into already modified code, it took a few different turns, which may make the diff hard to read as a whole. Nonetheless here are the links. All the POC work can be found at: https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi A massive diff can be viewed at: https://github.com/apache/geode/compare/develop...pivotal-jbarrett:wip/wan-spi https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi/geode-wan-spi/src/main/java/org/apache/geode/cache/wan/internal/spi The key component to all this is the internal SPI, which makes this thing discoverable and the gateway senders constructable. The main SPI is the GatwaySenderTypeFactory service interface. Implementations are all discovered using Java’s service loader pattern. The implementation is responsible for constructing the alternative GatewaySender implementations. The GatewaySender interface just extends the InternalGatewaySender interface right now. The goal would be to eventually flip that relationship, but remember this is baby steps and a POC. https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi/geode-wan-default/src/main As an example, the current default implementation is extracted out into geode-wan-default, which is discovered just the same as an alternative implementation, via MATA-INF/services/org.apache.geode.cache.wan.internal.spi.GatewaySenderTypeFactory. https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi/geode-wan-txbatch/src The TX batch, for lack of a better short name, is in geode-wan-txbatch. It largely extends the implementations of the geode-wan-default and geode-core, though those commonalities could be extracted into a common module later. To extend the defaults though a few protected methods were added to some of the classes to allow for extended behavior. https://github.com/pivotal-jbarrett/geode/tree/wip/wan-spi/geode-wan-txbatch/src/main/java/org/apache/geode/cache/wan/internal/txbatch/parallel Focusing on the TxBatchParallelGatewaySender, the implementation extends the default ParallelGatwaySenderImpl and overrides the new createEventProcessor method to inject its new TxBatchRemoteConcurrentParallelGatewaySenderEventProcessor. The TxBatchRemoteConcurrentParallelGatewaySenderEventProcessor extends the default RemoteConcurrentParallelGatewaySenderEventProcessor and overrides the new createProcessors method to inject its new TxBatchRemoteParallelGatewaySenderEventProcessor. The TxBatchRemoteParallelGatewaySenderEventProcessor extends the default RemoteParallelGatewaySenderEventProcessor and extends the new createParallelGatewaySenderQueue and createGatewaySenderEvent methods. The createGatewaySenderEvent method allows this implementation to add metadata specific to TX batching to the GatewaySenderEventImpl, like the id and last event flags. The createParallelGatewaySenderQueue method allows it to inject its new TxBatchParallelGatewaySenderQueue implementation. The TxBatchParallelGatewaySenderQueue extends the default ParallelGatewaySenderQueue but overrides the new postProcessBatch method. The postProcessBatch is used to group the batches by transaction, which was the core change made originally in geode-core’s ParallelGatewaySenderQueue class. The gfsh and cache xml configuration changes are largely the same. It deprecates the original implementations event batching flag, as well as the parallel flag, in favor of a new type property. The type can either be the fully qualified class name or a well known short name for the implementations discovered through the service loader. For example, where you would previously have configured a parallel gateway sender as “create gateway-sender … --parallel …” you would know execute “create gateway-sender … --type=ParallelGatewaySender …”. For TX batching in parallel you would specify “create gateway-sender … --type=TxBatchParallelGatewaySender …”. We could consider more friendly short names too, like simple “serial” and “parallel” for default implementations and “tx-batch-parallel” and “tx-batch-serial” for the TX batching versions. If anyone wants a more detailed live walkthrough I would be happy to arrange one or two.