Hi,
Sorry for the barrage of emails, but I hope to get help from the community. I hope the benefits of what I plan to contribute will be worth the effort you make to answer my questions. Thanks for bearing with me. :-)

I wanted to ask about API boundaries. As I mentioned in a different email thread, I think that:

> the organization of the Guice Modules is perhaps THE most important
> abstraction available to allow people to understand the system.
>
> Ideally, to help provide a better understanding of the system and its
> compile-time (and even to some extent its runtime) organization, I think it
> is important to:
>
> […]
>
> * Ensure that each Module is well-contained (i.e. no “leaks” or coupling to
> other implementations)
> --> I found this part to be quite problematic

I noticed what, to me, is a problem with API boundaries. I would like to use Cassandra as my example.

My shallow understanding of James is that it requires a module to store emails. If I understand correctly, that is the “Maildir” module, which can be implemented in many different ways (filesystem, RDB, Cassandra, in-memory…). If my understanding is correct, then I think the concept is quite simple. If the simplicity of the concept can be maintained in the code, then the system should in principle remain easy to understand.

The whole point of a framework like Guice is to separate the API from its implementations. All that James should care about is that it has a Maildir instance, and for all intents and purposes, it shouldn’t care at all about which implementation it has. That is where the resolution/wiring comes into play. With Guice, this happens to be done statically at compile time in Java code (and of course executed at runtime). Nothing fancy, but that’s quite ok as it gets the job done. I like having that configuration in code because it makes things like documentation and refactoring easier.
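To illustrate what I mean by keeping the wiring in one place, here is a minimal sketch in plain Java. It deliberately leaves Guice out so that it runs with the JDK alone, and every name in it (Maildir as an interface, InMemoryMaildir, DiscardingMaildir, MailServer, Wiring) is hypothetical and not taken from the James code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical API: the only thing the rest of the server is allowed to see.
interface Maildir {
    void store(String message);
    int count();
}

// Stand-in implementation that keeps messages in a list.
class InMemoryMaildir implements Maildir {
    private final List<String> messages = new ArrayList<>();
    public void store(String message) { messages.add(message); }
    public int count() { return messages.size(); }
}

// Second stand-in implementation, only there to show that swapping is possible.
class DiscardingMaildir implements Maildir {
    public void store(String message) { /* intentionally drops the message */ }
    public int count() { return 0; }
}

// Hypothetical consumer: depends on the Maildir API, never on an implementation.
class MailServer {
    private final Maildir maildir;
    MailServer(Maildir maildir) { this.maildir = maildir; }
    void receive(String message) { maildir.store(message); }
    int stored() { return maildir.count(); }
}

// The single place where an implementation is chosen (the "wiring").
class Wiring {
    static MailServer assemble() {
        return new MailServer(new InMemoryMaildir());
    }
}
```

Swapping implementations then means changing one line in Wiring.assemble(); MailServer does not change at all.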
Thanks to this type of DI approach (which requires clean separation of API from implementation code), it should be trivial for an application assembler to swap out an RDB implementation of a Maildir for a Cassandra implementation of a Maildir.

Now, because James is a complex system, there are actually several APIs. For instance, by inspecting the code I gather that the EventStore also requires an implementation, and one of those implementations is Cassandra.

If the API is well-designed, then it should be very easy to swap implementations:

Mailbox (interface)
    MailboxAImpl
    MailboxBImpl
    MailboxCImpl
    ...

EventStore
    EventStoreAImpl
    EventStoreBImpl
    EventStoreCImpl
    ...

If Cassandra could implement both of these, then this configuration could be possible:

Mailbox (interface)
    MailboxAImpl
    MailboxBImpl
    CassandraMailbox
    ...

EventStore
    EventStoreAImpl
    EventStoreBImpl
    CassandraEventStore
    …

However, this configuration should also be possible:

Mailbox (interface)
    CassandraJames

EventStore
    CassandraJames (yes! the same instance as above)

In other words, the “CassandraJames” Module could very well implement more than one API. There is absolutely nothing wrong with one implementation implementing multiple APIs.

Actually, ANY implementation, including the Cassandra implementation, does not necessarily need to reside in the same code base. It should ideally be packaged into a single JAR that gets dropped into the framework. That means that ideally there should be a single CassandraJames jar that is used to wire up the system. Perhaps all of the APIs will be used, but maybe not. Doesn’t matter. The point is to make the system easy to understand and easy to wire up.

I could be wrong, but in the current code base there appear to be bits of “Cassandra” code all over the place. In any case, this is just one example.
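For what it's worth, the “one implementation, several APIs” configuration can be sketched in plain Java as follows. This is only an illustration under my own assumptions: the method names on Mailbox and EventStore are made up, CassandraJames here is an in-memory stand-in that does not talk to Cassandra, and Assembly is a hypothetical stand-in for whatever does the wiring. With Guice, I would expect the equivalent to be binding both interfaces to the same singleton-scoped class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical APIs; the method names are illustrative only.
interface Mailbox {
    void deliver(String message);
    int size();
}

interface EventStore {
    void append(String event);
    List<String> events();
}

// One module implementing both APIs. In-memory stand-in, no real Cassandra access.
class CassandraJames implements Mailbox, EventStore {
    private final List<String> messages = new ArrayList<>();
    private final List<String> log = new ArrayList<>();

    public void deliver(String message) { messages.add(message); }
    public int size() { return messages.size(); }

    public void append(String event) { log.add(event); }
    public List<String> events() { return new ArrayList<>(log); }
}

// Hypothetical assembly: consumers only ever see the two interfaces.
class Assembly {
    final Mailbox mailbox;
    final EventStore eventStore;

    Assembly(Mailbox mailbox, EventStore eventStore) {
        this.mailbox = mailbox;
        this.eventStore = eventStore;
    }

    // Wires the SAME instance behind both APIs.
    static Assembly withSharedCassandra() {
        CassandraJames shared = new CassandraJames();
        return new Assembly(shared, shared);
    }
}
```

Code that receives an Assembly cannot tell, and should not need to know, whether one object or two sit behind the two interfaces.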
The point is that I get the feeling that the Modules are not clearly defined, and the implementations are leaking into different places instead of keeping cohesive code together. Or rather, the cohesiveness is not being defined along the same dimensions, which IMO does not seem quite right. The result is that the code base does not seem to match the high-level system concepts well, which makes the code very difficult to understand even if the concepts behind James are not so difficult.

So my question is: what are the thoughts behind how the APIs are designed? Is there any documentation or something that can help me understand?

Right now I can only base my statements on my impressions, and my impressions could be completely wrong.

My impression is that the important James concepts which require a well-designed API are:

* Mailbox (is this the same as “backend”??? and what about “blob”??? and “data”???)
* EventStore (is this always required???)
* Mailet
* Server
* Protocols (and the various flavours)
* Admin (implemented by CLI and API)
* Queue (or is this part of “server”???)

These do not include the external projects, which do seem to be well defined as separate components:

* Mime4j
* jSieve
* jSPF
* jDKIM

Did I miss any other important component of the system (excluding Hupa)?

Thanks!
=David
