Steve Brewin wrote:

> I think there is a lot of merit in coming up with a new queueing mechanism
> we should explain the benefits any proposed change is seeking to achieve.

Considering the amount of available "free time", there had better be some
serious benefits, no?  ;-)

> might these be support for distributed operation

Yes. Absolutely.

> integration into other service oriented architectures?

Somewhat. Possibly not arguable as a primary goal. And we may also foster
the use of the Mailet API well beyond JAMES.

> > Concepts:
> >
> > - Each processor is a named queue entry.

Our core architecture for the mailet pipeline would be message-based,
reusing well-established patterns from distributed queuing platforms, such
as MQ, JMS and others. The use of a named queue is basically what we have
today: each processor is named. I also assert that processor names are
locally scoped references, disassociated from real resources. So if we have
a distributed scenario, we may end up with something like:

  <processor name="my-remote-processor" class="QAlias">
    <queue class="JMSQueue">
      <queuefactory>jms/myQF</queuefactory>
      <queuename>jms/myQ</queuename>
    </queue>
  </processor>

  <processor name="another-remote-processor" class="QAlias">
    <queue class="JDBCQueue">
      <datasource>jdbc/myDS</datasource>
    </queue>
  </processor>

  <processor name="mq-remote-processor" class="QAlias">
    <queue class="MQLink">
      <queuename>mQueue</queuename>
    </queue>
  </processor>

  <processor name="root">
    <queue jndi="queue/myRoot"> ... </queue>
    <mailet ...>
  </processor>

Those are just quick and not necessarily complete examples, but basically
they define local processors so that we can do a put operation locally
without having to know the network topology or the real queue resource
name; the local processor is just there to define, in one place, the queue
technology and address required. Notice how we can mix and match entirely
different queuing technologies, since the queue manager is responsible for
providing put and get operations for both the processor (consumer) and any
senders.
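To make the "put without knowing the topology" point concrete, here is a
minimal sketch of name-based addressing. All class and method names here
are hypothetical illustrations, not the actual JAMES API: the sender only
knows the locally scoped processor name, and the binding behind it could be
JMS, JDBC, MQ, or in-memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: locally scoped processor names mapped to queue
// implementations, so a put() never needs to know the network topology
// or the real resource name behind the alias.
interface MessageQueue {
    void put(String message);
}

// Trivial in-memory queue standing in for a real JMS/JDBC/MQ binding.
class InMemoryQueue implements MessageQueue {
    final List<String> entries = new ArrayList<>();
    public void put(String message) { entries.add(message); }
}

class QueueManager {
    private final Map<String, MessageQueue> processors = new HashMap<>();

    // Registration happens at configuration time, e.g. when the
    // <processor name="..."> elements above are parsed.
    void register(String processorName, MessageQueue queue) {
        processors.put(processorName, queue);
    }

    // Senders address the local name only; the bound queue technology
    // is invisible to the caller.
    void put(String processorName, String message) {
        MessageQueue q = processors.get(processorName);
        if (q == null) {
            throw new IllegalArgumentException("No such processor: " + processorName);
        }
        q.put(message);
    }
}
```

The point of the sketch is only the indirection: swapping the InMemoryQueue
for a JMS-backed implementation changes nothing for any sender.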
The QAlias class, in this case, wouldn't do much, but it would complain if
there were a pipeline actually defined here. The final example defines a
processor that is locally addressed, processes locally, but might also be
remotely addressable. For that matter, any of the QAlias examples could be
defining an alias to that processor, since we don't know from this textual
context what context.lookup("queue/myRoot") will instantiate.

And a cool thing is that in an incoming message protocol handler, we could
have:

  <queue jndi="queue/myRoot"/>

and that would define the root processor for that particular message
handler. You could have separate root processors for SMTP vs SMTPS, for
example. This begs the question of what to do about
MailetContext.sendMail() within the pipeline, which starts at the implicit
root processor. I see that as implementation-specific, and somewhat ripe
for discussion.

This is a bit primitive, albeit not dissimilar from MQSeries. We can
improve upon it, e.g., by looking up queue managers -- implementation and
all -- from JNDI as shown, but I am trying not to assume that every
implementation will have JNDI or JMS or JDBC pervasive throughout the
system.

> > - A queue entry would normally contain a JAMES Mail
> >   object.

No real change from what we have today. Just identifying the players in
the architecture.

> - Each processor [defines] a transaction.

This is a key concept. We are supposed to behave this way, but we have
failure scenarios today because we do not have transactional behavior in
JAMES. So I'm defining the transaction boundary. The processor is the
transaction. Either everything completes successfully or nothing does. In
the event of a failure, the get operation rolls back so that the message is
available to be processed again.

> - Each processor is associated with a queue manager
>   and, optionally, a retry schedule.

This takes what we had to do in RemoteDelivery, and generalizes it.
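The transaction boundary described above can be sketched as a
get/process/commit loop, where a failure rolls the get back so the message
becomes available for processing again. This is an illustrative in-memory
model, not the actual spooler code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of "the processor is the transaction": a get is
// only final once the processing commits; on failure it rolls back and
// the message is available to be processed again.
class TransactionalQueue {
    private final Deque<String> ready = new ArrayDeque<>();
    private String inFlight;             // message held by the open transaction

    void put(String message) { ready.addLast(message); }

    // Begins the transaction by taking a message (null if queue is empty).
    String get() {
        inFlight = ready.pollFirst();
        return inFlight;
    }

    // Everything completed successfully: the message is gone for good.
    void commit() { inFlight = null; }

    // Failure: put the message back at the head so it is retried.
    void rollback() {
        if (inFlight != null) {
            ready.addFirst(inFlight);
            inFlight = null;
        }
    }
}
```

A real implementation would delegate commit/rollback to the underlying
technology (a transacted JMS session, a JDBC transaction, an MQ syncpoint),
but the contract the processor sees is the same.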
For example, what happens if [clamd | DNS | spamd] is not available? We
can queue up and wait for the service to become available. Perhaps we
might want to add something to allow notification (think queue events, for
those of you who know MQ), but the real issue is that every processor can
be made more reliable.

I am a bit surprised that this is an area that Stefano asked about,
because one of the earliest messages from him that I recall was about
wanting multiple spoolers because he wanted finer-grained control over the
threads available to specific processors. Perhaps he is wondering why I
didn't express things as:

  <spoolmanager>
    <processor> ... </processor>
  </spoolmanager>

For one thing, the processor is more the mental focus for an
administrator. But in addition, the spoolmanager, at this level of
discourse, would not have multiple queues (and thus not multiple
processors), unless we did something like:

  <spoolmanager>
    <processor name="myprocessor">
      <queue binding="..."/>
      ...
    </processor>
    <processor name="anotherprocessor">
      <queue binding="..."/>
      ...
    </processor>
  </spoolmanager>

which gets us back to what I expressed. Recalling that processors are the
named targets, and therefore are what is logically attached to a queue,
and that the queue manager is the entity bridging the processor and the
queue, it seems to make the most sense to describe it as I have in the
proposal. But that's why we post these things for discussion.

And, yes, the queue manager would continue to be responsible for calling
the processor to handle each message. Each queue manager would be
registered with the MailetContext, which would be provided to the
processor in order to allow it to put messages sent to a new processor (if
we keep the current Mailet API). We might provide a suitable error or
exception on the Mail.setState call if we try to address a queue
(processor) that does not exist.
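That setState validation might look roughly like the following. The names
and the eager-check design are my own illustration, not today's Mail
implementation, which performs no such check:

```java
import java.util.Set;

// Hypothetical sketch: setState fails fast if the target state does not
// name a registered processor (queue), instead of the message being
// silently lost downstream.
class MailSketch {
    private final Set<String> knownProcessors; // names registered with the MailetContext
    private String state;

    MailSketch(Set<String> knownProcessors, String initialState) {
        this.knownProcessors = knownProcessors;
        this.state = initialState;
    }

    void setState(String newState) {
        if (!knownProcessors.contains(newState)) {
            throw new IllegalStateException("No processor registered for state: " + newState);
        }
        state = newState;
    }

    String getState() { return state; }
}
```

Whether this should be an unchecked exception, a checked one, or an error
state is exactly the kind of semantics we should decide on together.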
> > - I believe that a queue implementation independent
> >   scheduler that provides the next time at which a
> >   message should be processed may be sufficient.
> >   Each queue entry would carry a timestamp before
> >   which it should not be processed. "Restarting"
> >   the queue would be as simple as changing that
> >   timestamp entry.

We've often wanted a nice way to restart a message, and I've already
described the use of retrying for more than just RemoteDelivery. I do feel
that even though the code providing the schedule can be independent of the
queue implementation, the implementation of the query itself belongs with
the spool manager, in order to facilitate optimizing that process for the
underlying technology.

> > - A new RETRY Mail state can be set to roll back the
> >   transaction and put the Mail back into the queue.
> >   We should decide on commit and rollback semantics.
> >
> > - The processor acquires a new attribute that explicitly
> >   sets the fall-through state. The default shall be the
> >   new RETRY state, except for messages that exhaust the
> >   retry schedule.

This is just extending the current Mailet API semantics to allow the
Mailet to express the need for a scheduled retry, and it defaults to doing
a RETRY instead of a GHOST if we fall off the end of a processor, which
seems safer. Plus I made the fall-through state configurable, which seems
a nice little win. If we do express the operation differently, that's
fine, too.

> > one might implement a processor as an MDB.

Actually, that is wrong. The queue manager is responsible for taking
things off of the queue, and therefore the MDB would be part of that
package. The processor should be independent of, and reusable with, any
queue implementation.

Oh, and if you really want to have some fun, consider that except where
the current API does refer to Mail (as in Mailet API and Mail object),
nothing in the above says anything about mail.
It is just about defining queues, transactions, workflow and processing
for messages.

So this is a bit more discussion of what I have in mind, and why.

	--- Noel