Re: [chain] Pipeline implementation

Alex Karasulu Fri, 17 Sep 2004 14:13:59 -0700

Hi Kris,

On Fri, 2004-09-17 at 16:23, Kris Nuttycombe wrote:
> Hi, Alex,
> 
> There is definitely some overlap here, and I think that there's good 
> potential for collaboration. Our current implementation of the pipeline 
> can definitely benefit from the per-stage thread pool configuration that 
> you've done, since right now each one of our stages is still 
> single-threaded. Adding pooling to increase concurrency within the 
> stages is something that's been on our drawing board for a while.


Cool.  We actually don't have our own thread pool but abstracted it away
so we can use any thread pool.  For testing purposes we use the commons
sandbox thread pool.  We made this API thinking it'll be embedded within
other servers or frameworks that already have their own thread pool
implementation.  We just adapt other thread pools to our ThreadPool
interface using a wrapper.  This might be something you want to do
too when you go multithreaded.  It helps you avoid carrying around
runtime dependencies on other projects.
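
Roughly, the wrapper idea looks like the sketch below.  This is only an
illustration: the single-method ThreadPool interface and the adapter
class name are assumptions made up for the example, not our actual
interface, and java.util.concurrent.Executor stands in for whatever
pool the host container provides.

```java
import java.util.concurrent.Executor;

// Hypothetical minimal pool abstraction; the real ThreadPool interface may differ.
interface ThreadPool {
    void execute(Runnable task);
}

// Adapter: wraps a pool owned by the host server or framework so that
// stages only ever depend on the ThreadPool interface.
final class ExecutorThreadPoolAdapter implements ThreadPool {
    private final Executor delegate;

    ExecutorThreadPoolAdapter(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        // Hand the task to the wrapped pool unchanged.
        delegate.execute(task);
    }
}
```

Something like new ExecutorThreadPoolAdapter(Executors.newFixedThreadPool(4))
would then plug a stock JDK pool in, and swapping pools only touches
the adapter.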

> The model that we've been using is that stages process data instead of 
> events, although one could certainly consider an event as a type of 

Ahh that's interesting.  We do the events and carry a payload.  One of
the benefits we get with using an EventObject derived event is a nice
event type hierarchy that can be used to filter when routing events. 
Also we can associate other pieces of information with the data to
control how it is processed.  One of the things we're working on in
particular where this is coming in handy is for synchronization within
the staged pipeline.  There's much we have to do here though.  I'm
toying with implementing rendezvous points for events and using other
constructs for better processing control of entire pipelines.
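
To make the event-with-payload idea concrete, here is a minimal sketch. 
The class names (PayloadEvent, InputEvent, ConfigEvent, EventTypes) are
hypothetical and only there to show how an EventObject hierarchy gives
you something to filter on.

```java
import java.util.EventObject;

// Hypothetical base event: an EventObject that also carries the data to process.
class PayloadEvent extends EventObject {
    private final Object payload;

    PayloadEvent(Object source, Object payload) {
        super(source);
        this.payload = payload;
    }

    Object getPayload() {
        return payload;
    }
}

// Subtypes form the hierarchy that routing can filter on.
class InputEvent extends PayloadEvent {
    InputEvent(Object source, Object payload) {
        super(source, payload);
    }
}

class ConfigEvent extends PayloadEvent {
    ConfigEvent(Object source, Object payload) {
        super(source, payload);
    }
}

final class EventTypes {
    private EventTypes() {}

    // True when the event is of the given type or any subtype of it.
    static boolean matches(Class<? extends EventObject> type, EventObject event) {
        return type.isInstance(event);
    }
}
```

Filtering on PayloadEvent.class catches every subtype, while filtering
on InputEvent.class narrows delivery to input events only.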

> data.  Our stages have to be aware of the  pipelines in which they 
> reside because our Stage interface defines additional "exqueue(Object 
> obj)" and "exqueue(String key, Object obj)" methods which are used to 
> enqueue an object on either a subsequent stage or a keyed pipeline 
> branch, respectively.

I had the same problem which created a high degree of coupling between
stages.  Since stages were implemented in IoC frameworks sometimes there
were complaints in complex systems where cycles were introduced.  I
started using a simple pub/sub event router/hub to decouple these
stages.  Ohhh looks like you ask about that below...

> Can you give me a little more information about how you're handling 
> routing between the stages, or where to look in the source code?

We have a service we've defined called the EventRouter along with
Subscribers and Subscriptions.  It's like the core dependency; instead
of having every stage depend on others downstream we make each stage
dependent on the EventRouter.  Basically this forms a hub and spoke like
dependency relationship between the stages and the event
broker/router/hub whatever you like to call it.  Now we can dynamically
register new Subscriptions with it to route events to different stages. 
We use the event router to handle configuration events while tying
together the pipeline as well as for the inband processing of data
flowing through the system.
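
A stripped-down sketch of the hub and spoke arrangement is below.  The
names and signatures are simplified for illustration and should be
treated as assumptions; the real interfaces are in the event package of
the SEDA source tree.

```java
import java.util.EventObject;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative only; the real Subscriber/Subscription signatures may differ.
interface Subscriber {
    void inform(EventObject event);
}

final class Subscription {
    final Class<? extends EventObject> type;
    final Subscriber subscriber;

    Subscription(Class<? extends EventObject> type, Subscriber subscriber) {
        this.type = type;
        this.subscriber = subscriber;
    }
}

// The hub: stages depend on this router instead of on each other.
final class SimpleEventRouter {
    private final List<Subscription> subscriptions = new CopyOnWriteArrayList<>();

    // Subscriptions can be registered at any time to re-route events.
    void subscribe(Class<? extends EventObject> type, Subscriber subscriber) {
        subscriptions.add(new Subscription(type, subscriber));
    }

    // Delivers the event to every subscriber registered for its type or a supertype.
    void publish(EventObject event) {
        for (Subscription s : subscriptions) {
            if (s.type.isInstance(event)) {
                s.subscriber.inform(event);
            }
        }
    }
}
```

A stage that wants decoder output, say, subscribes for that event type
and never needs a reference to the decoder stage itself.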

The router btw is synchronous.  So when an event is 'published' by a
stage, the same thread making the call to publish the event drives the
delivery of that event to subscribers interested in being informed.  I
went synchronous here because most inform methods are really just an
enqueue() operation onto another stage where the event is processed
asynchronously by another thread.  So there is no point in making event
delivery asynchronous.  Furthermore if we wanted delivery to be
asynchronous we could easily wrap the implementation to do that.  We can
even go further if we want and tag events so they can be delivered in
one of the two modes: synch/asynch.  This however was never really
necessary.
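
As a sketch of why synchronous delivery is cheap, and of how the
asynchronous wrapping could look: QueueingStage and AsyncSubscriber
below are hypothetical names, and the Subscriber interface is a
simplified stand-in for the real one.

```java
import java.util.EventObject;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified stand-in for the real Subscriber interface.
interface Subscriber {
    void inform(EventObject event);
}

// inform() only enqueues, so the publishing thread returns right away;
// the stage's own worker threads drain the queue later.
final class QueueingStage implements Subscriber {
    private final BlockingQueue<EventObject> queue = new LinkedBlockingQueue<>();

    @Override
    public void inform(EventObject event) {
        queue.offer(event);
    }

    // Called by a stage worker; returns null when nothing is pending.
    EventObject poll() {
        return queue.poll();
    }
}

// Optional decorator: pushes delivery itself onto another thread.
final class AsyncSubscriber implements Subscriber {
    private final Subscriber target;
    private final Executor executor;

    AsyncSubscriber(Subscriber target, Executor executor) {
        this.target = target;
        this.executor = executor;
    }

    @Override
    public void inform(EventObject event) {
        executor.execute(() -> target.inform(event));
    }
}
```

Wrapping is a per-subscriber choice here, so the router stays simple
and synchronous.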

Oh another thing to notice in this picture is the EnqueuePredicate which
determines if an event should be enqueued.  It's in the stage package. 
The reason I mention it here is because it's another means to
effect routing.  This construct is defined here because Matt Welsh had
defined it within a set of slides for his thesis on SEDA here:

thesis & papers: 
  http://portal.acm.org/citation.cfm?id=502057
  http://www.eecs.harvard.edu/~mdw/papers/seda-sosp01.pdf

materials from presentations:
  http://www.eecs.harvard.edu/~mdw/talks/seda-lecture-uw.pdf
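
For a feel of how an enqueue predicate affects routing, here is a
minimal sketch.  The accept(EventObject) signature and the GuardedQueue
class are assumptions for illustration, not the actual stage package
API, and the queue is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.EventObject;
import java.util.Queue;

// Hypothetical shape; the real EnqueuePredicate signature may differ.
interface EnqueuePredicate {
    boolean accept(EventObject event);
}

// A stage queue that consults the predicate before accepting an event.
final class GuardedQueue {
    private final Queue<EventObject> queue = new ArrayDeque<>();
    private final EnqueuePredicate predicate;

    GuardedQueue(EnqueuePredicate predicate) {
        this.predicate = predicate;
    }

    // Returns false when the predicate rejects the event, so the caller
    // can drop it or route it somewhere else.
    boolean enqueue(EventObject event) {
        if (!predicate.accept(event)) {
            return false;
        }
        return queue.add(event);
    }

    int size() {
        return queue.size();
    }
}
```

A predicate could just as well look at queue depth or load, which is
closer to the admission control use in the SEDA papers.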

We used a derivative of the event notifier pattern that is well
documented here:

http://www.dralasoft.com/products/eventbroker/whitepaper/

You can find the interface for the EventRouter service here:

http://svn.apache.org/viewcvs.cgi/incubator/directory/seda/trunk/src/java/org/apache/seda/event/EventRouter.java?rev=46055&root=Apache-SVN&view=auto

Here's the simple implementation of it:

http://svn.apache.org/viewcvs.cgi/incubator/directory/seda/trunk/src/java/org/apache/seda/event/DefaultEventRouter.java?rev=46058&root=Apache-SVN&view=auto

You may also want to take a look at the entire package and some of the
following key classes and interfaces to get a feel for this.  

http://svn.apache.org/viewcvs.cgi/incubator/directory/seda/trunk/src/java/org/apache/seda/event/?root=Apache-SVN

Classes/Interfaces of interest:

 o Subscriber
 o Subscription
 o Filter
 o EventRouter

 o AbstractSubscriber
 o DefaultEventRouter

Other classes are specific to our application space so you might not
want to waste your time looking at em.  BTW most of this stuff is a
literal implementation of the white paper.  

Hope we can share more knowledge to learn how to deal with this
interesting pattern together.

Cheers,
Alex

> >This pipeline stage you describe sounds very much like a SEDA stage;
> >seda stage = queue + thread pool + processing code.   Events are queued
> >from one stage to another and you can change the way routing is
> >handled.  The main difference is we're using these similar patterns to
> >build a common framework for highly concurrent inet servers.  Sounds
> >like what you're doing is more geared to generic processing pipelines.
> >
> >Kris and Craig is there any overlap here where we can possibly help each
> >other?  
> >
> >One of the biggest problems I have had to date is handling the
> >serialization of certain events.  Is this something you have a nice
> >solution for? 
> >
> >BTW Craig you spoke about getting together a SEDA package of sorts
> >extracted from the directory source.  We're actually in the process of
> >doing that.  It's a separate project all together.  It has not been
> >integrated into the site yet but the code is present here:
> >
> >http://svn.apache.org/viewcvs.cgi/incubator/directory/seda/trunk/src/java/org/apache/seda/?root=Apache-SVN
> >
> >Specifically you might want to take a look at the stage package here:
> >
> >http://svn.apache.org/viewcvs.cgi/incubator/directory/seda/trunk/src/java/org/apache/seda/stage/?root=Apache-SVN
> >
> >Alex
> >
> >
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: [EMAIL PROTECTED]
> >For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
> >  
> >


