Hunsberger, Peter wrote:

Let me be clear that I'm not looking for dynamic pipeline generation.
Cool.

The
mapping of URI to generator is well defined for everything that we want
to do. The selection of transformer is a little less so; for example, the 1
result vs. multiple results example we talked about earlier. That's still a
static sitemap, but the transformer is chosen at run time; so the
understanding of dynamic vs. static sitemap should be clear: dynamic sitemap
would mean building the sitemap every time it's run. I can't imagine anyone
who would want that?
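
For instance, something along these lines (a rough sketch; 'result-count' is
a hypothetical custom selector, and it could only look at data available
before the pipeline actually runs, e.g. a request parameter or a session
attribute left by an earlier step):

 <map:match pattern="search/results">
   <map:generate type="search" src="context://search"/>
   <map:select type="result-count">
     <map:when test="one">
       <map:transform src="stylesheets/single-result.xsl"/>
     </map:when>
     <map:otherwise>
       <map:transform src="stylesheets/result-list.xsl"/>
     </map:otherwise>
   </map:select>
   <map:serialize type="html"/>
 </map:match>
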
The ability to attach new components to pipelines at runtime has been asked for in the past and I've always been against this (and I still am).

Basically, we've got a requirement for a rules-based evaluation of context data. I don't want to code this in the sitemap language and I don't want to hard-code it in Java; I really want dynamic rules evaluation.
Look: I really don't get what you mean by this. Sorry, I'm slow sometimes: can you show me an *explicit* example of your functional needs? Otherwise I don't feel I can be of much help if we keep this level of abstraction without me understanding where you want to go.

I'm not sure I can explain this via e-mail much more than I have. Maybe this
needs some background in rule-based systems or expert systems design; I
don't know how much you may have encountered such things?
I don't have practical experience with rule-based systems, no.

Let me use an example I've given previously on the list:
sorry, I must have missed that. Thanks for taking the time to write it.

Patient privacy
rules are such that it's possible for a researcher to be doing research
using patient data and not be allowed to know the identity of the patient.
It's also possible that there is such a small patient population for a given
treatment protocol that a combination of very few searches would be needed
to uniquely identify a patient. For example, for a given protocol there
may be only a single patient born between the years 1980 and 1985 living in
Tennessee. Thus, the rules might be that we allow a search by birth date if
the user hasn't previously done a search by geography (or vice-versa). We
have to evaluate each action in the context of previous actions on the same
data. So as a researcher uses the system he builds up a trail of history
data that starts to follow him around and accumulate: he has done action
X in the distant past, action Y more recently, and now attempts action Z;
in the current context, action Z either is or is not allowed (because
actions X and Y are still considered relevant).
This history data doesn't just come from a single source; it can come from
external systems, so we really want a generic format for processing it;
XML is well suited to that.
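
Purely for illustration, the accumulated history might come out looking
something like this as XML (all element and attribute names invented):

 <history researcher="x4711">
   <action type="search" field="geography" value="Tennessee"/>
   <action type="search" field="birth-date" value="1980-1985"/>
 </history>
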
Ok.

In other words, for us, the decision on what action to take at any given
point is dynamically evaluated (like most action handlers), but the
decisions are just based on one heck of a lot of complex processing (and not
just simple form field evaluation). This processing falls into the general
CS pattern of expert systems, and more specifically rule-based expert
systems (among other things). As a result, in our particular case,
functional programming meets our needs better than most other solutions.
Cool, I trust your reasoning on this and I'm starting to understand why you want special XSLT templating where rules are stored in a database.

But I think that what you are looking for is a special transformer. It doesn't require changes to what cocoon pipelines are. At least, I don't see why it should.

We know a
functional programming model can provide this. It just so happens that we
have such capabilities at hand via XSLT, if we can feed it rule selectors
described as XML.
Sounds like the good old 'golden-hammer' antipattern to me. But I won't comment further until I understand your requirements.
Perhaps so. If it weren't so easy to get the data into XML format we might be
looking at wiring in LISP or Haskell (or whatever) processing on the data,
true

but as it sits XSLT is an obvious way to go to meet this need. I will
observe that it's currently even harder for us to find LISP or Haskell
developers than XSLT developers (we keep looking)...
again true

In other words: currently, the sitemap has access to context via URI, parameters, generators, etc. Based on this, the sitemap spits out a decision on what transform to use.
That's one way of looking at it. Another is that your functional logic could be directly included in the generator.

Well yes, and that's sort of what I am proposing. However, let me note
that, taken to the extreme, your statement is equivalent to saying that
everything can be done with a single generator.
True. One could say that the hard thing about designing a pipeline is where to stop separating the components (this is true for any cluster-oriented paradigm like OOP, COP, AOP).

Even I'm not proposing
that, though for us I'll end up reducing the number of required generators.


What I want instead is to feed an XSLT this same set
of context as XML and have the XSLT pick the subsequent transform to use.
The advantages to me are: 1) I can code in XSLT instead of the sitemap
language; 2) I can optimize the entire chain of events, since the
transform-picking XSLT can pass on the context to the next transform
(standard transform chaining); 3) I get a functional programming model
(not an advantage to some, I know).
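
To make that concrete, here is a rough sketch of the transform-picking
stylesheet (everything in it, including the shape of the 'context' input,
the 'cocoon:/rule-selectors' pipeline, and the element names, is invented
to show the idea rather than working code):

 <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- rule selectors arrive as plain XML, e.g. from a pipeline
        that reads them out of the database -->
   <xsl:variable name="rules"
       select="document('cocoon:/rule-selectors')/rules"/>

   <xsl:template match="/context">
     <xsl:choose>
       <!-- simplified: deny when any prior action touched a restricted field -->
       <xsl:when test="history/action/@field = $rules/restricted/@field">
         <denied reason="privacy"/>
       </xsl:when>
       <!-- otherwise hand the context on, untouched, to the next transform -->
       <xsl:otherwise>
         <xsl:copy-of select="."/>
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>

 </xsl:stylesheet>
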
I don't get it: you say that your requirements are so horrible that you need to keep all your rules in a database (which is a questionable sentence right there, but I don't have details to judge it). Then you say that a sitemap becomes a mess. Result: you want to write an XSLT stylesheet that uses extensions to connect to a database, to obtain dynamically generated pattern-matching rules, to transform an XML representation of your request context into directly-digestible output?

The rules don't go into the database; the rule selectors go into the
database. I don't think any extensions should be needed; the context data
will be created using standard generators and possibly aggregation, though
as we proceed we're finding that our generators inherit from each other and
aggregation isn't needed; each generator picks up what is needed
automatically. Likely, for the other cases we'll end up using composition in
the generators and eliminating aggregation in the sitemap.
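
To make "rule selectors" concrete, the database rows might come out of a
generator as something like this (a sketch, names invented, paired with the
hypothetical stylesheet above):

 <rules>
   <restricted field="birth-date" if-prior="geography"/>
   <restricted field="geography" if-prior="birth-date"/>
 </rules>
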
So, why don't you do something like

 <map:transform type="xslt" src="cocoon://..."/>

where the src is a cocoon pipeline that generates the stylesheet you need?
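
For example (a sketch; all the names are invented):

 <map:pipeline internal-only="true">
   <map:match pattern="rule-selectors.xsl">
     <map:generate src="queries/rule-selectors.xml"/>
     <!-- the SQLTransformer pulls the selector rows out of the database -->
     <map:transform type="sql">
       <map:parameter name="use-connection" value="rules-db"/>
     </map:transform>
     <!-- rules2xsl.xsl rewrites the rows into the stylesheet you want -->
     <map:transform src="stylesheets/rules2xsl.xsl"/>
     <map:serialize type="xml"/>
   </map:match>
 </map:pipeline>

and the main pipeline then points its xslt transform at
src="cocoon://rule-selectors.xsl".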

How that is going to be any better than a sitemap+flowmap is *very* hard to see from where I stand.

Better is a relative thing.
Oh, totally.

As we've sort of concluded, in some cases your
development requirements get messy no matter what way you go.  When you're
building systems for research it's often a case of picking the lesser of two
evils...
Wise sentence. Still, here you are pointing out that cocoon might require changes to its internal model, while I really don't see this: it seems that your concerns reside at a higher application level than sitemap+flowscript (if I understood correctly, of course).

Moreover, it sounds like an optimal way to kill your webapp's performance: that stylesheet becomes your bottleneck. So either you write your own XSLT engine (or extend an existing one) to be able to optimize those database-extracted rules, or you're going to have *serious* scalability issues right there.

Evaluating all these rules is going to be a performance issue no matter how
we go.  Unfortunately, the new government privacy requirements force some of
this on us (we've still got a couple of years before they all go into
effect).  Similarly, the complexity of the research environment forces some
of this on us.  That's part of the "challenge" of working in research...


In both cases, big implementation PITA.

End result? Your people might know the XSLT syntax, but one thing is to know the syntax of a language; another is to be able to read a stylesheet that includes hard-core functional programming connected to external data sources via extensions.

If you think it is going to be easier for you to find people able to read/write/maintain those hard-core stylesheets than it will be to find people who can learn the sitemap syntax and read a few lines of javascript, I think you have some thick walls to crash into in your future :)

As we've discussed, this is going to be messy either way.  If it were just
a few lines of JavaScript needed to do all the complex evaluation of the
current context data then it wouldn't be an issue.  However, coding the
processing we need would mean many thousands of lines of JavaScript.
Nowadays we've got good XSLT editors, schema validators, etc., so the job
of creating the XSLT isn't as hard as it used to be.  Functional programming
is always an issue (sigh)...

So far this is all working; the XSLTs keep getting smaller as we generalize
things out and discover new generic processing patterns.  To me that's a key
sign that we are on the right track (in the past I've also seen C++ code
get smaller as it gets generalized while gaining more function).

<small snip/>

However, I could also see
how there might be situations where the serialization decision might be part of the new thingy, and thus the blocks discussion and the question of how to hand off service calls become relevant.
It is *NOT* a transformer's decision to drive the serialization process. It's against both SoC and IoC! There is nothing planned for Cocoon Blocks that will allow this to happen, and as soon as I have to vote around here, you'll get my -1 on anything that makes it possible for one pipeline component to dynamically modify the pipeline execution, including choosing a serializer.

I'm not asking for the transformer to drive the serialization decision; we
definitely want to separate those decisions!  What I was saying is that if
you have a generalized way of extending the sitemap then the decision on
where to plug in the serialization becomes an issue.
I would be against a generalized way of extending the sitemap. I want people to build consensus on this list, not route around it with pluggable extensions.

This is more a community thing than a technical issue, but look at the mess that Avalon became after allowing people to diverge without creating consensus :/

It's the resources
discussion: a resource might do generation and transformation, or it might
do transformation and serialization, or whatever. In my case, the question
of whether blocks will allow this doesn't matter a whole lot, since for the
most part I think we can behave mostly as a pure transformer. However, I
could possibly see a case where I want our new thingy to behave more like a
generic resource and take over more of the otherwise standard sitemap
processing.
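
For instance, standard sitemap resources can already bundle the tail end of
a pipeline like this (a sketch; all names invented):

 <map:resources>
   <map:resource name="render">
     <!-- the resource owns both the transformation and the serialization decision -->
     <map:transform src="{stylesheet}"/>
     <map:serialize type="html"/>
   </map:resource>
 </map:resources>

 <map:match pattern="report/*">
   <map:generate src="reports/{1}.xml"/>
   <map:call resource="render">
     <map:parameter name="stylesheet" value="stylesheets/report.xsl"/>
   </map:call>
 </map:match>
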

SoC shouldn't mean that the only place you can separate the transformation
and serialization decisions is in the master sitemap, some other
component/block might also have a good way of separating these decisions and
handling them...
Great, but let's try not to mix concerns: you started by saying that you have a 'different approach' to resource production than sitemap+flowscript, using XSLT.

I still fail to see how.

What you are presenting above is a very complex way to transform your data. I don't see how you can manage flow with XSLT.

Don't get me wrong, I'm not criticizing, I'm trying to understand if your functional requirements are something that the S+F (sitemap+flowscript) cannot cope with.

And even the most complex XSLT-based transformation stage is something that S+F are perfectly capable of handling (or, at least, I have failed to see a reason why not).

Anyway, I really don't see how you are going to do flow description with XSLT. Do you have a code snippet to show your point? That would be helpful.

--
Stefano Mazzocchi <[EMAIL PROTECTED]>
--------------------------------------------------------------------


