Re: [RT] Improved matching selecting

2006-12-10 Thread Mark Lundquist


On Dec 10, 2006, at 3:29 PM, Alfred Nathaniel wrote:

> Simple enough, but only in Relax NG.  This grammar cannot be
> transformed into a DTD.


There isn't currently a sitemap DTD, is there?

—ml—



Re: [RT] Improved matching selecting

2006-12-08 Thread Mark Lundquist


Hi Alfred,

On Dec 5, 2006, at 3:21 PM, Alfred Nathaniel wrote:


> On Tue, 2006-12-05 at 07:58 -0800, Mark Lundquist wrote:
>
> > I'm not so keen on that, 'cause I'm actually trying to get away from
> > using 2 different elements for this.
                      ^^^^^^^^
N.B., it looks like you might have misread the above as "components"
instead of "elements".  I regard collapsing the matching/selecting syntax
so that it originates with a single type of element (viz., <match>) as a
positive good and a primary goal, not just an after-effect of any
implementation consideration (though I am pleased that it can also be
done with a considerable simplification and reduction of existing
code).



> > 1) The precedent of match and select would have conditioned users
> > to interpret if and choose as referring to two different kinds of
> > sitemap components (Iffers and Choosers? :-).  I'd like the syntax
> > to emphasize that it's all matching and there is only one component
> > now, The Matcher.
>
> There is no need for a one-to-one relation between sitemap tags and
> components.  (There won't be any Whener in your model either?).


Yes, well you know that and I know that :-)... what I was trying to say 
is that users expect a certain relationship between elements and 
components in the area of matching and selecting, because it's been 
that way in all the history of Cocoon, at least as far back as most 
people know (certainly true for me :-).  I think you missed my entire 
point of (1); I wasn't at all saying that I felt the use of a single 
matching component constrains me to use a single XML element in the 
sitemap.  Anyway, that was the least important consideration, I should 
probably have listed it last instead of first! :-)


> So I don't see the problem in using Matcher implementations for more
> than one tag which is not called map:match.


Yes, there is (technically) no problem in doing this.  The 
difference is that you view it as desirable, while I do not! :-)



> > 2) The sitemap language is not XSLT and has nothing to do with XSLT.
> > The only relationship is that the sitemap has to do with Cocoon and
> > Cocoon uses XSLT... big deal! :-)  Trying to imitate XSLT in the
> > sitemap in the interest of familiarity IMHO is misguided and results
> > in confusion.  Things that are different should look and feel
> > different.  [snip...]
>
> Well, why not really use XSLT syntax?


errm... see above? :-)



> <map:if test="wcmatch(uri(), '**/*.xml')">
>
> where wcmatch() and uri() are Cocoon components.


OK, I am not keen on inventing a novel syntactic form in the sitemap, 
especially for little or no benefit.  IMHO, the introduction of a novel 
syntactic form should be considered only when necessary to deliver a 
_compelling_ benefit.  Also, implementing it with this 
pseudo-expression-language syntax is more work than I want to do, but 
that's a secondary consideration.  (It would require a greater volume 
of code, which is more important than the consideration of effort but 
still secondary to the usage complexity vs. benefit tradeoff).


I also don't like how it suggests to the user that there is some kind 
of generalized expression language available, when in fact there is not 
(nor do I think there should be)... so it turns out to just be a facade 
that exists for the sake of being able to have a _less_ meaningful 
attribute name (test, as opposed to equals, path, etc.).


I'll grant that this suggestion does, in a perverse way, make the 
attribute name test itself slightly less objectionable than it is now 
in select/when, since the contents of the attribute would then read 
as a predicate.  But I think overall, it'd be a step in the wrong 
direction.



> I think we should use two different keywords because otherwise the
> content model depends on the presence of various attributes and not on
> the tagname only -- that is really confusing.


To my mind, a special element to distinguish conditionals having 
exactly 1 alternative is no more desirable than a special element to 
distinguish conditionals having exactly two alternatives, or three, or 
any other number; that is to say, not desirable at all.  In the scheme 
I've proposed, it's very easy (and to me, natural) to tell how many 
alternatives a matcher has.


best regards,
—ml—



Re: [RT] Improved matching selecting

2006-12-08 Thread Alfred Nathaniel
On Fri, 2006-12-08 at 14:07 -0800, Mark Lundquist wrote:
...
 
  I think we should use two different keywords because otherwise the
  content model depends on the presence of various attributes and not on
  the tagname only -- that is really confusing.
 
 To my mind, a special element to distinguish conditionals having 
 exactly 1 alternative is no more desirable than a special element to 
 distinguish conditionals having exactly two alternatives, or three, or 
 any other number; that is to say, not desirable at all.  In the scheme 
 I've proposed, it's very easy (and to me, natural) to tell how many 
 alternatives a matcher has.
 
 best regards,
 —ml—

Could you try to write a Relax NG schema snippet (preferably in compact
notation) for validating your proposed syntax?

Cheers, Alfred.



Re: [RT] Improved matching selecting

2006-12-08 Thread Mark Lundquist


On Dec 8, 2006, at 2:39 PM, Alfred Nathaniel wrote:


> Could you try to write a Relax NG schema snippet (preferably in compact
> notation) for validating your proposed syntax?


Sure, great idea... I'll take a swipe at that before I start any coding.

best,
—ml—



Re: [RT] Improved matching selecting

2006-12-08 Thread Mark Lundquist


On Dec 8, 2006, at 2:39 PM, Alfred Nathaniel wrote:


> Could you try to write a Relax NG schema snippet (preferably in compact
> notation) for validating your proposed syntax?


OK, I think this is it.  Just a first swipe at it, I haven't tried 
validating anything against it, I've just run it through trang and 
that's all so far.  Not much to it, huh? :-)


Cheers,
—ml—

sitemap-matcher-engine.rnc
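==========================

grammar {

start = Match

# The match styles (exact, contains, path, regexp) form a name class.
# RELAX NG cannot bind a name class to a named pattern, so the choice
# is written out inline below.

Match =
  element match {
    attribute value { text },
    (Matchspec | Alternative+)
  }

Matchspec =
  InlineMatchspec
  | Branch+

InlineMatchspec =
  attribute (exact | contains | path | regexp) { text }

Branch =
  element (exact | contains | path | regexp) { text }

Alternative =
  element when { Matchspec }

}

==========================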
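
For readers who want to experiment without a RELAX NG validator, here is a minimal sketch in Python of the rules the grammar above appears to intend. It is not Cocoon code: the function names (is_matchspec, is_valid_match) are hypothetical, text content is ignored, and the required @value attribute is taken from the grammar as posted.

```python
import xml.etree.ElementTree as ET

# Match styles named in the grammar.
MATCH_TYPES = {"exact", "contains", "path", "regexp"}


def is_matchspec(elem, ignored_attrs=frozenset()):
    """Matchspec = one inline match-style attribute, or one or more
    match-style child elements (branches). Text content is not checked."""
    attrs = set(elem.attrib) - ignored_attrs
    children = list(elem)
    inline = len(attrs) == 1 and attrs <= MATCH_TYPES and not children
    branches = (
        not attrs
        and len(children) >= 1
        and all(
            c.tag in MATCH_TYPES and not c.attrib and len(c) == 0
            for c in children
        )
    )
    return inline or branches


def is_valid_match(elem):
    """Match = <match value="..."> holding a Matchspec or one or more <when>."""
    if elem.tag != "match" or "value" not in elem.attrib:
        return False
    if is_matchspec(elem, frozenset({"value"})):
        return True
    children = list(elem)
    return (
        set(elem.attrib) == {"value"}
        and len(children) >= 1
        and all(c.tag == "when" and is_matchspec(c) for c in children)
    )
```

Note that an <otherwise> child is rejected by this sketch, because the grammar as posted does not define one.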



Re: [RT] Improved matching selecting

2006-12-05 Thread Mark Lundquist

Hi Alfred,

On Dec 4, 2006, at 12:46 PM, Alfred Nathaniel wrote:


> Or use different tags, say in resemblance to XSLT:
>
> <map:if path="...">
> ...
> </map:if>
>
> <map:choose>
>   <map:when path="...">


I'm not so keen on that, 'cause I'm actually trying to get away from 
using 2 different elements for this.  Rationale:


1) The precedent of <match> and <select> would have conditioned users
to interpret <if> and <choose> as referring to two different kinds of
sitemap components ("Iffers" and "Choosers"? :-).  I'd like the syntax
to emphasize that it's all matching and there is only one component
now, The Matcher.


2) The sitemap language is not XSLT and has nothing to do with XSLT.
The only relationship is that the sitemap has to do with Cocoon and
Cocoon uses XSLT... big deal! :-)  Trying to imitate XSLT in the
sitemap in the interest of familiarity IMHO is misguided and results in
confusion.  Things that are different should look and feel different.
For example: in XSLT <if> and <choose>, the @test clause contains a
predicate.  This is fundamentally different from the sitemap, where
the corresponding attribute contains a pattern, and the predicate
comprises some kind of (implicit or configured) match of this pattern
against a configured target value.  Now the way this is expressed in
the classic sitemap, the <select>/<when> version puts this value into
an attribute called "test" (probably, again, in deference to XSLT, and
IMHO confusing), while the <match> version puts it in an attribute
called "pattern".  But in either case, the semantics are rather
different from XSLT's, owing to the difference between predicate and
pattern.


3) I think XSLT got it wrong :-).  They should have used something like
<xsl:cond> for both, and treated @test like I treat @value in my
proposal.  An analogy between <xsl:choose> and a "switch" or "case"
statement is flawed; the correct analogy is to "if()... else if()",
again because of the distinction between predicate and pattern!
Switch/case is really like today's <map:select>!!!  "if()..."
inaugurates a conditional using the same keyword regardless of how many
alternatives there are, one or many.  That's how sitemap matching
(which has only patterns) should do it, and that's how XSLT (which has
only predicates) should have done it.  No need for two different
keywords.


cheers and thx for the feedback :-),
—ml—


Re: [RT] Improved matching selecting

2006-12-05 Thread Alfred Nathaniel
On Tue, 2006-12-05 at 07:58 -0800, Mark Lundquist wrote:

 On Dec 4, 2006, at 12:46 PM, Alfred Nathaniel wrote:
 
 Or use different tags, say in resemblance to XSLT:
 
 <map:if path="...">
 ...
 </map:if>
 
 <map:choose>
   <map:when path="...">
 
 I'm not so keen on that, 'cause I'm actually trying to get away from
 using 2 different elements for this.  Rationale:
 
 1) The precedent of match and select would have conditioned users
 to interpret if and choose as referring to two different kinds of
 sitemap components (Iffers and Choosers? :-).  I'd like the syntax
 to emphasize that it's all matching and there is only one component
 now, The Matcher.

There is no need for a one-to-one relation between sitemap tags and
components.  (There won't be any Whener in your model either?).  So I
don't see the problem in using Matcher implementations for more than one
tag which is not called map:match.

 2) The sitemap language is not XSLT and has nothing to do with XSLT.
 The only relationship is that the sitemap has to do with Cocoon and
 Cocoon uses XSLT... big deal! :-)  Trying to imitate XSLT in the
 sitemap in the interest of familiarity IMHO is misguided and results
 in confusion.  Things that are different should look and feel
 different.  For example: in XSLT if and choose, the @test clause
 contains a predicate.  This is fundamentally different then in the
 sitemap, where the corresponding attribute contains a pattern, and the
 predicate comprises some kind of (implicit or configured) match of
 this pattern against a configured target value.  Now the way this is
 expressed in the classic sitemap, the select/when version puts
 this value into an attribute called test — probably, again, in
 deference to XSLT, and IMHO confusing — while the match version puts
 it in an attribute called pattern.  But in either case, the
 semantics are rather different than XSLT owing to the difference
 between predicate and pattern.

Well, why not really use XSLT syntax?

<map:if test="wcmatch(uri(), '**/*.xml')">

where wcmatch() and uri() are Cocoon components.

 3) I think XSLT got it wrong :-).  They should have used something
 like xsl:cond for both, and treated @test like I treat @value in
 my proposal.  An analogy between xsl:choose and a switch or case
 statement is flawed, the correct analogy is to if()... else if() —
 again, because of the distinction between predicate and pattern!
 Switch/case is really like today's map:select!!! if()...
 inaugurates a conditional using the same keyword regardless of how
 many alternatives there are — one, or many.  That's how sitemap
 matching (which has only patterns) should do it, and that's how XSLT
 (which has only predicates) should have done it.  No need for two
 different keywords.

I think we should use two different keywords because otherwise the
content model depends on the presence of various attributes and not on
the tagname only -- that is really confusing.

Whether the keyword pair is match/select or if/choose or cond/switch or
something else I don't care too much.

 cheers and thx for the feedback :-),
 —ml—

Cheers, Alfred.



Re: [RT] Improved matching selecting

2006-12-04 Thread Bertrand Delacretaz

On 12/3/06, Mark Lundquist [EMAIL PROTECTED] wrote:


...So... WDYAT?..


I like your proposal, there are probably some details to iron out but
I'd like to say: go for it!

It might be a good opportunity to try building the smallest
(embeddable?) Cocoon that can run pipelines, by implementing your
proposal and plugging a minimal set of existing components into it.

-Bertrand


Re: [RT] Improved matching selecting

2006-12-04 Thread Sylvain Wallez
Mark Lundquist wrote:
 Hi gang,

snip/

 So... WDYAT?

+1. Lots of good ideas!

I even think it may be implemented in a backwards compatible way, by
switching between the two approaches depending on the existence of a
pattern attribute, and thus go in a 2.2.x release.

Sylvain

-- 
Sylvain Wallez - http://bluxte.net



Re: [RT] Improved matching selecting

2006-12-04 Thread Mark Lundquist


On Dec 4, 2006, at 1:29 AM, Sylvain Wallez wrote:


I even think it may be implemented in a backwards compatible way, by
switching between the two approaches depending on the existence of a
pattern attribute, and thus go in a 2.2.x release.


Good idea... yes, I think that would be possible.

—ml—



Re: [RT] Improved matching selecting

2006-12-04 Thread Peter Hunsberger

On 12/4/06, Sylvain Wallez [EMAIL PROTECTED] wrote:

Mark Lundquist wrote:
 Hi gang,

snip/

 So... WDYAT?

+1. Lots of good ideas!

I even think it may be implemented in a backwards compatible way, by
switching between the two approaches depending on the existence of a
pattern attribute, and thus go in a 2.2.x release.


Yes, if that is done then  +1

--
Peter Hunsberger


Re: [RT] Improved matching selecting

2006-12-04 Thread Mark Lundquist


On Dec 4, 2006, at 7:22 AM, Peter Hunsberger wrote:


On 12/4/06, Sylvain Wallez [EMAIL PROTECTED] wrote:

Mark Lundquist wrote:
 Hi gang,

snip/

 So... WDYAT?

+1. Lots of good ideas!

I even think it may be implemented in a backwards compatible way, by
switching between the two approaches depending on the existence of a
pattern attribute, and thus go in a 2.2.x release.


Yes, if that is done then  +1


The only thing that troubles me about that approach is the unintended 
consequence... if a user is trying to use a classic matcher and they 
bungle the 'pattern' attribute, the resulting error message will be 
confusing because it will be coming from the new style matcher (or 
its node builder).  It'd be nice to figure out a graceful solution for 
this case... but if not, then I guess oh, well :-)


—ml—



Re: [RT] Improved matching selecting

2006-12-04 Thread Bertrand Delacretaz

On 12/4/06, Mark Lundquist [EMAIL PROTECTED] wrote:


.. if a user is trying to use a classic matcher and they
bungle the 'pattern' attribute, the resulting error message will be
confusing because it will be coming from the new style matcher...


You could also make things more explicit and use a per-sitemap
instruction or namespace to switch between the old and new matching
engines.

Restricting a single sitemap to use either of these engines (as
opposed to allowing a mix in a single sitemap) might help avoid
confusion.

-Bertrand


Re: [RT] Improved matching selecting

2006-12-04 Thread Mark Lundquist


On Dec 3, 2006, at 1:27 PM, Mark Lundquist wrote:

> 7) There would be a globally configurable property to take the place
> of the local @value attribute.  To invoke a (non-default) configured
> instance, you use <match type="..."> just like today, but that is not
> any lighter syntactically than just using @value.  The real reason for
> this is to be able to configure a more specific default, e.g.
>
>   <matchers default="uri">
>
>     <matcher name="general" class="...">  <!-- Note: the concrete class is always the same! -->
>     </matcher>
>
>     <matcher name="uri" class="...">
>       <value>{request:URI}</value>
>     </matcher>
>     <matcher class="..."/>
>
>   </matchers>
>
> This allows you to just write
>
>   <match path="foobar/**/*">
>
> which is nearly identical to today's <match pattern="foobar/**/*">
> where <matchers default="wildcard">.  The tradeoff is that you then
> have to use <match type="general"> for anything else... but you have
> the choice if you want to do it that way.


Bah, I still had some old skool thinking goin' on there myself...

Instead of a <value> configuration element that precludes a local
@value attribute, the element should be called <default-value> and it
should just configure a default if @value is omitted.  You can still
override the default with @value.  So:

    <match path="**"/>  <!-- perfectly good... -->

    <match value="{request-param:foo}" equals="true"/>  <!-- also perfectly good, no need for @type or a different matcher instance -->
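
The override rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: resolve_target is a hypothetical helper, the attributes are passed as a plain dict, and raising an error when neither @value nor a configured default exists is my assumption, not something the proposal specifies.

```python
def resolve_target(attrs, default_value=None):
    """Pick the expression a <match> is tested against.

    attrs: attributes of the <match> element, as a dict.
    default_value: the configured <default-value>, if any.
    """
    if "value" in attrs:
        # A local @value always overrides the configured default.
        return attrs["value"]
    if default_value is None:
        # Assumption: a match with no target at all is a configuration error.
        raise ValueError("no @value given and no default-value configured")
    return default_value
```

With a configured default of {request:URI}, a bare path match is tested against the request URI, while a match carrying its own @value ignores the default.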


—ml—



Re: [RT] Improved matching selecting

2006-12-04 Thread Mark Lundquist


On Dec 4, 2006, at 7:37 AM, Bertrand Delacretaz wrote:


Restricting a single sitemap to use either of these engines (as
opposed to allowing a mix in a single sitemap) might help avoid
confusion.


I totally agree!

How about...

    <sitemap matching="classic">

This would be the default setting, to preserve backward-compatibility.  
It would also log a message to the deprecation log!


vs.:

    <sitemap matching="engine">  <!-- out w/ the old, in w/ the new :-) -->

I'm also thinking that with matching="engine", we should get rid of the 
<matchers> section in the sitemap altogether, and set up the matching 
engine component in cocoon.xconf.  WDYT?
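
A rough Python sketch of the per-sitemap switch, to make the default and the deprecation message concrete. Everything here is hypothetical: choose_engine, the logger name "deprecation", and the decision to reject unknown values are assumptions for illustration, not Cocoon behaviour.

```python
import logging

# Stand-in for the deprecation log mentioned above (name is an assumption).
deprecation_log = logging.getLogger("deprecation")


def choose_engine(sitemap_attrs):
    """Return "classic" or "engine" based on the sitemap's @matching."""
    mode = sitemap_attrs.get("matching", "classic")  # classic is the default
    if mode == "classic":
        deprecation_log.warning(
            'classic matching is deprecated; consider matching="engine"'
        )
        return "classic"
    if mode == "engine":
        return "engine"
    # Assumption: anything else is treated as a configuration error.
    raise ValueError(f"unknown matching mode: {mode!r}")
```

Because the choice is made once per sitemap, an error in a classic 'pattern' attribute can never be reported by the new engine, which is the confusion this switch is meant to avoid.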


cheers,
—ml—



Re: [RT] Improved matching selecting

2006-12-04 Thread Alfred Nathaniel
On Mon, 2006-12-04 at 08:34 -0800, Mark Lundquist wrote:
 On Dec 4, 2006, at 7:37 AM, Bertrand Delacretaz wrote:
 
  Restricting a single sitemap to use either of these engines (as
  opposed to allowing a mix in a single sitemap) might help avoid
  confusion.
 
 I totally agree!
 
 How about...
 
   <sitemap matching="classic">
 
 This would be the default setting, to preserve backward-compatibility.  
 It would also log a message to the deprecation log!
 
 vs.:
 
   <sitemap matching="engine">  <!-- out w/ the old, in w/ the new :-) -->
 
 I'm also thinking that with matching="engine", we should get rid of the 
 <matchers> section in the sitemap altogether, and set up the matching 
 engine component in cocoon.xconf.  WDYT?
 
 cheers,
 —ml—

Or use different tags, say in resemblance to XSLT:

<map:if path="...">
...
</map:if>

<map:choose>
  <map:when path="...">
...
  </map:when>
  <map:otherwise>
  </map:otherwise>
</map:choose>

map:match and map:select can then be deprecated and removed in a future
version.

Cheers, Alfred.



[RT] Improved matching selecting

2006-12-03 Thread Mark Lundquist

Hi gang,

OK, here is a proposal for a way to reformulate matching and selecting 
so that they can be expressed in a more concise and powerful way.  It 
would (a) make sitemaps cleaner and easier to understand and write, (b) 
lower the learning curve for new users, and (c) be implementable with 
far fewer classes than we have today, allowing us to lose a 
bunch of source code.


ISSUES w/ current matching & selecting
--------------------------------------

1) The semantics of matching are just a special case of selecting, and
<match> is just syntactic sugar for a <select> with a single
alternative.  So we have a whole bunch of classes (those in
o.a.c.matching) that exist only to support this syntactic sugar.


2) There are 3 styles of match used in the matchers and selectors:
literal, wildcard, and RE.  So, now we have a combinatoric explosion of

    (match targets) x (3 match styles) x (matching vs. selecting)

with a class required to implement each combination.

3) Coverage of that matrix is incomplete.  For example, there is no 
selector for the request URI.  You could use a SimpleSelector to match 
{request:URI}, but that only supports the literal style of matching; 
if you need the wildcard or RE style, you're out of luck — there are no 
wildcard selectors at all, and RE selection is provided for some 
targets (e.g., request parameter) but not all.


4) Against this proliferation of Matcher and Selector classes, 
documentation is incomplete.  The Clever User knows to look in the 
source tree to find out what all components are actually available.  
That's not a good situation.


5) We have the o.a.c.matching.modular.* matchers.  In concept, they 
represent a better way, but (a) they're undocumented; (b) as it is, 
they just add to the big pile, and (c) their sitemap configuration 
syntax is cumbersome.


6) The bulk of the core Cocoon docs date from before the introduction 
of input modules.  Input modules should actually be more prominent in 
the documentation than they are.  Too many users still don't understand 
them, only finding out about them when looking for a recipe to solve 
some specific problem.  This is partly due to input modules' lack of 
primacy in the docs, and it's also perpetuated by the glut of special 
matchers and selectors for (almost) everything.


7) Using different elements (<match> and <select>) and different
components for the two forms obscures to the new user the fact that it
really is only syntactic sugar!  That's confusing.  Matching vs.
selecting is one more thing for us to have to explain: "Selecting is
like matching, except blah blah blah".  I know it was confusing to me
when I was first learning.  It's confusing because the newbie has the
(correct) intuition that it's just syntactic sugar, but then the
existence of a whole 'nuther tree of components seems to belie that.
And then it turns out... nope, it really is just syntactic sugar!


8) The only way to express a logical "or" directly is to use RE
matching, but (a) per [3] above, RE matching is not available for all
targets; (b) users have asked for a way to do this for wildcard
matching (and I have wanted it too).  It's a reasonable request (REs
are harder to read and write) and a common use case, but we've denied
them this based on the (valid) argument that we don't want to
complicate the wildcard grammar: "we keep wildcard simple; if you
want 'or' branches then just use REs".  However, there is fortunately
an easy way to provide "or" branches in matching/selecting that does
not involve any change to the wildcard grammar!
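
To make issues 2, 3 and 8 concrete, here is a small Python sketch of how a single matcher with pluggable match styles avoids the class-per-combination matrix and gets "or" branches without touching the wildcard grammar. It is only a model, not Cocoon code: the names (STYLES, matches, wildcard_to_regex) are hypothetical, the wildcard handling is a simplification in which "*" stops at "/" and "**" does not, and whole-value matching for regexps is an assumption.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a simplified wildcard pattern into a compiled regex.

    '**' matches any characters including '/', '*' matches any
    characters except '/'. Everything else is matched literally.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


# One entry per match style; the target value comes from elsewhere
# (e.g. an input module), so no class per target is needed.
STYLES = {
    "exact": lambda value, pattern: value == pattern,
    "contains": lambda value, pattern: pattern in value,
    "path": lambda value, pattern: wildcard_to_regex(pattern).fullmatch(value) is not None,
    "regexp": lambda value, pattern: re.fullmatch(pattern, value) is not None,
}


def matches(value, branches):
    """True if any (style, pattern) branch matches: a logical 'or'."""
    return any(STYLES[style](value, pattern) for style, pattern in branches)
```

For example, matches(uri, [("path", "docs/**"), ("path", "**/*.xml")]) expresses an "or" of two wildcard patterns, with the wildcard grammar itself left unchanged.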


GOALS
-----

1) Fix all of the above! :-)

2) ...with no loss of functionality (e.g., setting the "Vary:" response 
header when necessary)...


3) ...and no negative performance impact.


PROPOSAL (the details...)
-------------------------

1) The functionality of matching would be subsumed into selection and 
be provided by the same class, since the difference is just syntactic 
sugar.  That gets rid of roughly half the classes right there, so we 
only have to refactor once instead of twice, and it takes care of one 
axis of the combinatoric explosion, see below for the other...


2) This function, which is like today's selecting, would be called
"matching" and be provided by a "Matcher" interface unchanged from that
of today (but I think it would only require one implementing class).
It would be configured and invoked in the sitemap using the <match>
element.


Example time, explanations follow :-).  Here is the equivalent of what 
today would have to be specially written in Java as 
RegexpURISelector:


    <match value="{request:URI}">
        <when regexp="foo(bar|baz)">
            ...
        </when>
        <when regexp="[^.]*\.blah">
            ...
        </when>
        <otherwise>
            ...
        </otherwise>
    </match>
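
As a sanity check on the intended semantics of the example above, here is a minimal Python sketch of first-match-wins evaluation with an <otherwise> fallback. The function name select_branch and the string labels are hypothetical, and both the document-order evaluation and the whole-value regexp matching are assumptions carried over from how today's select/when behaves.

```python
import re


def select_branch(value, alternatives, otherwise=None):
    """Return the label of the first alternative whose regexp matches.

    alternatives: list of (regexp, label) pairs in document order.
    otherwise: label returned when no alternative matches.
    """
    for pattern, label in alternatives:
        if re.fullmatch(pattern, value):
            return label
    return otherwise
```

Called with [(r"foo(bar|baz)", "first"), (r"[^.]*\.blah", "second")], the value "foobaz" selects the first branch, "x.blah" the second, and anything else falls through to the otherwise label.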

3) The match target is given 

Re: [RT] Improved matching selecting

2006-12-03 Thread Mark Lundquist


I see that this would have to wait 'til C3, according to
http://cocoon.zones.apache.org/daisy/cdocs/g2/g1/g2/1181.html

—ml—


Re: [RT] Improved matching selecting

2006-12-03 Thread Niclas Hedhman
On Monday 04 December 2006 06:02, Mark Lundquist wrote:
 I see that this would have to wait 'til C3, according to
   http://cocoon.zones.apache.org/daisy/cdocs/g2/g1/g2/1181.html

And in a future C3, could it be possible that the entire sitemap is
abstracted out,
 a. into a standalone component,
 b. with a stricter programmatic interface as its primary API,

both for re-use in other projects as well as easier changes in the future?

Anyway, I like your effort...


Cheers
Niclas