Re: [RT] Improved matching selecting
On Dec 10, 2006, at 3:29 PM, Alfred Nathaniel wrote:

> Simple enough, but only in Relax NG. This grammar cannot be transformed into a DTD.

There isn't currently a sitemap DTD, is there?

—ml—
Re: [RT] Improved matching selecting
Hi Alfred,

On Dec 5, 2006, at 3:21 PM, Alfred Nathaniel wrote:

> On Tue, 2006-12-05 at 07:58 -0800, Mark Lundquist wrote:
>
>> I'm not so keen on that, 'cause I'm actually trying to get away from using 2 different elements for this.

N.B., it looks like you might have misread "elements" above as "components". I regard collapsing the matching/selecting syntax so that it originates with a single type of element (viz., match) as a positive good and a primary goal, not just an after-effect of any implementation consideration (though I am pleased that it can also be done with a considerable simplification and reduction of existing code).

>> 1) The precedent of match and select would have conditioned users to interpret if and choose as referring to two different kinds of sitemap components (Iffers and Choosers? :-). I'd like the syntax to emphasize that it's all matching and there is only one component now, The Matcher.
>
> There is no need for a one-to-one relation between sitemap tags and components. (There won't be any Whener in your model either?).

Yes, well you know that and I know that :-)... what I was trying to say is that users expect a certain relationship between elements and components in the area of matching and selecting, because it's been that way in all the history of Cocoon, at least as far back as most people know (certainly true for me :-). I think you missed my entire point of (1); I wasn't at all saying that I felt the use of a single matching component constrains me to use a single XML element in the sitemap. Anyway, that was the least important consideration, I should probably have listed it last instead of first! :-)

> So I don't see the problem in using Matcher implementations for more than one tag which is not called map:match.

Yes, there is (technically) no problem in doing this. The difference is that you view it as desirable, while I do not! :-)

>> 2) The sitemap language is not XSLT and has nothing to do with XSLT. The only relationship is that the sitemap has to do with Cocoon and Cocoon uses XSLT... big deal! :-) Trying to imitate XSLT in the sitemap in the interest of familiarity IMHO is misguided and results in confusion. Things that are different should look and feel different. [snip...]
>
> Well, why not really use XSLT syntax?

errm... see above? :-)

>     <map:if test="wcmatch(uri(), '**/*.xml')">
>
> where wcmatch() and uri() are Cocoon components.

OK, I am not keen on inventing a novel syntactic form in the sitemap, especially for little or no benefit. IMHO, the introduction of a novel syntactic form should be considered only when necessary to deliver a _compelling_ benefit.

Also, implementing it with this pseudo-expression-language syntax is more work than I want to do, but that's a secondary consideration. (It would require a greater volume of code, which is more important than the consideration of effort but still secondary to the usage complexity vs. benefit tradeoff).

I also don't like how it suggests to the user that there is some kind of generalized expression language available, when in fact there is not (nor do I think there should be)... so it turns out to just be a facade that exists for the sake of being able to have a _less_ meaningful attribute name (test, as opposed to equals, path, etc.).

I'll grant that this suggestion does, in a perverse way, make the attribute name test itself slightly less objectionable than it is now in select/when, since the contents of the attribute would then read as a predicate. But I think overall, it'd be a step in the wrong direction.

> I think we should use two different keywords because otherwise the content model depends on the presence of various attributes and not on the tagname only -- that is really confusing.

To my mind, a special element to distinguish conditionals having exactly one alternative is no more desirable than a special element to distinguish conditionals having exactly two alternatives, or three, or any other number; that is to say, not desirable at all. In the scheme I've proposed, it's very easy (and to me, natural) to tell how many alternatives a matcher has.

best regards,
—ml—
Re: [RT] Improved matching selecting
On Fri, 2006-12-08 at 14:07 -0800, Mark Lundquist wrote:

> ...
>
>> I think we should use two different keywords because otherwise the content model depends on the presence of various attributes and not on the tagname only -- that is really confusing.
>
> To my mind, a special element to distinguish conditionals having exactly 1 alternative is no more desirable than a special element to distinguish conditionals having exactly two alternatives, or three, or any other number; that is to say, not desirable at all. In the scheme I've proposed, it's very easy (and to me, natural) to tell how many alternatives a matcher has.
>
> best regards,
> —ml—

Could you try to write a Relax NG schema snippet (preferably in compact notation) for validating your proposed syntax?

Cheers, Alfred.
Re: [RT] Improved matching selecting
On Dec 8, 2006, at 2:39 PM, Alfred Nathaniel wrote:

> Could you try to write a Relax NG schema snippet (preferably in compact notation) for validating your proposed syntax?

Sure, great idea... I'll take a swipe at that before I start any coding.

best,
—ml—
Re: [RT] Improved matching selecting
On Dec 8, 2006, at 2:39 PM, Alfred Nathaniel wrote:

> Could you try to write a Relax NG schema snippet (preferably in compact notation) for validating your proposed syntax?

OK, I think this is it. Just a first swipe at it: I haven't tried validating anything against it, I've just run it through trang and that's all so far. Not much to it, huh? :-)

Cheers,
—ml—

sitemap-matcher-engine.rnc
===
grammar {
  start = Match

  MatchType = exact | contains | path | regexp

  Match = element match {
    attribute value { text }
    ( Matchspec | Alternative + )
  }

  Matchspec = InlineMatchspec | Branch +

  InlineMatchspec = attribute MatchType { text }

  Branch = element MatchType { text }

  Alternative = element when { Matchspec }
}
===
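As a rough illustration of the content model the schema above seems to describe, here is a minimal Python sketch. It rests on several assumptions that are not in the original post: that `attribute MatchType` and `element MatchType` are meant as "an attribute or element named exact, contains, path or regexp", that elements are un-namespaced, and that any other children (the pipeline content a real sitemap would carry) are simply ignored. The function names `is_matchspec` and `is_valid_match` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The four match styles named by MatchType in the posted schema.
MATCH_TYPES = {"exact", "contains", "path", "regexp"}


def is_matchspec(elem):
    """True if elem carries exactly one inline match attribute, or one or
    more branch child elements, but not both."""
    inline = [name for name in elem.attrib if name in MATCH_TYPES]
    branches = [child for child in elem if child.tag in MATCH_TYPES]
    if inline and branches:
        return False
    return len(inline) == 1 or len(branches) >= 1


def is_valid_match(elem):
    """Check a <match> element against the content model sketched above."""
    # The schema as posted requires the value attribute.
    if elem.tag != "match" or "value" not in elem.attrib:
        return False
    alternatives = [child for child in elem if child.tag == "when"]
    if not alternatives:
        return is_matchspec(elem)
    # With <when> alternatives, the match spec lives on each <when>,
    # not on the <match> element itself.
    has_own_spec = any(name in MATCH_TYPES for name in elem.attrib) or any(
        child.tag in MATCH_TYPES for child in elem
    )
    return not has_own_spec and all(is_matchspec(w) for w in alternatives)
```

Under this reading, the number of alternatives is visible from the children of the match element alone: either the element carries its own match spec, or it contains one or more when elements that each carry one.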
Re: [RT] Improved matching selecting
Hi Alfred,

On Dec 4, 2006, at 12:46 PM, Alfred Nathaniel wrote:

> Or use different tags, say in resemblance to XSLT:
>
>     <map:if path="..."> ... </map:if>
>
>     <map:choose>
>       <map:when path="...">

I'm not so keen on that, 'cause I'm actually trying to get away from using 2 different elements for this. Rationale:

1) The precedent of match and select would have conditioned users to interpret if and choose as referring to two different kinds of sitemap components (Iffers and Choosers? :-). I'd like the syntax to emphasize that it's all matching and there is only one component now, The Matcher.

2) The sitemap language is not XSLT and has nothing to do with XSLT. The only relationship is that the sitemap has to do with Cocoon and Cocoon uses XSLT... big deal! :-) Trying to imitate XSLT in the sitemap in the interest of familiarity IMHO is misguided and results in confusion. Things that are different should look and feel different.

For example: in XSLT if and choose, the @test clause contains a predicate. This is fundamentally different than in the sitemap, where the corresponding attribute contains a pattern, and the predicate comprises some kind of (implicit or configured) match of this pattern against a configured target value. Now the way this is expressed in the classic sitemap, the select/when version puts this value into an attribute called test (probably, again, in deference to XSLT, and IMHO confusing) while the match version puts it in an attribute called pattern. But in either case, the semantics are rather different than XSLT owing to the difference between predicate and pattern.

3) I think XSLT got it wrong :-). They should have used something like xsl:cond for both, and treated @test like I treat @value in my proposal. An analogy between xsl:choose and a switch or case statement is flawed; the correct analogy is to if()... else if(), again because of the distinction between predicate and pattern! Switch/case is really like today's map:select!!! if()... inaugurates a conditional using the same keyword regardless of how many alternatives there are: one, or many. That's how sitemap matching (which has only patterns) should do it, and that's how XSLT (which has only predicates) should have done it. No need for two different keywords.

cheers and thx for the feedback :-),
—ml—
Re: [RT] Improved matching selecting
On Tue, 2006-12-05 at 07:58 -0800, Mark Lundquist wrote:

> On Dec 4, 2006, at 12:46 PM, Alfred Nathaniel wrote:
>
>> Or use different tags, say in resemblance to XSLT:
>>
>>     <map:if path="..."> ... </map:if>
>>
>>     <map:choose>
>>       <map:when path="...">
>
> I'm not so keen on that, 'cause I'm actually trying to get away from using 2 different elements for this. Rationale:
>
> 1) The precedent of match and select would have conditioned users to interpret if and choose as referring to two different kinds of sitemap components (Iffers and Choosers? :-). I'd like the syntax to emphasize that it's all matching and there is only one component now, The Matcher.

There is no need for a one-to-one relation between sitemap tags and components. (There won't be any Whener in your model either?). So I don't see the problem in using Matcher implementations for more than one tag which is not called map:match.

> 2) The sitemap language is not XSLT and has nothing to do with XSLT. The only relationship is that the sitemap has to do with Cocoon and Cocoon uses XSLT... big deal! :-) Trying to imitate XSLT in the sitemap in the interest of familiarity IMHO is misguided and results in confusion. Things that are different should look and feel different.
>
> For example: in XSLT if and choose, the @test clause contains a predicate. This is fundamentally different than in the sitemap, where the corresponding attribute contains a pattern, and the predicate comprises some kind of (implicit or configured) match of this pattern against a configured target value. Now the way this is expressed in the classic sitemap, the select/when version puts this value into an attribute called test (probably, again, in deference to XSLT, and IMHO confusing) while the match version puts it in an attribute called pattern. But in either case, the semantics are rather different than XSLT owing to the difference between predicate and pattern.

Well, why not really use XSLT syntax?

    <map:if test="wcmatch(uri(), '**/*.xml')">

where wcmatch() and uri() are Cocoon components.

> 3) I think XSLT got it wrong :-). They should have used something like xsl:cond for both, and treated @test like I treat @value in my proposal. An analogy between xsl:choose and a switch or case statement is flawed; the correct analogy is to if()... else if(), again because of the distinction between predicate and pattern! Switch/case is really like today's map:select!!! if()... inaugurates a conditional using the same keyword regardless of how many alternatives there are: one, or many. That's how sitemap matching (which has only patterns) should do it, and that's how XSLT (which has only predicates) should have done it. No need for two different keywords.

I think we should use two different keywords because otherwise the content model depends on the presence of various attributes and not on the tagname only -- that is really confusing. Whether the keyword pair is match/select or if/choose or cond/switch or something else I don't care too much.

> cheers and thx for the feedback :-),
> —ml—

Cheers, Alfred.
Re: [RT] Improved matching selecting
On 12/3/06, Mark Lundquist wrote:

> ...So... WDYAT?..

I like your proposal, there are probably some details to iron out but I'd like to say: go for it!

It might be a good opportunity to try building the smallest (embeddable?) Cocoon that can run pipelines, by implementing your proposal and plugging a minimal set of existing components into it.

-Bertrand
Re: [RT] Improved matching selecting
Mark Lundquist wrote:

> Hi gang,
>
> <snip/>
>
> So... WDYAT?

+1. Lots of good ideas!

I even think it may be implemented in a backwards compatible way, by switching between the two approaches depending on the existence of a pattern attribute, and thus go in a 2.2.x release.

Sylvain

--
Sylvain Wallez - http://bluxte.net
Re: [RT] Improved matching selecting
On Dec 4, 2006, at 1:29 AM, Sylvain Wallez wrote:

> I even think it may be implemented in a backwards compatible way, by switching between the two approaches depending on the existence of a pattern attribute, and thus go in a 2.2.x release.

Good idea... yes, I think that would be possible.

—ml—
Re: [RT] Improved matching selecting
On 12/4/06, Sylvain Wallez wrote:

> Mark Lundquist wrote:
>
>> Hi gang,
>>
>> <snip/>
>>
>> So... WDYAT?
>
> +1. Lots of good ideas!
>
> I even think it may be implemented in a backwards compatible way, by switching between the two approaches depending on the existence of a pattern attribute, and thus go in a 2.2.x release.

Yes, if that is done then +1

--
Peter Hunsberger
Re: [RT] Improved matching selecting
On Dec 4, 2006, at 7:22 AM, Peter Hunsberger wrote:

> On 12/4/06, Sylvain Wallez wrote:
>
>> Mark Lundquist wrote:
>>
>>> Hi gang,
>>>
>>> <snip/>
>>>
>>> So... WDYAT?
>>
>> +1. Lots of good ideas!
>>
>> I even think it may be implemented in a backwards compatible way, by switching between the two approaches depending on the existence of a pattern attribute, and thus go in a 2.2.x release.
>
> Yes, if that is done then +1

The only thing that troubles me about that approach is the unintended consequence... if a user is trying to use a classic matcher and they bungle the 'pattern' attribute, the resulting error message will be confusing because it will be coming from the new style matcher (or its node builder).

It'd be nice to figure out a graceful solution for this case... but if not, then I guess oh, well :-)

—ml—
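The concern about a bungled 'pattern' attribute can be made concrete with a small Python sketch. It only models the dispatch rule suggested in this thread (presence of a pattern attribute selects the classic behaviour); the function name `choose_engine` and the string labels are hypothetical, not Cocoon code.

```python
def choose_engine(attributes):
    """Pick a matching implementation from a <match> element's attributes.

    Models the suggested rule: a "pattern" attribute means classic
    matching, anything else is handed to the new matching engine.
    """
    return "classic" if "pattern" in attributes else "engine"


# A correctly written classic matcher is routed to the classic code path.
classic_choice = choose_engine({"pattern": "foobar/**/*"})

# A misspelled attribute name is silently routed to the new engine, so any
# error the user sees would come from code they did not intend to use.
typo_choice = choose_engine({"patern": "foobar/**/*"})
```

With this rule the typo case cannot be told apart from a deliberate new-style match, which is why an explicit switch would avoid the confusing error.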
Re: [RT] Improved matching selecting
On 12/4/06, Mark Lundquist wrote:

> .. if a user is trying to use a classic matcher and they bungle the 'pattern' attribute, the resulting error message will be confusing because it will be coming from the new style matcher...

You could also make things more explicit and use a per-sitemap instruction or namespace to switch between the old and new matching engines.

Restricting a single sitemap to use either of these engines (as opposed to allowing a mix in a single sitemap) might help avoid confusion.

-Bertrand
Re: [RT] Improved matching selecting
On Dec 3, 2006, at 1:27 PM, Mark Lundquist wrote:

> 7) There would be a globally configurable property to take the place of the local @value attribute. To invoke a (non-default) configured instance, you use <match type="..."> just like today, but that is not any lighter syntactically than just using @value. The real reason for this is to be able to configure a more specific default, e.g.
>
>     <matchers default="uri">
>       <matcher name="general" class="...">
>         <!-- Note: the concrete class is always the same! -->
>       </matcher>
>       <matcher name="uri" class="...">
>         <value>{request:URI}</value>
>       </matcher>
>       <matcher class="...">
>     </matchers>
>
> This allows you to just write
>
>     <match path="foobar/**/*">
>
> which is nearly identical to today's
>
>     <match pattern="foobar/**/*">
>
> where <matchers default="wildcard">. The tradeoff is that you then have to use <match type="general"> for anything else... but you have the choice if you want to do it that way.

Bah, I still had some old skool thinking goin' on there myself...

Instead of a value configuration element that precludes a local @value attribute, the element should be called default-value and it should just configure a default if @value is omitted. You can still override the default with @value. So:

    <match path="**" /> <!-- perfectly good... -->

    <match value="{request-param:foo}" equals="true" /> <!-- also perfectly good, no need for @type or a different matcher instance -->

—ml—
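The override rule described here (a local @value wins, otherwise the configured default-value applies) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `resolve_target` is hypothetical, and raising an error when neither is present is a guess, since the thread does not say what should happen in that case.

```python
def resolve_target(attributes, default_value=None):
    """Return the match target expression for a <match> element.

    attributes    -- the element's attributes as a dict
    default_value -- the configured default-value, if any
    """
    # A local @value always overrides the configured default.
    if "value" in attributes:
        return attributes["value"]
    if default_value is not None:
        return default_value
    # Assumption: with no @value and no default there is nothing to match.
    raise ValueError("no @value attribute and no configured default-value")


# <match path="**"/> with default-value {request:URI} configured
uri_target = resolve_target({"path": "**"}, default_value="{request:URI}")

# <match value="{request-param:foo}" equals="true"/> overrides the default
param_target = resolve_target(
    {"value": "{request-param:foo}", "equals": "true"},
    default_value="{request:URI}",
)
```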
Re: [RT] Improved matching selecting
On Dec 4, 2006, at 7:37 AM, Bertrand Delacretaz wrote:

> Restricting a single sitemap to use either of these engines (as opposed to allowing a mix in a single sitemap) might help avoid confusion.

I totally agree! How about...

    <sitemap matching="classic">

This would be the default setting, to preserve backward-compatibility. It would also log a message to the deprecation log!

vs.:

    <sitemap matching="engine"> <!-- out w/ the old, in w/ the new :-) -->

I'm also thinking that with matching="engine", we should get rid of the matchers section in the sitemap altogether, and set up the matching engine component in cocoon.xconf.

WDYT?

cheers,
—ml—
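A minimal Python sketch of the per-sitemap switch suggested above might look like this. It assumes the attribute is called matching with the two values classic and engine, as in the message; the function name `matching_mode`, the logger name and the rejection of unknown values are illustrative assumptions only.

```python
import logging

# Hypothetical logger standing in for Cocoon's deprecation log.
deprecation_log = logging.getLogger("deprecation")


def matching_mode(sitemap_attributes):
    """Return the matching mode for a sitemap, defaulting to "classic"."""
    # Classic is the default, to preserve backward compatibility.
    mode = sitemap_attributes.get("matching", "classic")
    if mode not in ("classic", "engine"):
        raise ValueError(f"unknown matching mode: {mode!r}")
    if mode == "classic":
        # The proposal is to log a deprecation message in this case.
        deprecation_log.warning("classic matching is deprecated")
    return mode
```

Because the mode is read once per sitemap, a misspelled attribute on an individual match element can be reported by the engine the author actually asked for.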
Re: [RT] Improved matching selecting
On Mon, 2006-12-04 at 08:34 -0800, Mark Lundquist wrote:

> On Dec 4, 2006, at 7:37 AM, Bertrand Delacretaz wrote:
>
>> Restricting a single sitemap to use either of these engines (as opposed to allowing a mix in a single sitemap) might help avoid confusion.
>
> I totally agree! How about...
>
>     <sitemap matching="classic">
>
> This would be the default setting, to preserve backward-compatibility. It would also log a message to the deprecation log!
>
> vs.:
>
>     <sitemap matching="engine"> <!-- out w/ the old, in w/ the new :-) -->
>
> I'm also thinking that with matching="engine", we should get rid of the matchers section in the sitemap altogether, and set up the matching engine component in cocoon.xconf.
>
> WDYT?
>
> cheers,
> —ml—

Or use different tags, say in resemblance to XSLT:

    <map:if path="..."> ... </map:if>

    <map:choose>
      <map:when path="..."> ... </map:when>
      <map:otherwise> </map:otherwise>
    </map:choose>

map:match and map:select can then be deprecated and removed in a future version.

Cheers, Alfred.
[RT] Improved matching selecting
Hi gang,

OK, here is a proposal for a way to reformulate matching and selecting so that they can be expressed in a more concise and powerful way. It would (a) make sitemaps cleaner and easier to understand and write, (b) lower the learning curve for new users, and (c) be implementable with a lot fewer classes than what we have today, allowing us to lose a bunch of source code.

ISSUES w/ current matching and selecting
----------------------------------------

1) The semantics of matching are just a special case of selecting, and match is just syntactic sugar for a select with a single alternative. So we have a whole bunch of classes (those in o.a.c.matching) that exist only to support this syntactic sugar.

2) There are 3 styles of match used in the matchers and selectors: literal, wildcard, and RE. So, now we have a combinatoric explosion of (match targets) x (3 match styles) x (matching vs. selecting), with a class required to implement each combination.

3) Coverage of that matrix is incomplete. For example, there is no selector for the request URI. You could use a SimpleSelector to match {request:URI}, but that only supports the literal style of matching; if you need the wildcard or RE style, you're out of luck: there are no wildcard selectors at all, and RE selection is provided for some targets (e.g., request parameter) but not all.

4) Against this proliferation of Matcher and Selector classes, documentation is incomplete. The Clever User knows to look in the source tree to find out what all components are actually available. That's not a good situation.

5) We have the o.a.c.matching.modular.* matchers. In concept, they represent a better way, but (a) they're undocumented; (b) as it is, they just add to the big pile, and (c) their sitemap configuration syntax is cumbersome.

6) The bulk of the core Cocoon docs date from before the introduction of input modules. Input modules should actually be more prominent in the documentation than they are. Too many users still don't understand them, only finding out about them when looking for a recipe to solve some specific problem. This is partly due to input modules' lack of primacy in the docs, and it's also perpetuated by the glut of special matchers and selectors for (almost) everything.

7) Using different elements (match and select) and different components for the two forms obscures to the new user the fact that it really is only syntactic sugar! That's confusing. Matching vs. selecting is one more thing for us to have to explain: "Selecting is like matching, except blah blah blah". I know it was confusing to me when I was first learning. It's confusing because the newbie has the (correct) intuition that it's just syntactic sugar, but then the existence of a whole 'nuther tree of components seems to belie that. And then it turns out... nope, it really is just syntactic sugar!

8) The only way to express a logical "or" directly is to use RE matching, but (a) per [3] above, RE matching is not available for all targets; (b) users have asked for a way to do this for wildcard matching (and I have wanted it too). It's a reasonable request (REs are harder to read and write) and a common use case, but we've denied them this based on the (valid) argument that we don't want to complicate the wildcard grammar, so we keep wildcard simple; if you want 'or' branches then just use REs. However, there is fortunately an easy way to provide "or" branches in matching/selecting that does not involve any change to the wildcard grammar!

GOALS
-----

1) Fix all of the above! :-)

2) ...with no loss of functionality (e.g., setting Vary: response header when necessary)...

3) ...and no negative performance impact.

PROPOSAL (the details...)
-------------------------

1) The functionality of matching would be subsumed into selection and be provided by the same class, since the difference is just syntactic sugar. That gets rid of roughly half the classes right there, so we only have to refactor once instead of twice, and it takes care of one axis of the combinatoric explosion, see below for the other...

2) This function, which is like today's selecting, would be called matching and be provided by a Matcher interface unchanged from that of today (but I think it would only require one implementing class). It would be configured and invoked in the sitemap using the match element.

Example time, explanations follow :-). Here is the equivalent of what today would have to be specially written in Java as RegexpURISelector:

    <match value="{request:URI}">
      <when regexp="foo(bar|baz)"> ... </when>
      <when regexp="[^.]*\.blah"> ... </when>
      <otherwise> ... </otherwise>
    </match>

3) The match target is given
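To make the regexp example in item 2 of the proposal concrete, here is a small Python sketch of how a single matcher could walk the when alternatives in order and fall back to otherwise. It is a sketch under assumptions, not the proposed implementation: the function name `select_alternative` is hypothetical, the outcomes are plain labels standing in for pipeline fragments, first-match-wins ordering is assumed, and a pattern is assumed to have to match the whole target value (the thread does not say whether a partial match would count).

```python
import re


def select_alternative(value, alternatives, otherwise=None):
    """Return the outcome of the first alternative whose regexp matches.

    value        -- the resolved match target, e.g. the request URI
    alternatives -- ordered (regexp, outcome) pairs, one per <when>
    otherwise    -- outcome used when no alternative matches
    """
    for pattern, outcome in alternatives:
        # Assumption: the pattern must match the entire value.
        if re.fullmatch(pattern, value):
            return outcome
    return otherwise


# The two <when> alternatives from the example above.
alternatives = [
    (r"foo(bar|baz)", "first"),
    (r"[^.]*\.blah", "second"),
]
```

A match with a single alternative is then just the case where the list has one entry, which is the sense in which matching is selecting with one branch.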
Re: [RT] Improved matching selecting
I see that this would have to wait 'til C3, according to http://cocoon.zones.apache.org/daisy/cdocs/g2/g1/g2/1181.html —ml—
Re: [RT] Improved matching selecting
On Monday 04 December 2006 06:02, Mark Lundquist wrote:

> I see that this would have to wait 'til C3, according to http://cocoon.zones.apache.org/daisy/cdocs/g2/g1/g2/1181.html

And in a future C3, could it be possible that the entire sitemap is abstracted out,

a. into a standalone component,
b. with a stricter programmatic interface as its primary API,

both for re-use in other projects as well as easier changes in the future?

Anyway, I like your effort...

Cheers
Niclas