> On May 29, 2021, at 5:57 PM, rir <rir...@comcast.net> wrote:
> 
> 
> Given:
>    rule cmp_expression {
>        | <str_const> <cmp_op> <identifier>
>        | <num_const> <cmp_op> <identifier>
>        | ...
>    }
> 
> What is a good, concise way to express that all the alternatives are
> commutative?

I am not at all clear on what you are asking, so if none of my ideas are 
helpful, please consider adding more detail.

1. I don't know of a regex construct that automatically converts this:
    /foo bar baz/
into meaning this:
    /foo bar baz | baz bar foo/
So, we do not have a convenient shortcut like:
    rule cmp_expression {
        | COMMUTATIVE( <str_const> <cmp_op> <identifier> )
        | COMMUTATIVE( <num_const> <cmp_op> <identifier> )
        | COMMUTATIVE( ... )
    }


2. If the order of the operands does not matter (i.e. they "are commutative",
as you said),
*and* every left operand is compatible with every right operand,
*and* the two sets are disjoint
(i.e. if AopB is valid then so is BopA,
but that does not mean AopA or BopB is valid),
then I would try creating rules or tokens to extract those two sets,
leaving `cmp_expression` with only two branches of alternation:
    rule cmp_operands_A {
        | <str_const>
        | <num_const>
        | ...
    }
    rule cmp_operands_B {
        | <identifier>
        | ...
    }
    rule cmp_expression {
          <cmp_operands_A> <cmp_op> <cmp_operands_B>
        | <cmp_operands_B> <cmp_op> <cmp_operands_A>
    }
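
For what it's worth, here is a minimal runnable sketch of that layout.
The bodies of str_const, num_const, identifier, and cmp_op are placeholders
I made up so the example can run; they are not taken from your grammar.
I also declared the two operand sets as tokens rather than rules, which is
an assumption about what you want.

```raku
# Sketch only: the four leaf tokens are made-up stand-ins for the real ones.
grammar CmpSketchTwoSets {
    rule  TOP            { <cmp_expression> }

    token str_const      { '"' <-["]>* '"' }          # placeholder
    token num_const      { \d+ }                      # placeholder
    token identifier     { <[A..Z a..z _]> \w* }      # placeholder
    token cmp_op         { '==' | '!=' | '<' | '>' }  # placeholder

    # The two disjoint operand sets.
    token cmp_operands_A { <str_const> | <num_const> }
    token cmp_operands_B { <identifier> }

    # A op B, or B op A; never A op A or B op B.
    rule cmp_expression {
        | <cmp_operands_A> <cmp_op> <cmp_operands_B>
        | <cmp_operands_B> <cmp_op> <cmp_operands_A>
    }
}

say so CmpSketchTwoSets.parse('"abc" == x');   # A op B
say so CmpSketchTwoSets.parse('x != 42');      # B op A
say so CmpSketchTwoSets.parse('x == y');       # B op B, should be rejected
```

Adding a new operand type then means touching only one of the two operand
tokens, not `cmp_expression` itself.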


3. If <cmp_operands_A> and <cmp_operands_B> are actually the exact same set,
then the "Modified quantifier" (which I think of as "Is Separated By")
will allow very concise code (after extracting the operands).
https://docs.raku.org/language/regexes#Modified_quantifier:_%,_%%
    rule cmp_operands { # ??? token instead of rule ???
        | <str_const>
        | <num_const>
        | <identifier>
        | ...
    }
    rule cmp_expression {
        <cmp_operands> ** 2 % <cmp_op>
    }
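
A runnable sketch of the same idea follows. Again, the leaf tokens are
placeholders I invented. To sidestep the question of how a rule's
significant whitespace interacts with `%`, this version uses a token and
spells out the optional whitespace around the operator; that choice is
mine, not something your grammar requires.

```raku
# Sketch only: the leaf tokens are made-up stand-ins for the real ones.
grammar CmpSketchOneSet {
    token TOP          { <cmp_expression> }

    token str_const    { '"' <-["]>* '"' }          # placeholder
    token num_const    { \d+ }                      # placeholder
    token identifier   { <[A..Z a..z _]> \w* }      # placeholder
    token cmp_op       { '==' | '!=' | '<' | '>' }  # placeholder

    # One shared operand set, so any operand may appear on either side.
    token cmp_operands { <str_const> | <num_const> | <identifier> }

    # Exactly two operands, separated by an operator with optional spaces.
    token cmp_expression {
        <cmp_operands> ** 2 % [ \s* <cmp_op> \s* ]
    }
}

say so CmpSketchOneSet.parse('x == "abc"');
say so CmpSketchOneSet.parse('3 < 4');       # A op A is accepted here
say so CmpSketchOneSet.parse('x ==');        # only one operand, should fail
```

Note that this form accepts AopA and BopB as well, so it only fits if
every operand really can appear on either side.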


4. If none of that compresses the regex (maybe because not every <A> forms a 
valid pairing with *every* <B>),
I would make each BopA variant live on the same line as its AopB cousin:
    rule cmp_expression {
        | <str_const> <cmp_op> <identifier>   |   <identifier> <cmp_op> <str_const>
        | <num_const> <cmp_op> <identifier>   |   <identifier> <cmp_op> <num_const>
        | ...
    }


> I imagine that generally this is a useless question, which is
> avoided by:
> 
> rule cmp_expression {
>   <value_expression> <cmp_op> <value_expression>
> }
> 
> but here many tokens under value_expression exist but are not well
> defined, nor known by me.

This paragraph confuses me.
I read it as a less concise version of my #3 above,
but when you say "many tokens under value_expression…by me",
it sounds like you can't/won't pursue this shortened form of the regex
because you don't actually *know* the long form of the regex yet.
If so, since #1 is not available, I would do #4 until the full details
of all the operands become clear, then try to refactor to #2 or #3.


> rir

-- 
Hope this helps,
Bruce Gray (Util of PerlMonks)

