> On May 29, 2021, at 5:57 PM, rir <rir...@comcast.net> wrote:
>
> Given:
>     rule cmp_expression {
>         | <str_const> <cmp_op> <identifier>
>         | <num_const> <cmp_op> <identifier>
>         | ...
>     }
>
> What is a good, concise way to express that all the alternatives are
> commutative?

I am not at all clear on what you are asking, so if none of my ideas are
helpful, please consider adding more detail.

1. I don't know of a regex construct that automatically converts this:
        /foo bar baz/
   into meaning this:
        /foo bar baz | baz bar foo/
   So, we do not have a convenient shortcut like:
        rule cmp_expression {
            | COMMUTATIVE( <str_const> <cmp_op> <identifier> )
            | COMMUTATIVE( <num_const> <cmp_op> <identifier> )
            | COMMUTATIVE( ... )
        }

2. If the order of the operands does not matter (i.e., they "are
   commutative", as you said), *and* the whole set of left-operands is
   compatible with the whole set of right-operands, *and* the two sets are
   disjoint (i.e., if AopB is valid then so is BopA, but that doesn't mean
   AopA is valid, nor BopB), then I would try creating rules or tokens to
   extract those two sets, leaving `cmp_expression` with only two branches
   of alternation:
        rule cmp_operands_A {
            | <str_const>
            | <num_const>
            | ...
        }
        rule cmp_operands_B {
            | <identifier>
            | ...
        }
        rule cmp_expression {
              <cmp_operands_A> <cmp_op> <cmp_operands_B>
            | <cmp_operands_B> <cmp_op> <cmp_operands_A>
        }

3. If <cmp_operands_A> and <cmp_operands_B> are actually the exact same
   set, then the "Modified quantifier" (which I think of as "Is Separated
   By") will allow very concise code (after extracting the operands).
        https://docs.raku.org/language/regexes#Modified_quantifier:_%,_%%

        rule cmp_operands {    # ??? token instead of rule ???
            | <str_const>
            | <num_const>
            | <identifier>
            | ...
        }
        rule cmp_expression {
            <cmp_operands> ** 2 % <cmp_op>
        }

4. If none of that compresses the regex (maybe because not every <A> forms
   a valid pairing with *every* <B>), I would make each BopA variant live
   on the same line as its AopB cousin:
        rule cmp_expression {
            | <str_const> <cmp_op> <identifier>  | <identifier> <cmp_op> <str_const>
            | <num_const> <cmp_op> <identifier>  | <identifier> <cmp_op> <num_const>
            | ...
        }

> I imagine that generally this is a useless question, which is
> avoided by:
>
>     rule cmp_expression {
>         <value_expression> <cmp_op> <value_expression>
>     }
>
> but here many tokens under value_expression exist but are not well
> defined, nor known by me.

This paragraph confuses me. I read it as a less concise version of my #3
above, but when you say "many tokens under value_expression…by me", it
sounds like you can't/won't pursue this shortened form of the regex because
you don't actually *know* the long form of the regex yet.
If so, since #1 is not available, I would do #4 until the full details of
all the operands become clear, then try to refactor to #2 or #3.

> rir

--
Hope this helps,
Bruce Gray (Util of PerlMonks)