Someone might get a kick out of this ;-). Clearly regexes are built on top of set theory, but as both Simon and Yary pointed out, my set-based code didn't return the matching string "8420" present in the target.
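To make that difference concrete, here is a minimal sketch of the two approaches side by side. The helper names (greedy-runs, shared-chars) are made up for illustration and don't appear anywhere in the thread, and I sort the set result only so the output is stable:

```raku
# Hypothetical helpers for illustration only; the names are not from the thread.

# Regex approach: interpolate an array of single characters, which acts as
# an alternation of literal strings, and collect every greedy run in the target.
sub greedy-runs(Str $chars, Str $target) {
    my @arr = $chars.comb.unique;
    $target.match(/@arr+/, :g).map(~*).list
}

# Set approach: the intersection is unordered and drops duplicates,
# so the result is sorted here purely to get repeatable output.
sub shared-chars(Str $a, Str $b) {
    ($a.comb (&) $b.comb).keys.sort.list
}

say greedy-runs("24680", "19584203").join("|");   # 8420
say shared-chars("24680", "19584203").join("|");  # 0|2|4|8
```

The regex keeps the target's order and adjacency, so "8420" comes back as one string, while the set only reports which characters the two strings share.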
In Example A, Eirik's code uses an array to generate a character class, and then tests that character class in a regex against the target ($_). In Example B, all it takes is the addition of the "frugal" match indicator ("+?") to Eirik's code (and presumably to Simon's code) to give almost the same results as a set intersection. The results aren't completely identical because the regex code (Example A, Example B) is non-symmetric. As shown below ("lorem ipsum"), when Example B (array regex) is compared to Example C (set intersection), multiple copies of character class elements present in the "target" string show up in the regex match object (Example B), while these duplicate elements are eliminated from the "symmetrical" set-intersection result (Example C):

##_A____
sub contains( Str $chars, Str $_ ) {
    my @arr = $chars.comb.unique;
    return m:g/@arr+/
}

say contains("24680", "19584203").join("|");
# says 8420
say contains("19584203", "24680").join("|");
# says 24|80
say contains("Lorem ipsum dolor sit amet, consectetuer adipiscing elit.", "abcdefg").join("|");
# says a|cde|g
say contains("abcdefg", "Lorem ipsum dolor sit amet, consectetuer adipiscing elit.").join("|");
# says e|d|a|e|c|ec|e|e|ad|c|g|e

##_B____
sub nonsym_intersect( Str $chars, Str $_ ) {
    my @arr = $chars.comb.unique;
    return m:g/@arr+?/
}

say nonsym_intersect("24680", "19584203").join("|");
# says 8|4|2|0
say nonsym_intersect("19584203", "24680").join("|");
# says 2|4|8|0
say nonsym_intersect("Lorem ipsum dolor sit amet, consectetuer adipiscing elit.", "abcdefg").join("|");
# says a|c|d|e|g
say nonsym_intersect("abcdefg", "Lorem ipsum dolor sit amet, consectetuer adipiscing elit.").join("|");
# says e|d|a|e|c|e|c|e|e|a|d|c|g|e

##_C____
sub sym_intersect(Str $a, Str $b) {
    my @c = $a.comb.unique;
    my @d = $b.comb.unique;
    #return (~[@c (&) @d]).^name;
    return ~[@c (&) @d];
}

say sym_intersect("24680", "19584203").words.join("|");
# says 2|8|4|0
say sym_intersect("19584203", "24680").words.join("|");
# says 8|4|2|0
say sym_intersect("Lorem ipsum dolor sit amet, consectetuer adipiscing elit.", "abcdefg").words.join("|");
# says a|g|c|d|e
say sym_intersect("abcdefg", "Lorem ipsum dolor sit amet, consectetuer adipiscing elit.").words.join("|");
# says a|d|e|g|c

One caveat (above, Example C): I can't return from a set intersection and just do a "join" on the result, as in the previous two examples. I have to break the return value into ".words" and then ".join" to match the format of the previous two examples.

HTH, Bill.

PS Eirik, I think people might be referring to <{...}> as "pointy blocks", but I'm really not sure... .

On Mon, Sep 2, 2019 at 11:25 AM Joseph Brenner <doom...@gmail.com> wrote:
>
> > The "implicit" alternation comes from interpolating a list (of subrules,
> > see below).
>
> I see. And that's discussed here (had to really look for it):
>
> https://docs.perl6.org/language/regexes#Quoted_lists_are_LTM_matches
>
> At first I was looking further down in the "Regex interpolation"
> section, where it's also touched on, though I kept missing it:
>
> > When an array variable is interpolated into a regex, the regex engine
> > handles it like a | alternative of the regex elements (see the
> > documentation on embedded lists, above).
>
>
> On 9/1/19, The Sidhekin <sidhe...@gmail.com> wrote:
> > On Mon, Sep 2, 2019 at 1:12 AM Joseph Brenner <doom...@gmail.com> wrote:
> >
> >> I was just trying to run Simon Proctor's solution, and I see it
> >> working for Yary's first case, but not his more complex one with
> >> problem characters like brackets included in the list of characters.
> >>
> >> I don't really see how to fix it, in part because I'm not that
> >> clear on what it's actually doing... there's some sort of
> >> implicit alternation going on?
> >>
> >>
> >> sub contains( Str $chars, Str $_ ) {
> >>     m:g/<{$chars.comb}>+/
> >> };
> >>
> >
> > The "implicit" alternation comes from interpolating a list (of subrules,
> > see below).
> >
> > That works for this case:
> >>
> >> say contains('24680', '19584203');
> >> # (「8420」)
> >>
> >> But on something like this it errors out:
> >>
> >> say contains('+\/\]\[', 'Apple ][+//e'); # says ][+//
> >>
> >
> > … because it's trying to compile each (1-character) string as a subrule …
> >
> > To have the (1-character) strings used as literals, rather than compiled
> > as subrules, put them in an array instead of a block wrapped in angle
> > brackets:
> >
> > sub contains( Str $chars, Str $_ ) {
> >     my @arr = $chars.comb;
> >     m:g/@arr+/
> > }
> >
> >
> > (… hey, is there a word for "block wrapped in angle brackets"?)
> >
> >
> > Eirik
> >