On 6/18/23 05:38, ToddAndMargo via perl6-users wrote:
Hi All,
I know how to do this with several regexes and words.
What I'd like to learn is how to remove something
from the middle with a regex using a wildcard,
and I can't figure it out.
#!/bin/raku
print "\n";
my Str $x = Q[wine-7.12-3.fc37.i686.rpm</a> 23-Jul-2022 19:11 11K <a href="wine-7.12-3.fc37.x86_64.rpm];
print "1 [$x]\n\n";
$x~~s/ $( Q[</a>] ) * $( Q[a href="] ) / /;
print "2 [$x]\n\n";
1 [wine-7.12-3.fc37.i686.rpm</a> 23-Jul-2022 19:11 11K <a href="wine-7.12-3.fc37.x86_64.rpm]
2 [wine-7.12-3.fc37.i686.rpm</a> 23-Jul-2022 19:11 11K < wine-7.12-3.fc37.x86_64.rpm]
My goal is to have `2` print the following:
wine-7.12-3.fc37.i686.rpm wine-7.12-3.fc37.x86_64.rpm
Many thanks,
-T
> On 6/18/23 12:10, Joseph Brenner wrote:
Try something like this, perhaps:
$x ~~ s:i/ ^ (.*?) '</a>' .*? '<a href="' (.*?) $ /$0 $1/;
Some explanations:
s:i
The :i modifier makes the match case-insensitive, so data with
upper-case HTML won't break things.
In general, you want to break it down into chunks, and just keep the
chunks you want.
^ begin matching at the start of the string
(.*?) match anything up to the next pattern, *and* capture it to a variable
'...' I'm using single quotes on the literal strings
$ match all the way to the end of the string.
Pinning the match with ^ and $ means a s/// will replace the entire string.
There are two captures, so they load $0 and $1, and here we're using
them in the replace string: s/.../$0 $1/
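The explanation above can be tried as a small self-contained script. This is only a sketch: it assumes the sample string from the original question sits on a single line (the line break in the post looks like mail wrapping), and it does nothing beyond applying the suggested substitution and printing the result.

```raku
#!/usr/bin/env raku

# Sample string from the original question, assumed to be on one line.
my Str $x = Q[wine-7.12-3.fc37.i686.rpm</a> 23-Jul-2022 19:11 11K <a href="wine-7.12-3.fc37.x86_64.rpm];

# Anchored substitution: capture the text before '</a>' and the text
# after '<a href="', then rebuild the whole string from the two captures.
$x ~~ s:i/ ^ (.*?) '</a>' .*? '<a href="' (.*?) $ /$0 $1/;

say $x;   # wine-7.12-3.fc37.i686.rpm wine-7.12-3.fc37.x86_64.rpm
```

Because the pattern is pinned with ^ and $, everything between the two captures is dropped, and nothing outside the match is left over.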
Hi Joseph,
Right under my nose! Thank you.
This is my test program:
<RegexTest.pl6>
#!/bin/raku
print "\n";
my Str $x = Q[<a href="wike-2.0.1-1.fc38.noarch.rpm">wike-2.0.1-1.fc38.noarch.rpm</a> 27-Apr-2023 01:53 143K] ~
Q[<a href="wine-8.6-1.fc38.i686.rpm">wine-8.6-1.fc38.i686.rpm</a> 19-Apr-2023 21:48 11K] ~
Q[<a href="wine-8.6-1.fc38.x86_64.rpm">wine-8.6-1.fc38.x86_64.rpm</a> 19-Apr-2023 21:48 11K] ~
Q[<a href="wine-alsa-8.6-1.fc38.i686.rpm">wine-alsa-8.6-1.fc38.i686.rpm</a> 19-Apr-2023 21:48 223K];
$x~~m:i/ .*? ("wine") (.*?) $(Q[">] ) .*? $( Q[a href="] ) (.*?)
( $(Q[">] ) ) /;
print "0 = <$0>\n1 = <$1>\n2 = <$2>\n\n";
my Str $y = $0 ~ $1 ~ " " ~ $2;
print "$y\n\n";
</RegexTest.pl6>
$ RegexTest.pl6
0 = <wine>
1 = <-8.6-1.fc38.i686.rpm>
2 = <wine-8.6-1.fc38.x86_64.rpm>
wine-8.6-1.fc38.i686.rpm wine-8.6-1.fc38.x86_64.rpm
An aside: the replacement /$0$1 $2/ did not work with "s" (so I switched to "m"):
$x~~s:i/ .*? ("wine") (.*?) $(Q[">] ) .*? $( Q[a href="] ) (.*?)
( $(Q[">] ) ) /$0$1 $2/;
The result is
wine-8.6-1.fc38.i686.rpm wine-8.6-1.fc38.x86_64.rpmwine-8.6-1.fc38.x86_64.rpm</a> 19-Apr-2023 21:48 11K<a href="wine-alsa-8.6-1.fc38.i686.rpm">wine-alsa-8.6-1.fc38.i686.rpm</a> 19-Apr-2023 21:48 223K
Is this a bug or did I write it wrong?
-T
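
The result above looks consistent with how s/// normally behaves rather than with a bug: only the text the pattern actually matched is replaced, and whatever follows the match stays in the string. A minimal sketch of that behaviour, using a made-up string (not the data from this thread) and hypothetical variable names:

```raku
#!/usr/bin/env raku

# Hypothetical sample string, for illustration only.
my $s = 'aaa-KEEP-bbb-tail';

# Unanchored: the match covers 'aaa-KEEP-bbb', so only that span is
# replaced and the trailing '-tail' survives.
my $unanchored = $s;
$unanchored ~~ s/ .*? ('KEEP') '-bbb' /$0/;
say $unanchored;   # KEEP-tail

# Anchored: the pattern consumes the string through to the end ($),
# so the replacement becomes the whole result.
my $anchored = $s;
$anchored ~~ s/ ^ .*? ('KEEP') .* $ /$0/;
say $anchored;     # KEEP
```

Under that reading, letting the pattern run to the end of the string (for example by ending it with .* $, as in the anchored form suggested earlier in the thread) should leave only "$0$1 $2" behind.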