On 6/19/23 03:03, ToddAndMargo via perl6-users wrote:
On 6/18/23 05:38, ToddAndMargo via perl6-users wrote:
Hi All,
I know how to do this with several regexes and words.
What I'd like to learn is how to remove something
from the middle with a regex using a wildcard.
And I can't figure it out.
#!/bin/raku
print "\n";
my Str $x = Q[wine-7.12-3.fc37.i686.rpm</a> 23-Jul-2022 19:11 11K <a href="wine-7.12-3.fc37.x86_64.rpm];
print "1 [$x]\n\n";
$x~~s/ $( Q[</a>] ) * $( Q[a href="] ) / /;
print "2 [$x]\n\n";
1 [wine-7.12-3.fc37.i686.rpm</a> 23-Jul-2022 19:11 11K <a href="wine-7.12-3.fc37.x86_64.rpm]
2 [wine-7.12-3.fc37.i686.rpm</a> 23-Jul-2022 19:11 11K < wine-7.12-3.fc37.x86_64.rpm]
My goal is to have `2` print the following out
wine-7.12-3.fc37.i686.rpm wine-7.12-3.fc37.x86_64.rpm
Many thanks,
-T
> On 6/18/23 12:10, Joseph Brenner wrote:
Try something like this, perhaps:
$x ~~ s:i/ ^ (.*?) '</a>' .*? '<a href="' (.*?) $ /$0 $1/;
Some explanations:
s:i
The :i modifier makes the match case-insensitive, so data with upper-case
HTML tags won't break things.
In general, you want to break it down into chunks, and just keep the
chunks you want.
^ begin matching at the start of the string
(.*?) match anything up to the next pattern, *and* capture it to a
variable
'...' I'm using single quotes on the literal strings
$ match all the way to the end of the string.
Pinning the match with ^ and $ means a s/// will replace the entire
string.
There are two captures, so they load $0 and $1, and here we're using
them in the replace string: s/.../$0 $1/
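A minimal, self-contained sketch of the substitution above, run against the sample string from the original question. The string is written on a single line, on the assumption that any line break in the posted string came from mail wrapping. It also shows why the first attempt did not work: a bare `*` in a Raku regex is a quantifier on the preceding atom (there, the interpolated `</a>` literal), not a wildcard, whereas `.*?` is the "match anything, as little as possible" form.

```raku
# Sample data from the original question, on one line (assumption:
# the break in the posted string was introduced by mail wrapping).
my Str $x = Q[wine-7.12-3.fc37.i686.rpm</a> 23-Jul-2022 19:11 11K <a href="wine-7.12-3.fc37.x86_64.rpm];

# Keep the text before '</a>' ($0) and the text after '<a href="' ($1);
# everything between the two literals is matched but not captured,
# so it is dropped by the replacement.
$x ~~ s:i/ ^ (.*?) '</a>' .*? '<a href="' (.*?) $ /$0 $1/;

say $x;
```

With that input, this should print the two package names separated by a single space, which is the goal stated in the question.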
Hi Joseph,
Right under my nose! Thank you.
This is my test program:
<RegexTest.pl6>
#!/bin/raku
print "\n";
my Str $x = Q[<a href="wike-2.0.1-1.fc38.noarch.rpm">wike-2.0.1-1.fc38.noarch.rpm</a> 27-Apr-2023 01:53 143K] ~
            Q[<a href="wine-8.6-1.fc38.i686.rpm">wine-8.6-1.fc38.i686.rpm</a> 19-Apr-2023 21:48 11K] ~
            Q[<a href="wine-8.6-1.fc38.x86_64.rpm">wine-8.6-1.fc38.x86_64.rpm</a> 19-Apr-2023 21:48 11K] ~
            Q[<a href="wine-alsa-8.6-1.fc38.i686.rpm">wine-alsa-8.6-1.fc38.i686.rpm</a> 19-Apr-2023 21:48 223K];
$x~~m:i/ .*? ("wine") (.*?) $(Q[">] ) .*? $( Q[a href="] ) (.*?) ( $(Q[">] ) ) /;
print "0 = <$0>\n1 = <$1>\n2 = <$2>\n\n";
my Str $y = $0 ~ $1 ~ " " ~ $2;
print "$y\n\n";
</RegexTest.pl6>
$ RegexTest.pl6
0 = <wine>
1 = <-8.6-1.fc38.i686.rpm>
2 = <wine-8.6-1.fc38.x86_64.rpm>
wine-8.6-1.fc38.i686.rpm wine-8.6-1.fc38.x86_64.rpm
From my actual program:
Before Joseph's help:
$SysRev = $WebPage;
$SysRev ~~ s/ .*? $( Q[a href="wine] ) /wine/;
$SysRev ~~ s/ $( Q[x86_64.rpm] ) .* /x86_64.rpm/;
$SysRev ~~ s/ $( Q[">wine] ) / /;
$SysRev = $SysRev.words[0] ~ " " ~ $SysRev.words[6];
$SysRev ~~ s/ $( Q[href="] ) //;
$SysRev ~~ s/ $( Q[</a>] ) //;
After Joseph's help:
$SysRev = $WebPage;
$SysRev~~m:i/ .*? ("wine") (.*?) $(Q[">] ) .*? $( Q[a href="] ) (.*?) ( $(Q[">] ) ) /;
$SysRev = $0 ~ $1 ~ " " ~ $2;
Awesome cleanup!
-T
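One possible refinement of the match-based version, offered only as a sketch: if the page layout ever changes and the pattern does not match, the captures are undefined, so it may be worth testing the match before building the string. The `$WebPage` value below is a hypothetical stand-in (two entries taken from the test data above), and the quoted literals are just an alternative spelling of the interpolated `$( Q[...] )` atoms.

```raku
# Hypothetical stand-in for the downloaded page: two entries from the
# test program's data, joined the same way.
my Str $WebPage =
    Q[<a href="wine-8.6-1.fc38.i686.rpm">wine-8.6-1.fc38.i686.rpm</a> 19-Apr-2023 21:48 11K] ~
    Q[<a href="wine-8.6-1.fc38.x86_64.rpm">wine-8.6-1.fc38.x86_64.rpm</a> 19-Apr-2023 21:48 11K];

my Str $SysRev = '';

# Only use $0, $1 and $2 when the smartmatch actually succeeded.
if $WebPage ~~ m:i/ ("wine") (.*?) '">' .*? 'a href="' (.*?) '">' / {
    $SysRev = $0 ~ $1 ~ " " ~ $2;
}
else {
    note "No wine packages found in the page";
}

say $SysRev;
```

The leading `.*?` is left out here because an unanchored `m//` already scans forward to the first place the pattern can match.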