David Moreno wrote:
> It's a nice, a bit complex one. I'd try it as:
>
> $url =~ m!\A(http://)?(.+?)(/.*)?\z!;
> print $1 if $1;
> print $2;
>
> TIMTOWTDI.
>
Be a little more flexible: use inner non-capturing parens ("(?: ... )")
to add https and, if needed, "ftp" or "ldap" or whatever else, plus "/i"
for case-insensitive matching. Always test the match rather than assume
it succeeded. And if you know your separator/marker (the slash), match
on that rather than on 'dot':
if ( $url =~ m!\A((?:http|https)://)?([^/]+)!i ) {
    print $1 if $1;
    print $2;
} # else 'no URL'
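
To make the "always test" and extra-scheme points concrete, here is a
minimal sketch of the same match wrapped in a sub. The name split_url,
the extra "ftp" and "ldap" alternatives, and the sample inputs are my
own illustration, not anything from the original question:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (scheme_prefix, host) on a match,
# or an empty list when the string does not start with a host part.
sub split_url {
    my ($url) = @_;
    chomp $url;    # drop a trailing "\n" left over from reading a line
    if ( $url =~ m!\A((?:https?|ftp|ldap)://)?([^/]+)!i ) {
        # $1 is undef when the optional scheme is absent
        return ( defined $1 ? $1 : '', $2 );
    }
    return;
}

# Sample inputs (made up), as they might arrive from a line-by-line read.
for my $line (
    "HTTP://www.example.com/index.html\n",
    "ftp://ftp.example.org/pub\n",
    "example.net\n",
    "/just/a/path\n",
  )
{
    my ( $prefix, $host ) = split_url($line);
    if ( defined $host ) {
        print "$prefix$host\n";
    }
    else {
        print "no URL\n";
    }
}
```

Without the chomp, a bare "example.net\n" would come back with the
newline glued to the host, since "\n" is also 'not a slash'.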
You don't need to match (or not) the stuff after the end of the 'not
slash' part, as you don't care about it, though you may need to 'chomp'
$url first (or otherwise deal with the "\n" if it's there; that depends
on your loop). If you're serious, though, there are a number of modules
for URL finding that'll do it right for nearly everything legit. It's
harder than you'd think. J. Friedl's URL-matching masterpiece
("Mastering Regular Expressions", O'Reilly) runs to 9 embedded REs
(yikes):
http://www.oreilly.com/catalog/regex3/index.html
http://regex.info/
Here's a simpler one:
if ($url =~ m{^(https?)://([^/:]+)(?::(\d+))?(/.*)?$}i)
{
    my $scheme = lc $1;
    my $host = $2;
    # Use $3 if it exists; otherwise default to 443 for https, 80 for http.
    my $port = $3 || ($scheme eq 'https' ? 443 : 80);
    # Use $4 if it exists; otherwise default to "/".
    my $path = $4 || "/";
    print "Host: $host\n";
    print "Port: $port\n";
    print "Path: $path\n";
} else {
    print "Not an HTTP URL\n";
}
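
If you want to run that against more than one URL, it may be easier to
wrap the match in a sub and return the pieces. This is only a sketch:
the name parse_http_url, the hash keys, and the sample URLs are made up
for illustration, and the 443 default for https is my assumption about
what you'd want, not something the regex itself knows about:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref of URL parts,
# or undef (in scalar context) if $url is not an http(s) URL.
sub parse_http_url {
    my ($url) = @_;
    return unless defined $url;
    chomp $url;    # tolerate a trailing newline from line-based input
    return unless $url =~ m{\A(https?)://([^/:]+)(?::(\d+))?(/.*)?\z}i;
    my $scheme = lc $1;
    return {
        scheme => $scheme,
        host   => $2,
        # explicit port if given, else an assumed per-scheme default
        port   => defined $3 ? $3 : ( $scheme eq 'https' ? 443 : 80 ),
        path   => defined $4 ? $4 : '/',
    };
}

# Sample inputs (made up).
for my $url (
    "http://www.example.com:8080/a/b?x=1\n",
    "https://example.com",
    "ftp://example.com/",
  )
{
    my $parts = parse_http_url($url);
    if ($parts) {
        print "Host: $parts->{host}\n";
        print "Port: $parts->{port}\n";
        print "Path: $parts->{path}\n";
    }
    else {
        print "Not an HTTP URL\n";
    }
}
```

It still won't cope with userinfo ("user:pass@host"), IPv6 literals and
the like, which is where the modules earn their keep.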
--
Andy Bach, Sys. Mangler
The only function of economic forecasting is
to make astrology look respectable.
- John Kenneth Galbraith