On Monday, Jul 14, 2003, at 09:37 US/Pacific, Mike Blezien wrote:
Hello,
We have a fairly simple redirect script: a URL is entered, and even though there are directions not to enter the "http://www", sometimes we get it, or "http://". What is the simplest method to extract just the domain name if "http://www.somedomain_name.com" or "http://somedomain_name.com" is entered, so we can extract just the "somedomain_name.com"?
The simplest of re's I can think of is
#------------------------
#
sub simple_get_domain
{
    my ($string) = @_;
    my $some_dom;
    # anchor at the start and escape the dot so "www." matches literally
    if ( $string =~ m!^http://www\.(.*)! ) {
        $some_dom = $1;
    } elsif ( $string =~ m!^http://(.*)! ) {
        $some_dom = $1;
    }
    # stays undef when the input has no "http://" prefix
    $some_dom;
} # end of simple_get_domain

but this is merely a string parser - and is not really gonna make sure that the $some_dom returned is kosher...
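A single substitution can cover the same ground and also pass through input that has no prefix at all, which the sub above does not do. This is only a sketch: the name strip_url_prefix is made up for illustration, and accepting "https://" and mixed case are assumptions that go beyond the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: strip an optional scheme and a leading "www."
sub strip_url_prefix
{
    my ($string) = @_;
    (my $some_dom = $string) =~ s!^https?://!!i;   # drop "http://" or "https://"
    $some_dom =~ s!^www\.!!i;                      # drop a leading "www."
    return $some_dom;
}

# The two forms from the question, plus input that already follows the directions
print strip_url_prefix($_), "\n"
    for qw( http://www.somedomain_name.com
            http://somedomain_name.com
            somedomain_name.com );
```

Like the version above, this is still just string surgery and says nothing about whether what is left is a real domain.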
The squirrelly part, of course, is things like:
http://foo.bar.com:12345/uri_path_stuff_here or
http://127.0.0.1/
I start that part with something like
sub parse_url
{
    my ($me,$url) = @_;
    # split into scheme, host[:port], and everything after the host
    my ($schema, $host_port, $uri) = ($url =~ m!^([^:]+)://([^/]+)(.*)$!);
    return ($schema, $host_port, $uri);
} # end of parse_url

sub get_host_port
{
    my ($me,$host_port) = @_;
    # no ":port" present: hand back the whole string and an empty port
    my ($host,$port) = ($host_port =~ m/([^:]+):([^:]+)/) ?
        ($1,$2) : ($host_port, '');
    return ($host, $port);
}

And then the validation part gets into either working backwards from the TLD - top level domain - or trying to figure out which part of the string is the host part... try to remember that
nas.nasa.gov
IS the domain name, not a host and domain component....
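To make the two squirrelly examples concrete, here is a rough sketch of splitting the URL and then looking at the host from the right-hand end. The names split_url and classify_host are invented for this example, and the dotted-quad test is deliberately loose (it does not range-check the octets). Nothing here knows where the registered domain starts; that needs outside knowledge of how each TLD is carved up, which is exactly the nas.nasa.gov problem.

```perl
use strict;
use warnings;

# Hypothetical helper: break a URL into scheme, host, port and path.
sub split_url
{
    my ($url) = @_;
    my ($scheme, $host_port, $path) = ($url =~ m!^([^:]+)://([^/]+)(.*)$!)
        or return;                       # not of the form scheme://host...
    my ($host, $port) = ($host_port =~ m/^([^:]+):(\d+)$/)
        ? ($1, $2) : ($host_port, '');
    return ($scheme, $host, $port, $path);
}

# Hypothetical helper: flag an IPv4-looking host, otherwise return the
# labels of the name ordered from the TLD inwards.
sub classify_host
{
    my ($host) = @_;
    return ('ip') if $host =~ m/^\d{1,3}(?:\.\d{1,3}){3}$/;   # loose check only
    return ('name', reverse split /\./, lc $host);
}

for my $url (qw( http://foo.bar.com:12345/uri_path_stuff_here http://127.0.0.1/ )) {
    my ($scheme, $host, $port, $path) = split_url($url);
    my ($kind, @labels) = classify_host($host);
    print "$host port='$port' kind=$kind labels=@labels\n";
}
```

Walking @labels from the front is the "working backwards from the TLD" step; deciding how many labels make up the domain is left to whatever validation rules you settle on.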
ciao drieux
