This is not really specific to PHP (although the information might be useful
for all that form validation we all do), and for that I apologize in advance
(does anyone know of a regex mailing list?), but maybe someone here can help
with the following:

I find no good regex for checking valid domain names.  None that I have seen
take into account the fact that, although dashes ("-") and dots (".") are
allowed in a domain name, the domain name can neither begin with nor end
with a dash or dot, and additionally, two dashes or two dots in a row are
not allowed and a dash followed by a dot or a dot followed by a dash are not
allowed.

So, I've come up with two regexes for checking domain names. The first one
checks that the name contains only alphanumerics, the dash and the dot, and
neither begins nor ends with a dash or dot:

   ^[a-z0-9]$|^[a-z0-9]+[a-z0-9.-]*[a-z0-9]+$

The second one matches the sequences that are not allowed: two dashes in a
row, two dots in a row, a dash followed by a dot, or a dot followed by a dash:

   --|\.\.|-\.|\.-

Putting it all together, the way I check for a valid domain name is with the
following:

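   // i = case-insensitive; D = "$" matches only at the very end of the string
   $shape_ok = preg_match('/^[a-z0-9]$|^[a-z0-9]+[a-z0-9.-]*[a-z0-9]+$/iD', $domain_name) === 1;
   $bad_pair = preg_match('/--|\.\.|-\.|\.-/', $domain_name) === 1;
   if (!$shape_ok || $bad_pair)
   {
      // handle the invalid domain name here
   }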

So, my question (finally!) is this:

Is there any way to combine both expressions (basically, one part that
checks for false and one part that checks for true) into one regex that just
returns true or false?  I haven't been able to find any documentation that
shows me how to do that, basically a "like this but not like this" syntax.
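
One possible answer, as a minimal sketch: PCRE (the preg_ functions) supports
a negative lookahead, (?!...), which gives a "like this but not like this"
test in a single pattern. The function names below are made up for
illustration, and the sketch assumes preg_match() is available; the POSIX
ereg functions have no lookahead. The second function is an alternative that
avoids lookahead by describing the name as runs of alphanumerics separated by
exactly one dash or dot.

```php
<?php
// Sketch only: both function names are hypothetical, not from any library.

function is_valid_domain_syntax(string $name): bool
{
    // (?!.*(?:...)) fails the match if any forbidden pair appears anywhere.
    // The rest is the original "allowed characters, alphanumeric at both ends" pattern.
    $pattern = '/^(?!.*(?:--|\.\.|-\.|\.-))(?:[a-z0-9]|[a-z0-9]+[a-z0-9.-]*[a-z0-9]+)$/iD';
    return preg_match($pattern, $name) === 1;
}

function is_valid_domain_syntax_simple(string $name): bool
{
    // One or more alphanumerics, then zero or more groups of
    // "a single dash or dot followed by one or more alphanumerics".
    return preg_match('/^[a-z0-9]+(?:[.-][a-z0-9]+)*$/iD', $name) === 1;
}
```

Both should accept names like example.com and my-site.example.com and reject
names like -example.com, example..com and example-.com.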

BTW, anticipating someone mentioning the fact that the above regexes don't
check for a domain name ending with a dot followed by three characters max
(as in .com, .net, etc.), it's because that long-held truth is no longer
true.  We now have .info and .museum, and who knows what the future will
bring.

About the only truth left is that domain names end in a dot followed by two
characters minimum (there are the country code domains like .us, .de, etc.,
but there are no one-character TLDs at present and I would expect perhaps
not for a long, long time, but you never know).  Perhaps someone could expand
on the regex above to include checking for a name ending with a dot followed
by two characters minimum; I just haven't been into regexes long enough to
know how.
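
As a rough sketch of that extension: the pattern below requires the name to
end in a dot followed by at least two characters, using PCRE's {2,}
quantifier. The function name is hypothetical, and treating the final part as
any two or more alphanumerics is an assumption; it could be narrowed to
letters only.

```php
<?php
// Sketch only: has_valid_domain_shape() is a hypothetical helper name.
function has_valid_domain_shape(string $name): bool
{
    // Alphanumeric runs separated by single dashes or dots,
    // then a final dot followed by two or more alphanumerics.
    $pattern = '/^[a-z0-9]+(?:[.-][a-z0-9]+)*\.[a-z0-9]{2,}$/iD';
    return preg_match($pattern, $name) === 1;
}
```

With this pattern, example.museum and example.de should pass, while example.c
and a bare example (no dot at all) should fail.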

Of course, you could get really anal about all this and check for domain
names that only end in the current ICANN root server TLDs (about 260 or so,
I believe), but that wouldn't account for TLDs that operate within other
root servers (there's always sumthin').

Anyway, any help with the above is certainly appreciated!

Jeff


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
