>>>>> "AM" == Alexey Mishustin <shum...@shumkar.ru> writes:

  AM> /(www.){0,1}(google\.).*\/(imgres)|(images)|(products)\?{0,1}/

{0,1} is just ? by itself.

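you don't need to capture things that you don't use later on. also, why
capture each trailing word in its own group? that makes it hard to tell
which word was there. worse, the top-level | splits the whole regex into
three alternatives, so anything containing 'images' or 'products' will
match, google or not.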
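
a small sketch of the difference between one group and several. the
sample URL and the sub names are made up for illustration. with a single
group around the alternation the word is always in $1; with one group
per word you have to check which group got set:

```perl
use strict;
use warnings;

# hypothetical sample URL, not from the original post
my $url = 'http://www.google.com/images?q=perl';

# one group around the whole alternation: the word is always in $1
sub word_from_single_group {
    my ($text) = @_;
    return $text =~ m{/(imgres|images|products)} ? $1 : undef;
}

# one group per word: you have to check which group was set
sub word_from_separate_groups {
    my ($text) = @_;
    if ( $text =~ m{/(imgres)|/(images)|/(products)} ) {
        return defined $1 ? $1 : defined $2 ? $2 : $3;
    }
    return undef;
}

print "single group:    ", word_from_single_group($url), "\n";
print "separate groups: ", word_from_separate_groups($url), "\n";
```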

the . after www needs to be escaped (it is unlikely ever to be other
than a real dot, but it is good practice and correct to escape it).
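
to show what the unescaped dot lets through, here is a quick sketch. the
host names are invented and the patterns are anchored only to keep the
example short:

```perl
use strict;
use warnings;

my $unescaped = qr/^(www.)?google\./;     # . matches any character
my $escaped   = qr/^(www\.)?google\./;    # \. matches only a literal dot

# hypothetical host names, made up for this example
for my $host ( 'www.google.com', 'wwwXgoogle.com' ) {
    printf "%-16s unescaped: %-3s escaped: %s\n",
        $host,
        ( $host =~ $unescaped ? 'yes' : 'no' ),
        ( $host =~ $escaped   ? 'yes' : 'no' );
}
```

the second host matches only the unescaped version, because the bare .
happily accepts the X.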

using alternate delimiters means you don't need to escape / which makes
it easier to read.
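
for comparison, the same (made-up) pattern written with both kinds of
delimiters. the URL is only a sample:

```perl
use strict;
use warnings;

# the same pattern twice; only the delimiters differ
my $with_slashes = qr/google\.com\/images\//;    # every / must be escaped
my $with_braces  = qr{google\.com/images/};      # no escaping of / needed

# hypothetical sample URL
my $url = 'http://www.google.com/images/logo.png';

print "slashes: ", ( $url =~ $with_slashes ? "match" : "no match" ), "\n";
print "braces:  ", ( $url =~ $with_braces  ? "match" : "no match" ), "\n";
```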

finally, when the regex gets this complex, use the /x modifier and
comment the parts (untested):

        m{
          (?:www\.)?    # optional leading www. (grouped, not grabbed)
          google\.      # must have google.
          .*?           # skip some text MINIMALLY
          /             # required slash
          (imgres|images|products)      # grab the following token (is
                                        # it needed?)
          \??           # optional url arg separator ?
        }x
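
a rough way to try a pattern like this out. this is only a sketch: the
sub name google_section and the sample URLs are invented, and it assumes
nothing is required after the optional ?, since the original regex has
nothing there either:

```perl
use strict;
use warnings;

# assumption: nothing is required after the optional ?, as in the original regex
my $google_re = qr{
    (?:www\.)?                  # optional leading www.
    google\.                    # must have google.
    .*?                         # skip some text minimally
    /                           # required slash
    (imgres|images|products)    # capture the token
    \??                         # optional url arg separator ?
}x;

# returns the captured token, or undef when the URL does not match
sub google_section {
    my ($url) = @_;
    return $url =~ $google_re ? $1 : undef;
}

# hypothetical sample URLs
for my $url (
    'http://www.google.com/imgres?imgurl=x',
    'http://google.co.uk/products',
    'http://www.google.com/search?q=perl',
    'http://example.com/images',
) {
    my $section = google_section($url);
    print "$url => ", ( defined $section ? $section : 'no match' ), "\n";
}
```

the last two URLs should report no match: one has no imgres, images or
products path, the other has no google. in it.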

uri

-- 
Uri Guttman  ------  u...@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------
