On Fri, Apr 27, 2001 at 08:32:21AM -0400, Timothy Kimball ([EMAIL PROTECTED]) spew-ed 
forth:
> 
> grep() will do it easily:
> 
> @lines = grep { ! /\.txt$|\.scc$/ } @lines;
> 
> or do it when you read the file:
> 
> @lines = grep { ! /\.txt$|\.scc$/ } <INFO>;

You are making the RE engine do a lot of work there.

[root@fluffhead /root]# perl -Mre=debug -wle '$_=qq{foo.txt}; /\.(txt|foo)$/';
Compiling REx `\.(txt|foo)$'
size 14 first at 1
   1: EXACT <.>(3)
   3: OPEN1(5)
   5:   BRANCH(8)
   6:     EXACT <txt>(11)
   8:   BRANCH(11)
   9:     EXACT <foo>(11)
  11: CLOSE1(13)
  13: EOL(14)
  14: END(0)
anchored `.' at 0 (checking anchored) minlen 4 
Guessing start of match, REx `\.(txt|foo)$' against `foo.txt'...
Found anchored substr `.' at offset 3...
Guessed: match at offset 3
Matching REx `\.(txt|foo)$' against `.txt'
  Setting an EVAL scope, savestack=3
   3 <foo> <.txt>         |  1:  EXACT <.>
   4 <foo.> <txt>         |  3:  OPEN1
   4 <foo.> <txt>         |  5:  BRANCH
  Setting an EVAL scope, savestack=3
   4 <foo.> <txt>         |  6:    EXACT <txt>
   7 <foo.txt> <>         | 11:    CLOSE1
   7 <foo.txt> <>         | 13:    EOL
   7 <foo.txt> <>         | 14:    END
Match successful!
Freeing REx: `\.(txt|foo)$'

versus:

[root@fluffhead /root]# perl -Mre=debug -wle '$_=qq{foo.txt}; /\.txt$|\.foo$/' 
Compiling REx `\.txt$|\.foo$'
size 9 
   1: BRANCH(5)
   2:   EXACT <.txt>(4)
   4:   EOL(9)
   5: BRANCH(9)
   6:   EXACT <.foo>(8)
   8:   EOL(9)
   9: END(0)
minlen 4 
Matching REx `\.txt$|\.foo$' against `foo.txt'
  Setting an EVAL scope, savestack=3
   0 <> <foo.txt>         |  1:  BRANCH
  Setting an EVAL scope, savestack=3
   0 <> <foo.txt>         |  2:    EXACT <.txt>
                              failed...
   0 <> <foo.txt>         |  6:    EXACT <.foo>
                              failed...
                            failed...
  Setting an EVAL scope, savestack=3
   1 <f> <oo.txt>         |  1:  BRANCH
  Setting an EVAL scope, savestack=3
   1 <f> <oo.txt>         |  2:    EXACT <.txt>
                              failed...
   1 <f> <oo.txt>         |  6:    EXACT <.foo>
                              failed...
                            failed...
  Setting an EVAL scope, savestack=3
   2 <fo> <o.txt>         |  1:  BRANCH
  Setting an EVAL scope, savestack=3
   2 <fo> <o.txt>         |  2:    EXACT <.txt>
                              failed...
   2 <fo> <o.txt>         |  6:    EXACT <.foo>
                              failed...
                            failed...
  Setting an EVAL scope, savestack=3
   3 <foo> <.txt>         |  1:  BRANCH
  Setting an EVAL scope, savestack=3
   3 <foo> <.txt>         |  2:    EXACT <.txt>
   7 <foo.txt> <>         |  4:    EOL
   7 <foo.txt> <>         |  9:    END
Match successful!
Freeing REx: `\.txt$|\.foo$'

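When you look at the trace for /\.txt$|\.foo$/ you can see what the RE engine
is doing (much easier to see if you use -Mre=debugcolor). It has no fixed
substring to look for, so it tries the match at each position in turn:
foo.txt -> oo.txt -> o.txt -> .txt
and at every one of those positions it tries \.txt$ and then, if that fails,
\.foo$.
The first one, /\.(txt|foo)$/, immediately finds the anchored `.', jumps
straight to offset 3, and has fewer alternatives to try from there:
txt -> (if that fails) foo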
A really good book to read is Mastering Regular Expressions, if you are new to
regular expressions (of course, read perlre), and a great article on how the RE
engine works is at http://perl.plover.com/Regex/article.html 

Cheers,
Kevin

-- 
[Writing CGI Applications with Perl - http://perlcgi-book.com]
It would be easier to pay off the national debt overnight than to neutralize
the long-range effects of OUR NATIONAL STUPIDITY.
        -- Frank Zappa
