Re: More Help: Complex Regex

Curtis Jewell Tue, 24 Apr 2001 13:09:24 -0700
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 24 Apr 2001 [EMAIL PROTECTED] wrote:

> Thank you very much for all the great help I received earlier on extracting
> numbers from text.  There is only one thing I forgot about:
>
> Some of the files have HTML headers and footers.  I don't want any data
> inside HTML brackets.   I tried:
>
>   s/<*>//g;

Try perl -e '$x = "<<<<>x"; $x =~ s/<*>//g; print $x . "\n"'

It prints out "x".

Your regex will match strings like >, <>, <<>, <<<<<<<> (i.e. zero or
more less-thans followed by a greater-than) because the * applies only to
the less-than symbol immediately before it.

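Try s/<[^>]*>//g - the [^>] means "any character except a greater-than",
so this will eliminate a less-than, then 0 or more characters that aren't
a greater-than, then a greater-than. (s/<.*>//g looks simpler, but .* is
greedy: it would run from the first less-than on the line to the last
greater-than and take any data between two tags with it.)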
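
A quick way to see how greediness affects tag stripping is to compare .*
with a negated character class on one sample line. The sample string and
variable names below are made up for illustration; this is a rough sketch,
not a real HTML parser:

```perl
use strict;
use warnings;

# Made-up sample line: numbers sitting between HTML tags.
my $line = '<td width="40">12.5</td> and <b>7</b>';

# Greedy: .* runs from the first < to the last > on the line.
(my $greedy = $line) =~ s/<.*>//g;

# Negated class: each match stops at the first > after its <.
(my $tags_only = $line) =~ s/<[^>]*>//g;

print "greedy:    '$greedy'\n";
print "tags only: '$tags_only'\n";
```

On this sample the greedy version leaves an empty string, while the
negated class leaves "12.5 and 7". Neither copes with a tag that is split
across two lines if you read the file line by line.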

- --Curtis

> I don't understand why this doesn't work. I actually just need s/<*\d*>//g
> because the other expressions are automatically taken care of by the
> expressions that delete all text.  Why doesn't the above work for removing
> html tags?  I have several perl books and none say the character "<" is
> reserved. "\<" doesn't work either.
>
> Thanks for any help.  (Actually, thanks for writing my program for me;
> although I'm trying hard to do it myself.)   ;-)
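
On s/<*\d*>//g: it has the same problem. It reads as "zero or more
less-thans, zero or more digits, then a greater-than", so it can never
match across a tag name. The "<" isn't reserved, which is why "\<" behaves
exactly the same. A small sketch, with a made-up sample string:

```perl
use strict;
use warnings;

# Hypothetical sample: a number wrapped in a bold tag.
my $line = '<b>42</b>';

# <*\d*> = zero or more '<', zero or more digits, then '>'.
# No match can contain a letter or a '/', so only the bare '>'
# characters are removed and the rest of each tag stays behind.
(my $out = $line) =~ s/<*\d*>//g;

print "$out\n";
```

This prints "<b42</b": the two greater-thans are gone and nothing else.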

- -- 
Curtis Jewell          http://curtis.livejournal.com/
[EMAIL PROTECTED]      http://web.missouri.edu/~csjc05/
[EMAIL PROTECTED] http://new-york.utica.mission.net/
Public Key: http://web.missouri.edu/~csjc05/curtis.key.txt

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE65eISNGcErwayIw4RAiMqAJ9vzuLOMP7q6s+GzSYp/EpA/hmxSwCfdhCx
6XHvnzKseahRIpFcMITuFWI=
=9X1p
-----END PGP SIGNATURE-----