John McKown wrote:
>On Sat, 24 Jan 2004, Marcelo wrote:
>
>> Which regular expression would you use to remove the <title> and
>> </title> from a line like this one:
>>
>> <title>Here goes a webpage's title</title>
>>
>> Thanks a lot in advance.
>>
>
>Did you want that _exact_ input? I.e., always <title>...</title>? If so,
>that's rather easy.
>
>$line =~ s/<title>(.*)<\/title>/$1/;
>
>Now, if you want the more general form of <any_tag>...</any_tag>, that is,
>removing paired HTML tags, that's more difficult. Luckily, there is an
>example in "Programming Perl, 3rd Edition" on page 184 which is close.
>
>$line =~ s/<(.*?)>(.*?)<\/\1>/$2/;
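
To make the two substitutions above concrete, here is a minimal sketch. The helper names strip_title and strip_pair are hypothetical, and the general form assumes the tag name is captured without its angle brackets so that \1 can match it in the closing tag. The last call shows where this approach stops working.

```perl
use strict;
use warnings;

# Hypothetical helper: strip one literal <title>...</title> pair.
sub strip_title {
    my ($line) = @_;
    $line =~ s/<title>(.*)<\/title>/$1/;
    return $line;
}

# Hypothetical helper: strip the first <tag>...</tag> pair on the line.
# $1 captures the tag name; \1 requires the same name in the closing tag.
sub strip_pair {
    my ($line) = @_;
    $line =~ s/<(.*?)>(.*?)<\/\1>/$2/;
    return $line;
}

print strip_title("<title>Here goes a webpage's title</title>"), "\n";
print strip_pair("<b>bold</b> text"), "\n";

# An attribute defeats the general form: $1 becomes 'a href="x"', which
# never matches the closing tag name 'a', so the line comes back unchanged.
print strip_pair('<a href="x">link</a>'), "\n";
```

Both patterns also assume the whole element sits on a single line, since . does not match a newline unless the /s modifier is added.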
I remember reading that using regular expressions to parse HTML is not reliable, since tags can carry attributes, nest, or span several lines. You should use HTML::Parser from CPAN.

HTH,

Jan
--
Either this man is dead or my watch has stopped. - Groucho Marx
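
For reference, a minimal sketch of the parser route, assuming the CPAN module HTML::Parser (version 3 event API) is installed. The helper name extract_title is hypothetical, and this is only one way to wire up the handlers.

```perl
use strict;
use warnings;
use HTML::Parser;    # CPAN module, not part of core Perl

# Hypothetical helper: collect the text found inside <title> elements.
sub extract_title {
    my ($html) = @_;
    my ($in_title, $title) = (0, '');

    my $parser = HTML::Parser->new(
        api_version => 3,
        # 'tagname' passes the lower-cased tag name to the handler.
        start_h => [ sub { $in_title = 1 if $_[0] eq 'title' }, 'tagname' ],
        end_h   => [ sub { $in_title = 0 if $_[0] eq 'title' }, 'tagname' ],
        # 'dtext' passes the text with entities decoded; it may arrive in chunks.
        text_h  => [ sub { $title .= $_[0] if $in_title }, 'dtext' ],
    );
    $parser->parse($html);
    $parser->eof;
    return $title;
}

print extract_title(q{<title lang="en">Here goes a webpage's title</title>}), "\n";
```

Unlike the substitution, this does not care whether the tag has attributes or how the markup is split across lines.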