At 15:06 27/06/2011 +0100, Séamas Ó Brógáin wrote:
I wonder if some regex wizard knows of a way to
delete HTML (or XML or whatever) tags from a
text file: in other words, selecting everything
from each occurrence of < to the next occurrence of >.
I'm not sure I qualify as a wizard, but I'll rise
to the bait! Try searching for
<[^>]*>
and replacing with nothing. The circumflex means
"not", so the bracketed expression [^>] matches any
single character that is not ">". The asterisk extends
this to zero or more such characters. With a literal
"<" in front and ">" behind, the whole pattern matches
one tag, from "<" to the next ">".
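If you would rather do this outside the editor, the same pattern can be tried in a script. What follows is only a minimal sketch in Python, assuming its standard re module; the names TAG_PATTERN and strip_tags are my own, made up for illustration, and not part of any tool.

```python
import re

# The same pattern as above: "<", then zero or more characters
# that are not ">", then ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(text: str) -> str:
    """Return text with every <...> span replaced by nothing."""
    return TAG_PATTERN.sub("", text)


if __name__ == "__main__":
    sample = "<p>Hello, <b>world</b>!</p>"
    print(strip_tags(sample))
```

A "<" with no ">" after it is left alone, since the pattern needs both ends to match. This is plain text matching rather than real HTML parsing, so a ">" inside an attribute value or a comment would end the match early.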
I trust this helps.
Brian Barker