At 15:06 27/06/2011 +0100, Séamas Ó Brógáin wrote:
I wonder if some regex wizard knows of a way to delete HTML (or XML or whatever) tags from a text file: in other words, selecting everything from each occurrence of < to the next occurrence of >.

I'm not sure I qualify as a wizard, but I'll rise to the bait! Try searching for
<[^>]*>
and replacing with nothing, with regular expressions enabled in the Find & Replace dialogue. Inside square brackets, a leading circumflex means "not", so [^>] matches any single character which is not ">". The asterisk extends this to match zero or more such characters. With "<" in front and ">" behind, the whole expression matches everything from each "<" to the next ">", which is what you need.
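
If you ever want to do the same thing outside the Find & Replace dialogue, the same expression works in most regex engines. Here is a minimal sketch in Python, purely as an illustration: the helper name strip_tags and the sample string are my own inventions, not anything from the original question.

```python
import re

# The pattern from the message: "<", then zero or more characters
# that are not ">", then the closing ">".
TAG = re.compile(r"<[^>]*>")

def strip_tags(text: str) -> str:
    """Delete every <...> span from text (hypothetical helper name)."""
    return TAG.sub("", text)

sample = "<p>Hello, <b>world</b>!</p>"
print(strip_tags(sample))  # Hello, world!
```

Note that this treats the tags purely as text: a ">" inside an attribute value or a comment would end the match early, so it suits simple clean-up jobs rather than full HTML parsing.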

I trust this helps.

Brian Barker


