If anyone needs to parse URIs, the following regex handles it:

    | regStr regex |

    regStr :=
        '((([a-z]\w+\:)',                    "match URL protocol and colon"
        '(/|//|///|[A-Za-z0-9%]))',          "match 1-3 slashes, or a single letter, digit or %"
        '|',                                 "or"
        "match www or www[1-999] followed by ."
        '(((www)|(www\d)|(www\d\d)|(www\d\d\d))[.])',
        '|',                                 "or"
        "domain name"
        '([A-Za-z0-9._\-]+[.]([a-z][a-z]|[a-z][a-z][a-z]|[a-z][a-z][a-z][a-z])/))',
        '(([^\s()<>]+',                      "run of non-space, non ()<>"
        '|',                                 "or"
        "balanced parens up to two levels"
        '\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+)',
        "end with balanced parens up to two levels"
        '(\(([^\s()<>]+|(\([^\s()<>]+\)))*\)',
        '|',                                 "or"
        "not a space or one of these punctuation characters"
        '[^\s`!()[]{};:''".,<>?«»“”‘’])'.
                   
    regex := RxMatcher forString: regStr.
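
A minimal usage sketch, continuing in the same workspace and assuming the
Regex package (the one that provides RxMatcher) is loaded. The sample strings
are made up for illustration. search: looks for the first match anywhere in
the string, subexpression: 1 then returns the whole matched text, and
matchesIn: collects every match:

    "Print the first URI found in the text, if any"
    (regex search: 'See http://example.com/page for details')
        ifTrue: [Transcript show: (regex subexpression: 1); cr].

    "Print every URI found in the text"
    (regex matchesIn: 'Try http://example.com/a and www2.example.org/b')
        do: [:uri | Transcript show: uri; cr].

I have not listed the expected output here; it is worth running this against
your own sample text to check that the matches are what you expect.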
