On Thu, Jul 24, 2008 at 2:37 PM, John Campbell <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 24, 2008 at 2:19 PM, Michael B Allen <[EMAIL PROTECTED]> wrote:
>> Does anyone have a good PCRE for matching URLs?
>>
>> Or perhaps someone can improve (or correct) the expression I'm using
>> currently:
>>
>> $expr = '[a-zA-Z0-9]{1,10}://[a-zA-Z0-9.-]+[\p{L}/~._-]*|mailto:[EMAIL PROTECTED]';
>>
>
> I am not sure I completely understand what you are trying to do, but
> it doesn't look like you are matching + or %.

You mean in the path? In the path I suppose I should permit quite a few
more characters (I forgot 0-9 too). This makes the expression:

$expr = '[a-zA-Z0-9]{1,10}://[a-zA-Z0-9.-]+[\p{L}0-9!$%&\\()+./;=^_~-]*|mailto:[EMAIL PROTECTED]';

> What is the context for the matching?

This will be used to pick out URLs in Creole Wiki markup, which
incidentally is not supposed to match characters that can occur
naturally at the end of a sentence (,.?!:;"'), so I guess I need to
leave out '.' and ';' for my particular application.

So given the markup:

Please visit http://www.yahoo.com/usèrs+100%&lusers$/~jerry/y_a-n.g/Yahoo;=^(!)foo.

the regex should match (minus the dot at the end):

[http://www.yahoo.com/usèrs+100%&lusers$/~jerry/y_a-n.g/Yahoo;=^(!)foo]

although in practice a URL this crazy should probably be formalized
with square brackets as defined by Creole for links.

Mike

--
Michael B Allen
PHP Active Directory SPNEGO SSO
http://www.ioplex.com/
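
A minimal sketch of the trailing-punctuation rule described above. It
is only one possible approach, not a definitive pattern. Instead of
dropping '.' and ';' from the path class entirely, it keeps them and
adds a negative lookbehind so the match cannot end on a sentence
character. The function name find_urls, the '#' delimiter and the use
of the 'u' modifier (which assumes the text is UTF-8) are my own
choices for illustration. The mailto branch is left out because its
address part is not shown in the thread.

```php
<?php
// Hypothetical helper: returns every URL-like match found in $text.
function find_urls($text)
{
    $pattern = '#[a-zA-Z0-9]{1,10}://'        // scheme, e.g. http://
             . '[a-zA-Z0-9.-]+'               // host
             . '[\p{L}0-9!$%&()+./;=^_~-]*'   // path; hyphen last so it is literal
             . '(?<![,.?!:;"\'])'             // must not end on sentence punctuation
             . '#u';                          // u: pattern and subject are UTF-8

    // preg_match_all() returns false on error (for example invalid UTF-8).
    if (preg_match_all($pattern, $text, $matches) === false) {
        return array();
    }
    return $matches[0];
}

$text = 'Please visit http://www.yahoo.com/usèrs+100%&lusers$/~jerry/y_a-n.g/Yahoo;=^(!)foo.';
foreach (find_urls($text) as $url) {
    echo $url, "\n";
}
// prints: http://www.yahoo.com/usèrs+100%&lusers$/~jerry/y_a-n.g/Yahoo;=^(!)foo
```

The path class is greedy, so it first swallows the final dot. The
lookbehind then fails and the engine backs off one character at a time
until the match ends on something that is not in the sentence
punctuation set. A '.' or ';' in the middle of the path is still
matched.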