This one time, at band camp, Jeff Waugh said:
>I need to get the first URL found in a file or stdin. Much like urlview (man
>urlview for a regexp), but without all the UI guff. Think procmail...
>
>As usual, least amount of processes spawned, most minimal software used, and
>shortest length wins. ;)

sed, which is nice and small, and has a shorter bootup time than perl, just
the 1 process... but the regex is a killer.

oh wait, did you only want the first? damn. pipe it to head -1.

-- 
jamesw

Always two there are; a Bastard, and a PFY.

#!/bin/sed -nf
# snarfs urls from stdin
# (c) 2001 Jamie Wilkinson <[EMAIL PROTECTED]>
# released under the GPL and all that.

# known bugs: the first regex (commented) will match urls that have no protocol://
# but for urls that *do* have them and start with www or w3 or web, the protocol://
# will be stripped.

# ultramegamega regex
#s#^.*\(\(\(\(\(\(http\(s\|\)\|ftp\)://\)\|\(mailto\|news\):\)\|\(www\|w3\|web\)\.\)[-a-zA-Z0-9]\+\(\.[-a-zA-Z0-9]\+\)*\|file://\)\(\/[^' ()<>"]*\)*\).*$#\1#p

# mega regex
s#^.*\(\(\(\(\(http\(s\|\)\|ftp\)://\)\|\(mailto\|news\):\)[-a-zA-Z0-9]\+\(\.[-a-zA-Z0-9]\+\)*\|file://\)\(\/[^' ()<>"]*\)*\).*$#\1#p
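
If only the first URL is wanted, one way to skip the extra head process is to have sed quit by itself after the first successful substitution. The following is a rough sketch, not part of the original script: the function name first_url and the temporary script file are made up for illustration, and it assumes GNU sed, since the regex leans on the \| and \+ extensions. The regex itself is the "mega regex" from the script, unchanged.

```shell
#!/bin/sh
# Sketch only: first_url is a hypothetical helper name.
# Assumes GNU sed (the regex uses the \| and \+ extensions).
first_url() {
    script=$(mktemp) || return 1
    # Same "mega regex" as in the script, plus a branch: "t" jumps to
    # :found only when the substitution succeeded, and "q" then quits
    # so later lines are never read.
    cat > "$script" <<'EOF'
s#^.*\(\(\(\(\(http\(s\|\)\|ftp\)://\)\|\(mailto\|news\):\)[-a-zA-Z0-9]\+\(\.[-a-zA-Z0-9]\+\)*\|file://\)\(\/[^' ()<>"]*\)*\).*$#\1#p
t found
b
:found
q
EOF
    sed -n -f "$script" "$@"
    status=$?
    rm -f "$script"
    return "$status"
}
```

With that defined, something like `first_url < mail.txt` (mail.txt being a placeholder file name) should print one URL and stop. Note that because the leading ^.* is greedy, a line holding several URLs yields the last one on that line, so "first" here means the first line that contains a URL.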