This one time, at band camp, Jeff Waugh said:
>I need to get the first URL found in a file or stdin. Much like urlview (man
>urlview for a regexp), but without all the UI guff. Think procmail...
>
>As usual, least amount of processes spawned, most minimal software used, and
>shortest length wins. ;)

sed is nice and small, starts up faster than perl, and it's just the 1
process... but the regex is a killer.

oh wait, did you only want the first?  damn.  pipe it to head -n 1.
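
for what it's worth, a rough sketch of how that pipeline might be wired up.
assumptions, loudly: GNU sed (the regex leans on \| and \+), the whitespace
inside the bracket expression spelled as [:space:], and the first_url name
and sample input are made up for the demo.

```shell
#!/bin/sh
# Sketch only: assumes GNU sed, since the regex uses the \| and \+ extensions.
# The temp file stands in for the attached script; first_url is a hypothetical helper.
script=$(mktemp)
trap 'rm -f "$script"' EXIT

# The "mega regex" from the attached script, on one line, with the
# whitespace in the bracket expression written as [:space:] (an assumption).
cat > "$script" <<'EOF'
s#^.*\(\(\(\(\(http\(s\|\)\|ftp\)://\)\|\(mailto\|news\):\)[-a-zA-Z0-9]\+\(\.[-a-zA-Z0-9]\+\)*\|file://\)\(\/[^'[:space:]()<>"]*\)*\).*$#\1#p
EOF

# sed -n prints only the lines where the substitution matched;
# head -n 1 keeps the first of those.
first_url() { sed -n -f "$script" | head -n 1; }

printf '%s\n' \
  'no links here' \
  'see http://example.com/a/b.html for details' \
  'and ftp://example.org/pub too' | first_url
# prints http://example.com/a/b.html
```

note the leading ^.* is greedy, so when one line holds several urls the last
one on that line is what gets printed; head only picks the first matching line.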

-- 
jamesw

Always two there are; a Bastard, and a PFY.
#!/bin/sed -nf
# snarfs urls from stdin
# (c) 2001 Jamie Wilkinson <[EMAIL PROTECTED]>
# released under the GPL and all that.
# known bugs:  the first regex (commented out) will also match urls that have no
# protocol://, but for urls that *do* have one and start with www or w3 or web,
# the protocol:// will be stripped.

# ultramegamega regex
#s#^.*\(\(\(\(\(\(http\(s\|\)\|ftp\)://\)\|\(mailto\|news\):\)\|\(www\|w3\|web\)\.\)[-a-zA-Z0-9]\+\(\.[-a-zA-Z0-9]\+\)*\|file://\)\(\/[^'[:space:]()<>"]*\)*\).*$#\1#p

# mega regex
s#^.*\(\(\(\(\(http\(s\|\)\|ftp\)://\)\|\(mailto\|news\):\)[-a-zA-Z0-9]\+\(\.[-a-zA-Z0-9]\+\)*\|file://\)\(\/[^'[:space:]()<>"]*\)*\).*$#\1#p
