-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>>>>> "CB" == Chandrashekar B <[EMAIL PROTECTED]> writes:

    CB> Binand Sethumadhavan wrote:
    >> It also won't work in a directory layout like this:
    >>
    >> .
    >> ./foo.html
    >> ./morehtmlfiles
    >> ./morehtmlfiles/a.html
    >> ./morehtmlfiles/b.html
    >> ./morehtmlfiles/c.html
    >> ...
    >>
    CB> A quick and dirty modification to the command as below must
    CB> handle the above case:
    CB>
    CB>   ( for i in `find . -type f | grep "\.html$"`; do ./href.pl < $i; done ) > url_list.txt

    CB> Then again, all this with the assumption that the relevant
    CB> files end with .html . This solution isn't bullet-proof
    CB> either.

This one seems to work for files with spaces in their names, files with
extensions other than .html, case-insensitive HTML tags and wrapped
hrefs like this:

       <a
          href="a.very.long.url.here.that.gets.wrapped.by.smart.editors">

find . -type f -print0 | xargs -0 file | grep -F 'HTML document' | cut -d: -f1 |
while IFS= read -r f ; do perl -0777 -ne 'while (/<a\s+href="([^">]*)"/gi) { print "$1\n" }' "$f" ; done

If you want me to explain it, it'll cost you lots of mango coladas at
CCD -- the real thing, not promises ;)

Regards,

- -- Raju
- -- 
Raj Mathur                [EMAIL PROTECTED]      http://kandalaya.org/
       GPG: 78D4 FC67 367F 40E2 0DD5  0FEF C968 D0EF CC68 D17F
                      It is the mind that moves
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQFBKAuGyWjQ78xo0X8RAsZ4AJ4jI4mVt4R7HaQcneAb5dzeGTjFpgCfas1N
worBQMM+LjHWNVvv7jeyUBg=
=aRXL
-----END PGP SIGNATURE-----


_______________________________________________
linux-india-help mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/linux-india-help
