Hi all, I need some scripting help to extract URLs from a directory of web pages. The script should run on the directory, extract all the URLs from the HTML files, and put them into a new file, one below the other. I did a Google search but could not come up with what I needed. I also need a script that strips all the HTML tags from the web pages, removes common words like "in, an, is, the", and places each word below the previous one.
I have no experience with scripting of any sort. Could someone help me out please? The requirement is rather urgent.

Thanks and regards,
Raghuram.

_______________________________________________
linux-india-help mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/linux-india-help
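[Editor's note: a minimal sketch of the kind of script being asked for, using only the Python standard library. The stopword set, the output filenames `urls.txt` and `words.txt`, and the restriction to `.html`/`.htm` files are assumptions, not part of the original request.]

```python
import os
import re
from html.parser import HTMLParser

# Assumed stopword list; extend as needed.
STOPWORDS = {"in", "an", "is", "the", "a", "of", "to", "and"}


class LinkAndTextExtractor(HTMLParser):
    """Collects href attribute values and visible text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.urls = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record every href attribute, regardless of which tag carries it.
        for name, value in attrs:
            if name == "href" and value:
                self.urls.append(value)

    def handle_data(self, data):
        # Text outside of tags, i.e. the page content with tags stripped.
        self.text_parts.append(data)


def extract(html):
    """Return (urls, words) for one HTML document, stopwords removed."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    words = re.findall(r"[A-Za-z']+", " ".join(parser.text_parts))
    words = [w.lower() for w in words]
    return parser.urls, [w for w in words if w not in STOPWORDS]


def process_directory(directory, url_file="urls.txt", word_file="words.txt"):
    """Scan a directory of HTML files; write one URL and one word per line."""
    all_urls, all_words = [], []
    for name in sorted(os.listdir(directory)):
        if name.endswith((".html", ".htm")):
            path = os.path.join(directory, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                urls, words = extract(f.read())
            all_urls.extend(urls)
            all_words.extend(words)
    with open(url_file, "w") as f:
        f.write("\n".join(all_urls) + "\n")
    with open(word_file, "w") as f:
        f.write("\n".join(all_words) + "\n")
```

Run as `process_directory("/path/to/pages")`; the `href`-based extraction catches links in `<a>` tags and anything else that carries an `href` attribute, which may be narrower or broader than what was intended.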
