Hi all,

I need some scripting help to extract URLs from a directory of web pages. 
The script should run over the directory, extract all the URLs in the HTML 
files, and write them to a new file, one per line. I did a Google search 
but could not find what I needed. I also need a script that strips all the 
HTML tags from the web pages, removes common filler words like "in, an, 
is, the", and writes each remaining word on its own line.
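One possible starting point, using only the Python standard library, is sketched below. It walks a directory of HTML files, collects every href/src URL into one file (one per line), and writes the tag-stripped words, minus a small stopword list, into another. The directory name "pages", the output filenames, and the stopword set are assumptions -- adjust them to your setup.

```python
# Sketch: extract URLs and tag-stripped words from a directory of HTML files.
# The directory ("pages"), output filenames, and STOPWORDS set are assumptions.
import os
import re
from html.parser import HTMLParser

STOPWORDS = {"in", "an", "is", "the", "a", "of", "to", "and"}

class LinkAndTextParser(HTMLParser):
    """Collects href/src URLs and the visible text of an HTML page."""
    def __init__(self):
        super().__init__()
        self.urls = []
        self.text_parts = []
        self._skip = 0  # depth inside <script>/<style>, whose text we ignore

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.text_parts.append(data)

def parse_page(html):
    """Return (urls, words) for one HTML document, stopwords removed."""
    parser = LinkAndTextParser()
    parser.feed(html)
    words = re.findall(r"[A-Za-z0-9'-]+", " ".join(parser.text_parts))
    words = [w.lower() for w in words if w.lower() not in STOPWORDS]
    return parser.urls, words

def run(directory="pages", url_file="urls.txt", word_file="words.txt"):
    all_urls, all_words = [], []
    for name in sorted(os.listdir(directory)):
        if name.endswith((".html", ".htm")):
            path = os.path.join(directory, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                urls, words = parse_page(f.read())
            all_urls.extend(urls)
            all_words.extend(words)
    with open(url_file, "w") as f:
        f.write("\n".join(all_urls) + "\n")
    with open(word_file, "w") as f:
        f.write("\n".join(all_words) + "\n")

if __name__ == "__main__":
    run()
```

Note this only removes the few filler words listed in STOPWORDS; if you also want each word to appear only once, deduplicate all_words before writing it out.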

I have no experience with scripting of any sort. Could someone help me out, 
please? The requirement is rather urgent. 


Thanks and regards,

Raghuram.


_______________________________________________
linux-india-help mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/linux-india-help
