Hi Stephen,
Thanks for your input.
At the moment, I have two efficient functions that take two different
approaches to extracting all URLs from the text of a framed page:
The first one, from Ken Ray, uses a regex with matchText; the other one,
which I wrote, uses items with quote as the item delimiter.
Ken's solution is a bit slow but more reliable than mine, which is much
faster but a little bit silly ;-)
I'll keep digging and will share the solutions on this list once they
are solid enough.
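
To illustrate the trade-off between the two approaches, here is a minimal
sketch in Python. It is not the Transcript code mentioned above (which is
not shown in this message); the function names, the regex pattern and the
"starts with http" filter are all assumptions made for the example.

```python
import re

# Assumption: URLs of interest appear as quoted src/href attribute values.
URL_ATTR = re.compile(r'''(?:src|href)\s*=\s*(["'])(.*?)\1''', re.IGNORECASE)

def urls_by_regex(html):
    """Regex approach: stricter, keeps only src/href attribute values."""
    return [m.group(2) for m in URL_ATTR.finditer(html)]

def urls_by_quote_split(html):
    """Item-delimiter approach: split on double quotes and keep the
    quoted items that look like absolute URLs. Fast, but naive: it
    ignores single-quoted and relative URLs."""
    items = html.split('"')
    # Items at odd indexes are the ones enclosed between two quotes.
    return [s for s in items[1::2] if s.startswith(("http://", "https://"))]
```

On well-formed markup with double-quoted absolute URLs both functions
return the same list; they diverge as soon as the page uses single quotes
or relative paths, which is where the regex version is more reliable.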
On 24 May 05 at 20:02, Stephen Barncard wrote:
I would look for the word frameset in a tag inside a page, then get
all the valid URLs inside the frame. Then I would check each URL
for size (or number of lines) and pick the largest file. That
will be where the main content is.
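
The heuristic Stephen describes could be sketched roughly as follows, in
Python. This is only one possible reading of it: the names
main_content_url and fetch are hypothetical, size is measured in bytes
rather than lines, and the frame pattern assumes quoted src attributes.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: frames are declared as <frame ... src="..."> with quoted values.
FRAME_SRC = re.compile(r'''<frame\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1''',
                       re.IGNORECASE | re.DOTALL)

def fetch(url):
    """Download a URL and return its raw bytes."""
    with urlopen(url) as resp:
        return resp.read()

def main_content_url(page_url, html, fetch=fetch):
    """Return the frame URL with the largest body, or None if the page
    has no frameset or no frame with a src attribute."""
    if "<frameset" not in html.lower():
        return None
    # Resolve relative frame sources against the URL of the framing page.
    urls = [urljoin(page_url, m.group(2)) for m in FRAME_SRC.finditer(html)]
    if not urls:
        return None
    # The largest document is taken to be the main content.
    return max(urls, key=lambda u: len(fetch(u)))
```

Passing fetch as a parameter keeps the selection logic separate from the
network access, so the size check can be swapped for a line count or fed
from a cache.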
Best regards from Paris,
Eric Chatonet.
----------------------------------------------------------------
So Smart Software
For institutions, companies and associations
Built-to-order applications: management, multimedia, internet, etc.
Windows, Mac OS and Linux... With the French touch
Plugins, tutorials and more on our website
----------------------------------------------------------------
Web site http://www.sosmartsoftware.com/
Email [EMAIL PROTECTED]/
Phone 33 (0)1 43 31 77 62
Mobile 33 (0)6 20 74 50 86
----------------------------------------------------------------
_______________________________________________
use-revolution mailing list
use-revolution@lists.runrev.com
http://lists.runrev.com/mailman/listinfo/use-revolution