Re: How to get the text of web framed pages?

Eric Chatonet Tue, 24 May 2005 13:50:19 -0700

Hi Stephen,

Thanks for your input.

At the moment, I have two efficient functions, using two different approaches, for extracting all URLs from the text of a framed page. The first one, from Ken Ray, uses regex with matchText; the other, which I wrote, uses items with quote as the item delimiter. Ken's solution is a bit slow but more reliable than mine, which is much faster but a little bit silly ;-)

I will keep digging and share the solutions on this list when they are solid enough.
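
For readers following the thread, here is a minimal sketch of the two approaches described above. It is written in Python rather than Transcript, and it is not the code from either author: the function names, the sample page and the choice to look only at src= and href= attributes are all assumptions made for illustration.

```python
import re

# Hypothetical sample frameset page, used only for illustration.
SAMPLE = '''<html><frameset cols="20%,80%">
<frame src="menu.html" name="menu">
<frame src="http://example.com/main.html" name="main">
</frameset></html>'''


def urls_by_regex(html):
    """Regex approach: capture the quoted value of every src or href attribute."""
    pattern = re.compile(r'(?:src|href)\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)
    return pattern.findall(html)


def urls_by_quote_items(html):
    """Item approach: split on the double quote and keep the items that follow src= or href=."""
    items = html.split('"')
    urls = []
    # Odd-numbered items sit between a pair of quotes; the item just
    # before each one ends with the attribute name it belongs to.
    for i in range(1, len(items), 2):
        before = items[i - 1].rstrip().lower()
        if before.endswith(("src=", "href=")):
            urls.append(items[i])
    return urls


print(urls_by_regex(SAMPLE))
print(urls_by_quote_items(SAMPLE))
```

The split version does no pattern matching at all, which is why this kind of approach tends to be fast, but it only copes with double-quoted attributes written without spaces around the equals sign. The regex version also accepts single quotes and optional whitespace.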


On 24 May 05, at 20:02, Stephen Barncard wrote:

I would look for the word frameset in a tag inside a page, then get all the valid URLs inside the frame. Then I would check each URL for size, and pick the largest file, or the one with the most lines. That will be where the main content is.
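
The heuristic suggested above can be sketched as follows, again in Python and only as an illustration. The names main_frame and fetch are hypothetical, "size" is taken to mean the length of the downloaded text, and the downloader is passed in as a parameter so the logic can be tried without a network connection.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch(url):
    """Download a URL and return its body as text (assumes the URL is reachable)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def main_frame(page_url, html, fetcher=fetch):
    """Return (url, text) for the largest frame, or the page itself if it has no frames."""
    # Step 1: look for a frameset tag in the page.
    if not re.search(r'<frameset\b', html, re.IGNORECASE):
        return page_url, html
    # Step 2: collect the src of every <frame> tag and make each one absolute.
    sources = re.findall(
        r'<frame\b[^>]*\bsrc\s*=\s*["\']([^"\']+)["\']', html, re.IGNORECASE
    )
    if not sources:
        return page_url, html
    candidates = [urljoin(page_url, src) for src in sources]
    # Step 3: download each frame and keep the one with the longest text.
    bodies = [(url, fetcher(url)) for url in candidates]
    return max(bodies, key=lambda pair: len(pair[1]))
```

Called as main_frame(url, fetch(url)), this returns the address and text of the frame most likely to hold the main content. To rank frames by number of lines instead of size, the key could be changed to count newlines in the text.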


Best regards from Paris,

Eric Chatonet.
----------------------------------------------------------------
So Smart Software

For institutions, companies and associations
Built-to-order applications: management, multimedia, internet, etc.
Windows, Mac OS and Linux... With the French touch

Plugins, tutorials and more on our website
----------------------------------------------------------------
Web site        http://www.sosmartsoftware.com/
Email        [EMAIL PROTECTED]/
Phone        33 (0)1 43 31 77 62
Mobile        33 (0)6 20 74 50 86
----------------------------------------------------------------

_______________________________________________
use-revolution mailing list
use-revolution@lists.runrev.com
http://lists.runrev.com/mailman/listinfo/use-revolution
