On Sat, 24 Dec 2005 00:17:30 +0100, freenetwork at web.de wrote:

> On Fri, 23 Dec 2005 22:35:54 +0200, Jusa Saari wrote:
>
>> On Mon, 19 Dec 2005 17:48:31 +0000, Matthew Toseland wrote:
>>
>>> On Mon, Dec 19, 2005 at 02:22:36PM +0200, Jusa Saari wrote:
>>>>
>>>> A simple solution is to have FProxy parse the HTML and identify links
>>>> (which it must do anyway to filter images loaded from the Web and
>>>> whatever) and add images and other page requisites to the download
>>>> queue without the browser needing to ask for each of them separately.
>>>> Still won't help for getting multiple pages at once, but at least
>>>> getting an image-heavy page becomes faster. This would also combat the
>>>> effect which makes all but the topmost images drop off the network,
>>>> since no one has the patience to wait for them.
>>>
>>> Wouldn't help much.
>>
>> Would you please either be a bit more verbose than that, or not reply at
>> all? Anyone who knew _why_ it won't help didn't get anything from your
>> answer, and anyone who didn't know (such as I) still doesn't, making your
>> answer completely useless to everyone and therefore a waste of both your
>> and your readers' time, as well as disk space and bandwidth on whatever
>> server(s) this list is stored on.
>>
>> Now, the theory is that an image-heavy site is slow to load because the
>> browser only requests two items simultaneously, meaning that the high
>> latencies involved with each request add up; and FProxy making requests
>> for content it knows the browser will soon ask for helps, because that
>> way the content beyond the first few images will already be cached by
>> the time the browser gets to it, eliminating any noticeable latency.
>>
>> Please explain why this theory is wrong?
>>
>> The theory about the bitrot-combatting effect is directly linked to the
>> high latency, and to the tendency of browsers to request the images in a
>> page in the order they appear in the page source. The user simply hits
>> the stop button (or the browser times out the page) before the bottom
>> images are loaded; because of this, they are never requested, and
>> consequently they fall off the network. FProxy automatically queuing the
>> images for download would ensure that the bottommost images are
>> requested every time the page is loaded, keeping them in the network for
>> as long as the page itself stays there.
>>
>> Please explain why this theory is wrong?
>
> So essentially you're asking if FProxy could spider whole sites
> recursively (or only for one or X levels of depth) in the background
> every time the user hits a site... ?
No, I want FProxy to retrieve all the images pointed to by the "img" tags
in the page. There is no recursion there, since no HTML files are loaded
in this manner. I just want image galleries to be useful without having to
resort to FUQID. As it is, they aren't.

> If yes, the impact on the network would be interesting to see when
> fetching TFE or any other index site... If the network's well structured
> it won't break upon the millions of requests... otherwise... :)

Of course it will break in your scenario. It will be impossible to
distinguish between often-retrieved and never-retrieved content, so the
caching facilities won't work properly, and the network will grind to a
screeching halt as it gets hopelessly overloaded, making it impossible to
retrieve anything.
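
To be concrete about the img-tag case, here is a rough sketch of the kind
of thing I have in mind, running inside FProxy after it has already parsed
the page for filtering. It is only an illustration: the RequestQueue
interface and its enqueue() method are made-up placeholders rather than
the real FProxy/Freenet API, and a real version would resolve relative
URIs against the page's own key and ignore anything pointing outside
Freenet.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImagePrefetchSketch {

    // Matches <img ... src="..."> or <img ... src='...'>, case-insensitive.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Hypothetical background fetch queue; a stand-in, not the real API. */
    public interface RequestQueue {
        void enqueue(String key);
    }

    /** Collect the src of every img tag in the page. */
    public static List<String> extractImageKeys(String html) {
        List<String> keys = new ArrayList<String>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            keys.add(m.group(2));
        }
        return keys;
    }

    /** Queue every image on the page for fetching in the background. */
    public static void prefetchImages(String html, RequestQueue queue) {
        for (String key : extractImageKeys(html)) {
            // Only direct page requisites are queued; no HTML is fetched,
            // so nothing here recurses or spiders beyond the page itself.
            queue.enqueue(key);
        }
    }
}

The point of the sketch is the bound on the extra load: the number of
additional requests per page view is exactly the number of img tags on
that one page, which is a very different thing from spidering an entire
index site recursively.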
