Re: Favicon spam
Greg Miller wrote: Last I heard, the industry averages were supposed to be something like 3:1 pageviews-to-users ratio and 50% repeat visitors. So the number of favicon 404s would be approximately 1/6 of the total number of pageviews. That would only be true if every site consisted of just a single page, which is clearly untrue. From what I've read so far, the current implementation requests the favicon once for each domain. Erm, no. It would be *untrue* if each site consisted of a single page. Yeah, sorry, I misread what you'd written. Is this per web site, or per domain name? I'm not sure how relevant those figures are anyway; they certainly don't gel with the patterns I've seen on sites for which I have access to the statistics. There are few sites these days on which you can navigate to what you want by visiting just three pages, and those on which you can are likely to be part of a number of sites hosted on a single domain (e.g. geocities.com). I imagine the above industry averages are largely influenced by behemoths like AOL and MSN. You also need to take into account the average number of images/stylesheets/javascript appearing in external files. As this should be based on resources requested, not pageviews as that is misleading. I thought I was quite clear about the fact that this was only a matter of pageviews. I don't know of any good web-wide stats for requests or bandwidth, and I suspect no useful stats could be determined since things vary too widely. Personally I think in this case specific examples would be far more useful than industry averages anyway, as they are far too swayed by huge hosts. You should probably also take into account the % of /favicon.ico associated with domains, as those wouldn't appear as 404s (i.e. Netscape Enterprise Server seems to come with one as default). From a bandwidth perspective, those are even worse than 404s. As I mentioned before, averages are no consolation to the people getting hit with worst-case scenarios. 
But I thought part of your argument was about 404 errors in web logs; these wouldn't occur when favicons already exist, so in that case it's no more a bandwidth problem than any other image. The 404 issue is the major problem with this, requesting resources that don't exist, rather than the bandwidth. ian.
Re: Favicon spam
Greg Miller wrote: That's not a terrible increase in bandwidth (the exact figures would depend on protocol overhead and such), but web hosts have a nasty habit of charging for disk space, which often includes the space for those log files that shoot up by over 20% if everyone adopts this favicon practice or 7% with the hypothetical 30% marketshare that was mentioned earlier. It might be going out on a limb, but it sounds as though the real bandwidth problem is the collection of logfiles to generate statistics... I've encountered this: the logfiles tend to take up far more space than the websites they cater for and quickly eat up gigabytes of space, but this is really a different issue that argues for better management of logfiles. If your site is small I see little point in collecting anything but minimal filtered statistics, a summary rather than lots of raw data. As a user I've found the feature quite useful, especially when using tabbed browsing, and I can't see that Mozilla, Konqueror or even Netscape has the clout to get people to put links to a favicon on every page of their site. Whereas I've been surprised by how many sites do have them... Though, as useful as I find this, I think that checking for favicons when: 1) bookmarking 2) visiting a bookmarked site without a cached 404 for the favicon would be a better compromise than the current one. The reason being that you'd get a favicon for your most visited sites. Why would I want an icon cached for any old site I just happened to visit? They're only really useful for sites I visit regularly. 2) is for sites I have already bookmarked which may not yet have acquired a favicon. It may cause a bit of noise in logs, but a far more acceptable amount; it takes advantage of the caching of favicon status and only comes from visitors who care enough to bookmark your pages. You could also have a pref, "use favourite icons for bookmarks", to let this be turned on or off. ian.
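The compromise proposed above can be sketched as a small decision function. This is only an illustration of the two rules as written in the message, not how any browser implements it; the function name, its parameters, and the way the pref is modelled are all assumptions made for the example.

```python
def should_request_favicon(is_bookmarking, is_bookmarked,
                           has_cached_404, pref_enabled=True):
    """Decide whether to ask the server for /favicon.ico.

    Hypothetical helper: the names and the boolean interface are assumptions.
    """
    if not pref_enabled:
        # The suggested pref ("use favourite icons for bookmarks") is off.
        return False
    if is_bookmarking:
        # Rule 1: always check at the moment a bookmark is created.
        return True
    # Rule 2: visiting an already bookmarked site, but only if we have
    # no cached 404 telling us the icon is missing.
    return is_bookmarked and not has_cached_404


# A casual visit to an unbookmarked site generates no favicon request.
print(should_request_favicon(False, False, False))
# A visit to a bookmarked site with a cached 404 stays quiet as well.
print(should_request_favicon(False, True, True))
```

Under this sketch, only bookmark-related activity ever reaches the server log, which is the noise reduction the message argues for.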
Re: Favicon spam
Greg Miller wrote: Jonas Sicking wrote: It would be really interesting to get some hard numbers on this. Just looking at the current logs will not really say anything since very few people browse with a mozilla with this pref turned on. So we need to come up with some way to approximate the number of 404s per (for example) month in the event of a browser with, say, 30% marketshare using the current configuration. Last I heard, the industry averages were supposed to be something like 3:1 pageviews-to-users ratio and 50% repeat visitors. So the number of favicon 404s would be approximately 1/6 of the total number of pageviews. That would only be true if every site consisted of just a single page, which is clearly untrue. From what I've read so far, the current implementation requests the favicon once for each domain. So your number above needs to be divided by the average number of pages visited by a single user on a server. You also need to take into account the average number of images/stylesheets/javascript appearing in external files, as this should be based on resources requested, not pageviews, which are misleading. So it should actually be: 1/(6*visited pages per server*resources per page) To fill in some numbers pulled from the air: 1/(6*10*10) So that amounts to 1/6000 resource requests. If you can come up with some numbers to fill in the above guesses then you'd get closer to the actual figure. You should probably also take into account the % of /favicon.ico associated with domains, as those wouldn't appear as 404s (e.g. Netscape Enterprise Server seems to come with one as default). ian.
Re: Favicon spam
Ian Davey wrote: 1/(6*10*10) So that amounts to 1/6000 resource requests. If you can come up with some numbers to fill in the above guesses then you'd get closer to the actual figure. That should be 1/600 - it's too early in the morning :-) ian.
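For anyone who wants to plug in their own guesses, the corrected arithmetic can be checked with a few lines of Python. This is a minimal sketch of the formula as posted; the function and parameter names are made up for the example, and the default values of 10 are the same "pulled from the air" numbers, not measurements.

```python
from fractions import Fraction


def favicon_404_share(pageview_divisor=6, pages_per_server=10,
                      resources_per_page=10):
    """Fraction of all resource requests that would be favicon 404s.

    Implements 1/(6 * visited pages per server * resources per page).
    All names and defaults here are illustrative assumptions.
    """
    return Fraction(1, pageview_divisor * pages_per_server * resources_per_page)


share = favicon_404_share()
print(share)         # 1/600
print(float(share))  # roughly 0.0017 of all requests
```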
Re: Favicon spam
Ian Davey wrote: Greg Miller wrote: Last I heard, the industry averages were supposed to be something like 3:1 pageviews-to-users ratio and 50% repeat visitors. So the number of favicon 404s would be approximately 1/6 of the total number of pageviews. That would only be true if every site consisted of just a single page, which is clearly untrue. From what I've read so far, the current implementation requests the favicon once for each domain. Erm, no. It would be *untrue* if each site consisted of a single page. So your number above needs to be divided by the average number of pages visited by a single user on a server. Already did that. That's what the 3:1 figure was for. You also need to take into account the average number of images/stylesheets/javascript appearing in external files. As this should be based on resources requested, not pageviews as that is misleading. I thought I was quite clear about the fact that this was only a matter of pageviews. I don't know of any good web-wide stats for requests or bandwidth, and I suspect no useful stats could be determined since things vary too widely. You should probably also take into account the % of /favicon.ico associated with domains, as those wouldn't appear as 404s (i.e. Netscape Enterprise Server seems to come with one as default). From a bandwidth perspective, those are even worse than 404s. As I mentioned before, averages are no consolation to the people getting hit with worst-case scenarios.
Re: Favicon spam
The icon is not cached forever. It simply has no specified expiration. That just means it won't be doomed based only on some expiration date. It can still be removed from the cache as the cache fills up and needs to evict items. dave ([EMAIL PROTECTED]) Jonas Sicking wrote: A lot of opinions have been expressed with regard to whether the favicon should be on or off by default, since it might spam webservers with requests for a non-existent file. It would be really interesting to get some hard numbers on this. Just looking at the current logs will not really say anything since very few people browse with a mozilla with this pref turned on. So we need to come up with some way to approximate the number of 404s per (for example) month in the event of a browser with, say, 30% marketshare using the current configuration. Since the absence of a /favicon.ico is cached, the number of 404-ing requests will be much lower than the number of page hits. Brendan says that the absence is cached persistently and with never-expire; does that mean that mozilla won't request /favicon.ico again unless the user manually clears the cache? In that case the number of 404s will be approximately equal to the number of new users every month * 30%. If it's not possible to extract the number of new users from the logs, I think that the number of new IP-addresses * 1.5 is a good enough estimate. There are probably more than 1.5 users per IP on average, but probably not all of them visit the site. If someone has a better number than 1.5, please speak up; my guess is very uneducated. However, it seems a bit wrong to me that a resource is cached forever. What if a site wants to start supporting /favicon.ico? Will only new users see the new icon? IMHO a resource should be reloaded at least sometime so that if the resource appears/changes we will eventually catch it. So say that we reload every 2 weeks. 
That means every user will reload /favicon.ico once every 14 days, which means that the number of 404s will be the number of distinct users during 14 days * 30% * 30/14. So, we've got:
Hits = newUsersPerMonth * 0.3 if we cache indefinitely
Hits = distinctUsersPerXDays * 0.3 * 30/X if we refetch every X days
where IP-addresses * 1.5 could approximate the number of users. IMHO the right thing would be to use the second formula with X ~= 14. So it would be great if someone with access to the logs of a rather heavily used site could run these formulas and compare the result to the number of normal 404s. / Jonas Sicking
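The two formulas are easy to run against log-derived counts. Below is a minimal sketch; the function names are invented for the example, and the sample IP counts at the bottom are made-up inputs, not figures from any real log. The 0.3 market share, the 1.5 users-per-IP factor, and the 30-day month come straight from the message above.

```python
def users_from_ips(ip_addresses, users_per_ip=1.5):
    """Approximate distinct users from distinct IP addresses (the 1.5 guess)."""
    return ip_addresses * users_per_ip


def hits_cached_indefinitely(new_users_per_month, market_share=0.3):
    """Monthly favicon 404s if the missing icon is cached indefinitely."""
    return new_users_per_month * market_share


def hits_with_refetch(distinct_users_per_x_days, x_days=14,
                      market_share=0.3, days_per_month=30):
    """Monthly favicon 404s if every user refetches once every x_days."""
    return distinct_users_per_x_days * market_share * days_per_month / x_days


# Hypothetical inputs: 10,000 new IPs per month, 20,000 distinct IPs per 14 days.
new_users = users_from_ips(10_000)
fortnight_users = users_from_ips(20_000)
print(round(hits_cached_indefinitely(new_users)))
print(round(hits_with_refetch(fortnight_users, x_days=14)))
```

Comparing either result with the count of ordinary 404s in the same log gives the ratio the message asks for.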
Re: Favicon spam
Jonas Sicking wrote: It would be really interesting to get some hard numbers on this. Just looking at the current logs will not really say anything since very few people browse with a mozilla with this pref turned on. So we need to come up with some way to approximate the number of 404s per (for example) month in the event of a browser with, say, 30% marketshare using the current configuration. Last I heard, the industry averages were supposed to be something like 3:1 pageviews-to-users ratio and 50% repeat visitors. So the number of favicon 404s would be approximately 1/6 of the total number of pageviews. However, that's only an average and the effect on the number of requests and bandwidth consumed would vary wildly depending on the individual site. Every site without a favicon would suffer--it's just a question of degree. Good thing no browser I'm aware of has an equivalent policy for CSS (which would benefit me at the expense of people without external CSS), JS (which would benefit people using external JS files at the expense of everyone else), etc.
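One way to read the 1/6 figure is sketched below. The reasoning is an assumption on my part rather than something spelled out in the message: with 3 pageviews per user there is 1 user per 3 pageviews, and if only first-time visitors (the 50% who are not repeat visitors) lack a cached result, half of those users trigger a 404. The function names are hypothetical.

```python
def favicon_404s_per_pageview(pageviews_per_user=3.0, repeat_visitor_share=0.5):
    """Estimated favicon 404s per pageview under the stated averages.

    Assumes only non-repeat visitors request the missing icon.
    """
    users_per_pageview = 1.0 / pageviews_per_user
    new_visitor_share = 1.0 - repeat_visitor_share
    return users_per_pageview * new_visitor_share


def estimated_404s(total_pageviews, **kwargs):
    """Scale the per-pageview rate up to a site's total pageviews."""
    return total_pageviews * favicon_404s_per_pageview(**kwargs)


print(favicon_404s_per_pageview())  # about 0.167, i.e. 1/6
print(estimated_404s(60_000))       # about 10,000 for a hypothetical 60,000 pageviews
```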