Re: mod_disk_cache summarization

2006-10-29 Thread Graham Leggett
Henrik Nordstrom wrote: How ETags are generated is extremely server dependent, and they are not guaranteed to be unique across different URLs. You cannot count on two files with the same ETag but different URLs being the same file, unless you are also responsible for the server providing all the

Re: mod_disk_cache summarization

2006-10-28 Thread Niklas Edmundsson
On Fri, 27 Oct 2006, Graham Leggett wrote: Niklas Edmundsson wrote: Different VHosts meaning different URLs/directories, pointing to the same files... Hmm... Two thoughts come into my head over this one. One way to approach this is to treat this as a general problem of how do we stop peopl

Re: mod_disk_cache summarization

2006-10-27 Thread Henrik Nordstrom
Sat 2006-10-28 at 00:21 +0200, Henrik Nordstrom wrote: > Fri 2006-10-27 at 23:33 +0200, Graham Leggett wrote: > > > A second approach could involve the use of the Etags associated with > > file responses, which in the case of files served off disk (as I > > understand it) are generated b

Re: mod_disk_cache summarization

2006-10-27 Thread Henrik Nordstrom
Fri 2006-10-27 at 23:33 +0200, Graham Leggett wrote: > A second approach could involve the use of the Etags associated with > file responses, which in the case of files served off disk (as I > understand it) are generated based on inode number and various other > uniquely file specific info
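
For readers following the ETag sub-thread, here is a minimal sketch of an inode-based validator of the kind Graham describes. It is an illustration only: the function names and the exact field layout (inode, size, mtime in microseconds) are assumptions, not the format any particular server is confirmed to use in this thread.

```python
import os
import shutil
import tempfile


def file_etag(path):
    """Return an ETag-like string built from inode, size and mtime.

    The field order and hex formatting are assumptions for illustration.
    """
    st = os.stat(path)
    return '"%x-%x-%x"' % (st.st_ino, st.st_size, st.st_mtime_ns // 1000)


def compare_etags():
    """Return (same_file_two_paths_match, byte_identical_copy_matches)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "file.iso")
        with open(original, "wb") as f:
            f.write(b"payload")
        # A second path string that resolves to the same file on disk,
        # standing in for two URLs mapped onto one file.
        alias = os.path.join(d, ".", "file.iso")
        # A byte-identical copy (same size, same mtime) with its own inode.
        copy = os.path.join(d, "copy.iso")
        shutil.copy2(original, copy)
        return (file_etag(original) == file_etag(alias),
                file_etag(original) == file_etag(copy))
```

A validator like this identifies a file on one filesystem, not its content. A byte-identical copy gets a different value, and an unrelated server may produce a colliding value, which is the objection Henrik raises.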

Re: mod_disk_cache summarization

2006-10-27 Thread Guy Hulbert
On Fri, 2006-27-10 at 17:46 -0400, Guy Hulbert wrote: > > have) from downloading multiple copies of the same file hosted at > > different URLs. > > Isn't this what URI is supposed to do (identify mirrored resources as > being the same). OTOH, the first case (two vhosts pointing at the seems i a

Re: mod_disk_cache summarization

2006-10-27 Thread Guy Hulbert
On Fri, 2006-27-10 at 23:33 +0200, Graham Leggett wrote: > > Different VHosts meaning different URLs/directories, pointing to > the > > same files... > > Hmm... Two thoughts come into my head over this one. > > One way to approach this is to treat this as a general problem of how > do > we stop

Re: mod_disk_cache summarization

2006-10-27 Thread Graham Leggett
Niklas Edmundsson wrote: Different VHosts meaning different URLs/directories, pointing to the same files... Hmm... Two thoughts come into my head over this one. One way to approach this is to treat this as a general problem of how do we stop people who download the same file from multiple pl

Re: mod_disk_cache summarization

2006-10-27 Thread Niklas Edmundsson
On Tue, 24 Oct 2006, Graham Leggett wrote: On Tue, October 24, 2006 2:48 pm, Niklas Edmundsson wrote: Perhaps this could be as simple as using ServerName and ServerAlias (unless the name of the site is part of the URL, which will happen in the forward proxy case) to reduce the cached URL to a

Re: mod_disk_cache summarization

2006-10-27 Thread Graham Leggett
On Fri, October 27, 2006 4:43 pm, Davi Arnaut wrote: > The code is a bit outdated, I have integrated it onto a generic event > abstraction for apr a la kevent that I'm working on. > > I'll try to post something this weekend to the list with the appropriate > documentation. This will be awesome -

Re: mod_disk_cache summarization

2006-10-27 Thread Davi Arnaut
Graham Leggett wrote: > On Mon, October 23, 2006 10:50 pm, Davi Arnaut wrote: > >> AFAIK all major platforms provide one, even win32. I even made an >> incomplete APR abstraction for file notification: >> >> http://haxent.com/~davi/apr/notify > > Is it possible to add doxygen comments on the apr_n

Re: mod_disk_cache summarization

2006-10-27 Thread Graham Leggett
On Mon, October 23, 2006 10:50 pm, Davi Arnaut wrote: > AFAIK all major platforms provide one, even win32. I even made an > incomplete APR abstraction for file notification: > > http://haxent.com/~davi/apr/notify Is it possible to add doxygen comments on the apr_notify.h file, so as to confirm exa

Re: mod_disk_cache in trunk, was Re: mod_disk_cache summarization

2006-10-24 Thread Graham Leggett
On Tue, October 24, 2006 7:05 pm, Paul Querna wrote: > Well, yes, you are right, I finally had time to read some of the changes > in trunk, and r450105 is freaking crazy: > http://svn.apache.org/viewvc?view=rev&revision=450105 > > It replaced a cheap atomic operation, with copying the entire file

mod_disk_cache in trunk, was Re: mod_disk_cache summarization

2006-10-24 Thread Paul Querna
Graham Leggett wrote: On Tue, October 24, 2006 9:31 am, Paul Querna wrote: The prerequisite is that APR needs to be taught about this scheme, and it has to work portably across all platforms. No it doesn't. mod_disk_cache makes many assumptions about the underlying OS, like how moving a file

Re: mod_disk_cache summarization

2006-10-24 Thread Graham Leggett
On Tue, October 24, 2006 5:32 pm, Davi Arnaut wrote: > 1) we have two cache file extensions, one for fully cached entities > (.cache) and another for transient (being cached) entities (.transient) > 2) and that we store the headers with the cache file as an extended > attribute > 3) we write an exte
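
Point 1 of the quoted proposal (a `.transient` file while an entity is being cached, a `.cache` file once it is complete) can be sketched as a write-then-rename. This is only a sketch under assumptions: `store_entity` and `lookup` are hypothetical names, the rename is atomic only when both names are on the same filesystem, and the extended-attribute part of the proposal (point 2) is not modelled.

```python
import os
import tempfile


def store_entity(cache_dir, key, chunks):
    """Write chunks to <key>.transient, then publish it as <key>.cache."""
    transient = os.path.join(cache_dir, key + ".transient")
    final = os.path.join(cache_dir, key + ".cache")
    with open(transient, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())
    # Readers only ever see a complete .cache file: the rename replaces
    # the name in one step when both paths share a filesystem.
    os.replace(transient, final)
    return final


def lookup(cache_dir, key):
    """Return the path of a fully cached entity, or None if absent."""
    final = os.path.join(cache_dir, key + ".cache")
    return final if os.path.exists(final) else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store_entity(d, "example", [b"head", b"body"])
        print(lookup(d, "example"))
```

A reader that finds only a `.transient` file knows a writer is still in progress. What it should do then is the open question the rest of the thread argues about.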

Re: mod_disk_cache summarization

2006-10-24 Thread Davi Arnaut
Graham Leggett wrote: > On Tue, October 24, 2006 3:46 pm, Joe Orton wrote: > >> That's not the point - the scary complexity of all this is that it's >> become a multi-process synchronisation problem - what do you do when the >> writing process SIGSEGVs or hangs? You're left with N processes hangi

Re: mod_disk_cache summarization

2006-10-24 Thread Graham Leggett
On Tue, October 24, 2006 4:37 pm, Brian Akins wrote: > My thought on this is that we use providers, so in theory, you could use a > different provider for the different types: > > CacheEnable /largecrap large_disk_with_stat_sleep_thing > CacheEnable /normalstuff normal_disk This we can definitely

Re: mod_disk_cache summarization

2006-10-24 Thread Brian Akins
Plüm wrote: Agreed. If it turns out that the common code base between both cases is only small and it is complex to do both things in one provider just make two providers out of them. The remaining common code could be factored out in a separate disk_cache_util c file which is used by both provi

Re: mod_disk_cache summarization

2006-10-24 Thread Plüm, Rüdiger, VF EITO
> -Original Message- > From: Brian Akins > Sent: Tuesday, 24 October 2006 16:37 > To: dev@httpd.apache.org > Subject: Re: mod_disk_cache summarization > > > Niklas Edmundsson wrote: > >> The comparison of your and Brian's exp

Re: mod_disk_cache summarization

2006-10-24 Thread Brian Akins
Niklas Edmundsson wrote: The comparison of your and Brian's experience are two ends of extremes on high volume caches, one low hits large files, the second high hits small files. This should make for some useful tuning information. The extreme difference is what makes me think that we should ac

Re: mod_disk_cache summarization

2006-10-24 Thread Graham Leggett
On Tue, October 24, 2006 3:46 pm, Joe Orton wrote: > That's not the point - the scary complexity of all this is that it's > become a multi-process synchronisation problem - what do you do when the > writing process SIGSEGVs or hangs? You're left with N processes hanging > around indefinitely wait

Re: mod_disk_cache summarization

2006-10-24 Thread Graham Leggett
On Tue, October 24, 2006 2:48 pm, Niklas Edmundsson wrote: >> Perhaps this could be as simple as using ServerName and ServerAlias >> (unless the name of the site is part of the URL, which will happen in >> the >> forward proxy case) to reduce the cached URL to a canonical form before >> storing an
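
The ServerName/ServerAlias idea quoted above could look roughly like the following sketch. `canonical_cache_key` is a hypothetical helper, not part of mod_cache; it assumes every alias serves the same path layout, so it does not cover the case raised elsewhere in the thread where different vhosts map different directories onto the same files, and it drops any userinfo in the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_cache_key(url, server_name, server_aliases):
    """Rewrite the host of `url` to `server_name` if it is a known alias."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    known = {server_name.lower()} | {a.lower() for a in server_aliases}
    if host in known:
        netloc = server_name.lower()
        if parts.port is not None:
            netloc += ":%d" % parts.port
        parts = parts._replace(netloc=netloc)
    # URLs for hosts we do not own (the forward proxy case) pass through.
    return urlunsplit(parts)


if __name__ == "__main__":
    aliases = ["ftp.se.debian.org", "ftp.gnome.org"]
    print(canonical_cache_key("http://ftp.gnome.org/pub/file.iso",
                              "ftp.acc.umu.se", aliases))
```

With the host names Niklas lists, requests for the same path through any alias would then map to a single cache key.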

Re: mod_disk_cache summarization

2006-10-24 Thread Joe Orton
On Tue, Oct 24, 2006 at 02:47:09PM +0200, Graham Leggett wrote: > On Tue, October 24, 2006 2:22 pm, Joe Orton wrote: > > Neither is it appropriate to have any process do the "sleep and stat" > > loop waiting for some other process to finish writing a cache file. > > Correct, thus a notify API was
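
For context, the "sleep and stat" loop under discussion can be sketched as below, with a deadline added so that a waiting reader gives up if the writer dies or hangs. The function name, the timeout and the polling interval are all assumptions for illustration. This is not the committed code, and a notification API as mentioned in the thread would replace the polling entirely.

```python
import os
import time


def wait_for_cache_file(path, timeout=5.0, interval=0.05):
    """Poll until `path` exists; return False if `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            # The writer may have crashed or hung; stop waiting so the
            # caller can fall back to fetching the entity itself.
            return False
        time.sleep(interval)
```

The deadline bounds how long each of the N waiting processes can be stuck, but every waiter still pays for repeated stat calls, which is part of Joe's objection.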

Re: mod_disk_cache summarization

2006-10-24 Thread Niklas Edmundsson
On Tue, 24 Oct 2006, Joe Orton wrote: IMO: for a general purpose cache it is not appropriate to stop and try to write the entire response to the cache before serving anything. This is existing mod_disk_cache behaviour; the patches reduce these problems. Maybe not in a perfect way, but in a w

Re: mod_disk_cache summarization

2006-10-24 Thread Niklas Edmundsson
On Tue, 24 Oct 2006, Graham Leggett wrote: * Allow disk cache to realise that a (large) file is the same regardless of which URL is used to access it. Reduces cache disk usage a lot for sites like ours that's known by ftp.acc.umu.se, ftp.se.debian.org, ftp.gnome.org, se.releases.ubuntu.

Re: mod_disk_cache summarization

2006-10-24 Thread Graham Leggett
On Tue, October 24, 2006 2:22 pm, Joe Orton wrote: >> In essence, the patches solve the thundering herd problem. > > I still think it's fundamentally wrong to try to "fix" that problem in > this way. It seems like the cache is being re-implemented to optimize > for some very specific deployment s

Re: mod_disk_cache summarization

2006-10-24 Thread Joe Orton
On Mon, Oct 23, 2006 at 10:11:58PM +0200, Graham Leggett wrote: > Brian Akins wrote: > > >Can someone please summarize the various patches for mod_disk_cache that > >have been floating around in last couple weeks? I have looked at the > >patches but wasn't real sure of the general philosophy/me

Re: mod_disk_cache summarization

2006-10-24 Thread Graham Leggett
On Tue, October 24, 2006 12:59 pm, Niklas Edmundsson wrote: > * More assorted small cleanups (mostly error handling). Error handling patches are welcome and encouraged, don't wait :) > * Allow disk cache to realise that a (large) file is the same > regardless of which URL is used to access it

Re: mod_disk_cache summarization

2006-10-24 Thread Niklas Edmundsson
On Mon, 23 Oct 2006, Graham Leggett wrote: Was busy cleaning up some other odds and ends, will be back on the cache code again shortly. I'm awaiting the verdict on how to resolve the "lead request hangs" problem before I submit more patches, I feel it's important enough to be solved before I

Re: mod_disk_cache summarization

2006-10-24 Thread Graham Leggett
On Tue, October 24, 2006 9:31 am, Paul Querna wrote: >> The prerequisite is that APR needs to be taught about this scheme, and >> it has to work portably across all platforms. > > No it doesn't. mod_disk_cache makes many assumptions about the > underlying OS, like how moving a file on the same fi

Re: mod_disk_cache summarization

2006-10-24 Thread Paul Querna
Graham Leggett wrote: Davi Arnaut wrote: Have you seen my patch to address this issue? IMHO, it is far less complex and less expensive than the committed workaround. No - I went through your patches in some detail, but I didn't see one that addressed this problem specifically. Once thunderi

Re: mod_disk_cache summarization

2006-10-23 Thread Graham Leggett
Davi Arnaut wrote: The prerequisite is that APR needs to be taught about this scheme, and it has to work portably across all platforms. AFAIK all major platforms provide one, even win32. I even made an incomplete APR abstraction for file notification: http://haxent.com/~davi/apr/notify Can y

Re: mod_disk_cache summarization

2006-10-23 Thread Davi Arnaut
Graham Leggett wrote: > Davi Arnaut wrote: > >> Have you seen my patch to address this issue? IMHO, it is far less >> complex and less expensive than the committed workaround. > > No - I went through your patches in some detail, but I didn't see one > that addressed this problem specifically. O

Re: mod_disk_cache summarization

2006-10-23 Thread Graham Leggett
Brian Akins wrote: So this does not work in reverse proxy situation? Those buckets are not file-based. It should do - the network input filter isn't going to hand buckets up the filter chain that don't fit in RAM, so special handling isn't required. A normal read bucket / write to cache fil

Re: mod_disk_cache summarization

2006-10-23 Thread Graham Leggett
Davi Arnaut wrote: Have you seen my patch to address this issue? IMHO, it is far less complex and less expensive than the committed workaround. No - I went through your patches in some detail, but I didn't see one that addressed this problem specifically. Once thundering herd is solved, I p

Re: mod_disk_cache summarization

2006-10-23 Thread Brian Akins
Graham Leggett wrote: What's been committed so far is a temporary workaround to the "4.7GB file buckets are being loaded into RAM" problem. The workaround detects if the bucket being read is a file bucket, and copies the file into the cache using file read and file write, instead of bucket read
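
The workaround quoted above, copying a file into the cache with plain file reads and writes instead of reading the whole bucket into memory, amounts to a bounded-buffer copy. The sketch below shows that general technique only; `copy_to_cache` and the 64 KiB chunk size are assumptions, and it does not reproduce the committed code.

```python
import os
import tempfile


def copy_to_cache(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src to dst holding at most `chunk_size` bytes in memory."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "big.bin")
        with open(src, "wb") as f:
            f.write(b"x" * 300000)
        print(copy_to_cache(src, os.path.join(d, "big.cache")))
```

Memory use stays constant regardless of file size, at the cost of copying every byte, which is the expense criticised later in the thread.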

Re: mod_disk_cache summarization

2006-10-23 Thread Davi Arnaut
Graham Leggett wrote: > Brian Akins wrote: > >> Can someone please summarize the various patches for mod_disk_cache that >> have been floating around in last couple weeks? I have looked at the >> patches but wasn't real sure of the general philosophy/methodology to them. > > In essence, the pa

Re: mod_disk_cache summarization

2006-10-23 Thread Graham Leggett
Brian Akins wrote: Can someone please summarize the various patches for mod_disk_cache that have been floating around in last couple weeks? I have looked at the patches but wasn't real sure of the general philosophy/methodology to them. In essence, the patches solve the thundering herd probl

mod_disk_cache summarization

2006-10-23 Thread Brian Akins
Can someone please summarize the various patches for mod_disk_cache that have been floating around in last couple weeks? I have looked at the patches but wasn't real sure of the general philosophy/methodology to them. Others may find it useful as well -- Brian Akins Chief Operations Engin