We (as the Apache Subversion developers) have discovered that mod_dav can trigger an unbounded memory usage when used in conjunction with a module that inserts an output filter — such as mod_headers or mod_deflate. Below is the configuration that can be used to reproduce the issue:
MaxMemFree 128 Header always append Foo Bar <Location /> DAV svn SVNParentPath C:/Repositories </Location> With this configuration, responding to a GET request for a 500 MB file results in the server consuming a large amount (~3.2 GB) of memory. The memory footprint is caused by heap allocations. Serving larger files could quickly exhaust all available memory for the machine. I would be glad to provide a patch for this issue, but I think it's certainly worth discussing the solution beforehand. The problem is caused by how mod_dav passes the output filter list to its providers. Please see the deliver() hook definition in mod_dav.h:1948 and its usage in mod_dav.c:888: /* Repository provider hooks */ struct dav_hooks_repository { ... dav_error * (*deliver)(const dav_resource *resource, ap_filter_t *output); The hook receives the current head of the output filter list for a particular request. Certain filters, such as the one in mod_headers, are designed to perform the work only once. When the work is done, a filter removes itself from the list. If a filter is the first in the list, this updates the head of the linked list in request_rec (r->output_filters). So, after a particular DAV provider calls ap_pass_brigade(), the actual head of the filter list may now be different from what was passed via the argument. But the hook is unaware of that, since the previous head of the linked list was passed via `output` argument. Subsequent calls to the ap_pass_brigade() are going to invoke these (removed) filters again, and this is what happens in the example. The mod_headers filter is called per every ap_pass_brigade() call, it sets the headers multiple times, their values are allocated in the request pool, and this causes unbounded memory usage. Apart from the memory consumption, it could also cause various issues due to the filter being unexpectedly called several times. One way of solving the problem that I can think of is: 1) Introduce dav_hooks_repository2 with the deliver(), deliver_report() and merge() hooks accepting a `request_rec` instead of an `ap_filter_t` A DAV provider would be expected to use r->output_filters, and that's going to handle the case when the head of the filter list is updated. New structure is required to keep compatibility, as dav_hooks_repository is a part of the mod_dav's public API. This would require a couple of compatibility shims for the new API (and the old versions would become deprecated). 2) Implement a workaround for existing DAV providers Do that by inserting a special output filter to the beginning of the list before calling the particular DAV provider hooks. The filter would be a no-op, but it would never remove itself from the list, thus guaranteeing that the head of the filter list doesn't change during the execution of the hook. The downside of this approach is that it doesn't guard against other mistakes of the same kind. It's easy to miss that a head of the filter list can change between invocations, and the consequences are fairly severe. For instance, r1748047 [1] adds another public API, dav_send_one_response(), that receives an `ap_filter_t`, that might change between calls to ap_pass_brigade(), so it has the same kind of problem. Maybe this can solved by introducing something like an `ap_filter_chain_t` that keeps track of the actual head of the list, and starting to use it where a function needs to push data to the filter stack? I am ready to prepare a patch for this issue, but perhaps there's a completely different and a better way to solve the problem? What do you think? [1] https://svn.apache.org/r1748047 Regards, Evgeny Kotkov