In ap_core_output_filter() there is code that begins:

    /* Completed iterating over the brigade, now determine if we want
     * to buffer the brigade or send the brigade out on the network.
     *
     * Save if we haven't accumulated enough bytes to send, the connection
     * is not about to be closed, and:
     *
     *   1) we didn't see a file, we don't have more passes over the
     *      brigade to perform, AND we didn't stop at a FLUSH bucket.
     *      (IOW, we will save plain old bytes such as HTTP headers)
     *      or
     *   2) we hit the EOS and have a keep-alive connection
     *      (IOW, this response is a bit more complex, but we save it
     *      with the hope of concatenating with another response)
     */
    if (nbytes + flen < AP_MIN_BYTES_TO_WRITE
        && !AP_BUCKET_IS_EOC(last_e)
        && ((!fd && !more && !APR_BUCKET_IS_FLUSH(last_e))
            || (APR_BUCKET_IS_EOS(last_e)
                && c->keepalive == AP_CONN_KEEPALIVE))) {
This is an optimisation: on a keep-alive connection, if an EOS bucket is present and the amount of data is less than the nominal minimum number of bytes to write, sending the data out on the connection is held over until later. Later in this section of code it has:

    /* Do a read on each bucket to pull in the
     * data from pipe and socket buckets, so
     * that we don't leave their file descriptors
     * open indefinitely. Do the same for file
     * buckets, with one exception: allow the
     * first file bucket in the brigade to remain
     * a file bucket, so that we don't end up
     * doing an mmap+memcpy every time a client
     * requests a <8KB file over a keepalive
     * connection. */
    if (APR_BUCKET_IS_FILE(bucket) && !file_bucket_saved) {
        file_bucket_saved = 1;
    }
    else {
        const char *buf;
        apr_size_t len = 0;
        rv = apr_bucket_read(bucket, &buf, &len, APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            ap_log_cerror(APLOG_MARK, APLOG_ERR, rv, c,
                          "core_output_filter:"
                          " Error reading from bucket.");
            return HTTP_INTERNAL_SERVER_ERROR;
        }
    }

The end result is that if a file bucket for a file shorter than 8000 bytes gets this far and an EOS follows it, the file bucket itself is held over rather than its data being read and buffered. As the comment says, this avoids doing an mmap+memcpy. What it means, though, is that the file descriptor within the file bucket must be kept open and cannot be closed as soon as ap_pass_brigade() has been called. For me this is an issue, as the file descriptor has been supplied from a special object returned by a higher level application, and it would be hard to keep the file open beyond the life of the request, up until the end of the keep-alive or a subsequent request over the same connection. Doing a dup on the file descriptor is also not necessarily an option.
The end result is that later, when the file bucket is processed, the file descriptor has already been closed and one gets the error:

    (9)Bad file descriptor: core_output_filter: writing data to the network

I know that I can circumvent the EOS optimisation by inserting a flush bucket, but based on the documentation it isn't guaranteed that a flush bucket will always propagate down the filter chain and actually push out data:

    /**
     * Create a flush bucket.  This indicates that filters should flush their
     * data.  There is no guarantee that they will flush it, but this is the
     * best we can do.
     * @param list The freelist from which this bucket should be allocated
     * @return The new bucket, or NULL if allocation failed
     */
    APU_DECLARE(apr_bucket *) apr_bucket_flush_create(apr_bucket_alloc_t *list);

How can one guarantee that the file bucket will actually be flushed and not held over by a filter?

If it gets to the core output filter, another way to avoid the EOS optimisation is to forcibly set keepalive to AP_CONN_CLOSE, which for the application concerned is probably reasonable, but is obviously a bit of a hack.

Finally, my problem only arises because I insert an EOS after the file bucket and before calling ap_pass_brigade(). If one uses ap_send_fd(), it doesn't insert an EOS before calling ap_pass_brigade(); something else obviously inserts the EOS later. If ap_pass_brigade() is called without an EOS the first time and only later with an EOS, will that help in ensuring the data is flushed out, or can the same problem still arise?

Is the only way for this to possibly work for Apache to have ownership of the file descriptor, and so be in total control of when it is closed?

Anyone got any advice? Thanks.

Graham