On 21.04.2016 11:20, Iosif Fettich wrote:
> Dear mod_perl list,
>
> I would be grateful for any hints/insight :)
>
> I'm trying to achieve the following: when there is an incoming request, I
> want to set a time limit in which an answer should be delivered to the
> client, no matter what. However, since the work triggered by the initial
> request (there is another request to another site involved) might take much
> longer than that time limit, I want that work to finish properly, despite
> the fact that the initial request was 'served' already.
> [...]
In agreement with Perrin, and to expand a bit:
To go back some 20 years: the original design of HTTP and of webservers was not
really meant for client requests that take a long time to process.
When a browser makes a request to a server, it will wait for a response for a
maximum of about 5 minutes; if by then it has not received one, it will close
the connection and display an error like "this server appears to be busy and
does not respond".
And since the connection is now closed, whenever the server finally tries to
send back a response, it will find no connection to send it on, abort the
request processing at that point, and write an error message to the error log.
But you seem to know all that already, which is probably why you are sending a
response to the browser no matter what, before this timeout occurs.
However, the way in which you are currently doing this is something of a
"perversion" of the protocol, because
- you are sending a response to the browser saying that everything is ok, so as
far as the browser is concerned this request is finished and it can move on to
the next one (and/or close the connection);
- but on the other hand, the request-processing process under Apache is still
running, for this request and this client.
If that request-processing process later needs to send something to the client
(some error, for example), it will find the connection gone and be unable to
do so.
(And because what you are doing is not a natural thing to do, you will not find
any standard module or interface or API for that kind of thing.)
The "canonical" way to do this would be something like:
- the client sends the request to the server
- the server allocates a process (or thread, or whatever) to process this request
- this request-processing process "delegates" the browser request to some other,
independent-of-the-webserver process, which can take as long as necessary to
fulfill the (background part of the) request
- the request-processing process does not wait for the response or the exit of
that independent process, but returns a response to the client browser right
away (such as "Thank you for your request. It is being handled by our
back-office. You will receive an email when it's done.")
- and then, as far as the webserver is concerned, this client request is
finished (cleanly), and the request-processing process can be re-allocated to
some other incoming request.
Optionally, you could provide a way for the client to periodically inquire
about the progress of its request.
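The "delegate and answer right away" flow above can be sketched with a
spool-directory queue. Everything here (the enqueue_job() helper, the file
layout, the idea of answering with a job id) is a hypothetical illustration for
this sketch, not part of mod_perl or of any standard module:

```perl
#!/usr/bin/perl
# Sketch: the webserver side only drops a job file into a spool directory
# and answers immediately; a separate worker does the slow work later.
# (enqueue_job() and the spool layout are assumptions made for this example.)
use strict;
use warnings;
use File::Temp qw(tempdir);

# Atomically place a job file into the spool directory.
sub enqueue_job {
    my ($spool, $job_id, $payload) = @_;
    my $tmp   = "$spool/.$job_id.tmp";
    my $final = "$spool/$job_id.job";
    open my $fh, '>', $tmp or die "open $tmp: $!";
    print {$fh} $payload;
    close $fh or die "close $tmp: $!";
    # rename() is atomic within one filesystem, so the worker never
    # sees a half-written job file.
    rename $tmp, $final or die "rename $tmp -> $final: $!";
    return $final;
}

# Demo: queue one job. A mod_perl handler would call enqueue_job() and
# then immediately send its "Thank you, it is being handled" page.
my $spool = tempdir(CLEANUP => 1);
my $path  = enqueue_job($spool, 'job-0001', "fetch the other site\n");
print "queued: $path\n";
```

The atomic rename is the one design point worth copying even into a real
system: the worker can safely pick up anything matching *.job.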
The tricky bit is to have the Apache request-processing process in which you
start out
- either itself spawn a totally independent secondary process that goes off and
fulfills the long-running part of the request. This is tricky to do right, and
it is easy to overwhelm your server.
- or (probably simpler) just pass the request to an already-running,
independent server process which will do the long-running part.
This is what Perrin refers to as a "job queue" system.
You can develop such a "job queue" system yourself, or you can use a
ready-made one. There are such things within the Apache projects, and if you
want Perl, you will find several on CPAN (see POE, for example).
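To illustrate the consuming side of such a "job queue", here is a toy
stand-alone worker that drains a spool directory of job files. The layout is
purely an assumption made for this sketch; a real deployment would use one of
the ready-made queue systems mentioned above:

```perl
#!/usr/bin/perl
# Sketch: an independent worker process that picks up *.job files from a
# spool directory and runs the slow work the webserver delegated to it.
# (The spool layout is a hypothetical convention for this example.)
use strict;
use warnings;

# Process every pending job once; returns the number of jobs handled.
sub drain_spool {
    my ($spool, $handler) = @_;
    my $done = 0;
    for my $job (sort glob("$spool/*.job")) {
        open my $fh, '<', $job or next;   # another worker may have taken it
        local $/;
        my $payload = <$fh>;
        close $fh;
        $handler->($job, $payload);       # the slow, long-running work
        unlink $job or warn "unlink $job: $!";
        $done++;
    }
    return $done;
}

# A daemonised worker would wrap this in a polling loop, e.g.:
#   while (1) { drain_spool($spool, \&do_work); sleep 2; }
```

This single-process loop deliberately ignores locking; with several workers on
one spool you would need an atomic claim step (rename into a per-worker
directory, for instance) before processing.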
I would guess that this is all a bit more complicated than what you envisioned
initially, but that is the case with many such things.
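P.S. For completeness: the "start a totally independent secondary process"
variant mentioned above is the tricky-to-do-right one. The classic recipe is a
double fork plus setsid(), so the grandchild is re-parented to init and is not
tied to the Apache worker's lifetime. A minimal sketch in generic Perl (this is
not a mod_perl API, just standard POSIX calls):

```perl
#!/usr/bin/perl
# Sketch: detach a fully independent child via double fork + setsid.
# The grandchild runs $work->() on its own; the caller returns immediately.
use strict;
use warnings;
use POSIX qw(setsid _exit);

sub spawn_detached {
    my ($work) = @_;
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {                      # first child
        setsid() or die "setsid: $!";     # new session, no controlling terminal
        my $pid2 = fork();
        _exit(1) unless defined $pid2;
        _exit(0) if $pid2;                # first child exits immediately...
        $work->();                        # ...so the grandchild is an orphan
        _exit(0);                         # and init reaps it when it is done
    }
    waitpid($pid, 0);                     # reap the short-lived first child
    return 1;
}
```

A real version would also chdir('/'), reset the umask and reopen STDIN/STDOUT/
STDERR; and under mod_perl, forking from inside the server has extra caveats of
its own, which is exactly why the job-queue route is usually simpler.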