Re: close connection for request, but continue
Iosif: You will need:
[] A background state storage location (a database table with a unique row ID, or a directory with a unique ID which points to the state file).
[] A user-facing request page that accepts the request, schedules the work, and responds with a page which auto-refreshes against the GUID and reports the status of the background request.
[] A user-facing status page that auto-refreshes to itself while the job is in motion.
[] A user-facing status page that auto-refreshes to a "you're done; your results are here / have been mailed to you" page.

Observations:
[] My corporate business users can follow along with a 15-second auto-refresh (as long as the page clearly indicates an auto-refresh in 15 seconds). Count-down timers are probably better.
[] My technical users close the pop-up tab after the first request (not caring for the intermediate status pages and knowing that the result has been accomplished or mailed to them).

Asides:
[] Some of our backend jobs take a long time (lots of data to grind through); these tend towards email status.
[] The database-based queue view (assuming you're internally facing only) allows your support teams to observe queued jobs (things which will happen in the future), active jobs (things running right now on some machine), completed jobs (jobs which succeeded), and failed jobs (jobs which did not succeed).

Hopefully these implementation specifics and operational observations assist you as you take André's excellent summary and put it all to work.

-- Frotz EMAN
Cisco Systems, Inc.

On 2016/04/21, 07:36, "André Warnier (tomcat)" wrote:
>On 21.04.2016 11:20, Iosif Fettich wrote:
>> Dear mod_perl list,
>>
>> please consider my gratefulness for any hints/insight :)
>>
>> I'm trying to achieve the following: when there is an incoming request, I want to set a
>> time limit in which an answer should be delivered to the client, no matter what.
>>
>> However, since the work triggered by the initial request (there is another request to
>> other site involved) might take much longer than that time limit, I want that work to
>> properly finish, despite the fact that the initial request was 'served' already.
>>
>
>The "canonical" way to do this, would be something like
>- the client sends the request to the server
>- the server allocates a process (or thread or whatever) to process this request
>- this request-processing process "delegates" this browser request to some other,
>independent-of-the-webserver process, which can take as long as necessary to fulfill the
>(background part of) the request
>- the request-processing process does not wait for the response or the exit of that
>independent process, but returns a response right away to the client browser (such as
>"Thank you for your request. It is being handled by our back-office. You will receive an
>email when it's done.")
>- and then, as far as the webserver is concerned, this client request is finished
>(cleanly), and the request-processing process can be re-allocated to some other incoming
>request
>
>Optionally, you could provide a way for the client to periodically enquire as to the
>advancement status of his request.
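[Editor's note: a minimal sketch of Frotz's second storage option above - a spool directory where each background request gets a state file named after its unique ID, which the auto-refreshing status page simply reads. All paths, function names, and status strings are illustrative, not from the thread; only core Perl modules are used.]

```perl
#!/usr/bin/perl
# State storage sketch: one state file per job GUID in a spool directory.
# A real deployment would use a shared, persistent directory, not tempdir.
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $spool = tempdir( CLEANUP => 1 );   # stand-in for a shared spool directory

sub _state_file { File::Spec->catfile( $spool, "$_[0].state" ) }

# Enqueue: the request handler records the job before responding.
sub enqueue {
    my ($guid) = @_;
    open my $fh, '>', _state_file($guid) or die "enqueue: $!";
    print {$fh} "queued\n";
    close $fh;
}

# Status: what the auto-refreshing page polls (queued|active|completed|failed).
sub job_status {
    my ($guid) = @_;
    open my $fh, '<', _state_file($guid) or return 'unknown';
    chomp( my $status = <$fh> );
    close $fh;
    return $status;
}

# Worker side: update the state file as the job progresses.
sub set_status {
    my ($guid, $status) = @_;
    open my $fh, '>', _state_file($guid) or die "set_status: $!";
    print {$fh} "$status\n";
    close $fh;
}

enqueue('abc-123');
print job_status('abc-123'), "\n";     # queued
set_status('abc-123', 'completed');
print job_status('abc-123'), "\n";     # completed
```

The database-table variant is the same idea with rows instead of files, and additionally gives support teams the queue view Frotz mentions.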
Re: close connection for request, but continue
On 21.04.2016 11:20, Iosif Fettich wrote:
> Dear mod_perl list,
>
> please consider my gratefulness for any hints/insight :)
>
> I'm trying to achieve the following: when there is an incoming request, I want to set a
> time limit in which an answer should be delivered to the client, no matter what.
>
> However, since the work triggered by the initial request (there is another request to
> other site involved) might take much longer than that time limit, I want that work to
> properly finish, despite the fact that the initial request was 'served' already.
[...]

In agreement with Perrin, and to expand a bit:

To go back some 20 years, let's say that the original design of HTTP and webservers was
not really thought out for client requests that take a long time to process. Browsers,
when they make a request to a server, will wait for a response for a maximum of about 5
minutes; if by then they have not received a response, they will close the connection and
display an error like "this server appears to be busy, and does not respond". And since
the connection is now closed, whenever in the end the server tried to send back a
response, it would find no connection to send it on, abort the request processing at that
point, and write some error message to the error log.

But you seem to already know all that, which is probably why you are sending a response
to the browser no matter what, before this timeout occurs. However, the way in which you
are currently doing this is kind of a "perversion" of the protocol, because
- you are sending a response to the browser saying that everything is ok (so for the
browser this request is terminated and it can go on with the next one and/or close the
connection)
- but on the other hand, the request-processing process under Apache is still running,
for this request and this client.

And if that request-processing process now, for whatever reason, had something to send to
the client (for example, some error), it would find the connection gone and be unable to
do so. (And because what you are doing is in fact not a natural thing to do, that is the
reason why you are not finding any standard module, interface or API to do that kind of
thing.)

The "canonical" way to do this would be something like:
- the client sends the request to the server
- the server allocates a process (or thread or whatever) to process this request
- this request-processing process "delegates" the browser request to some other,
independent-of-the-webserver process, which can take as long as necessary to fulfill the
(background part of) the request
- the request-processing process does not wait for the response or the exit of that
independent process, but returns a response right away to the client browser (such as
"Thank you for your request. It is being handled by our back-office. You will receive an
email when it's done.")
- and then, as far as the webserver is concerned, this client request is finished
(cleanly), and the request-processing process can be re-allocated to some other incoming
request

Optionally, you could provide a way for the client to periodically enquire as to the
advancement status of his request.

The tricky bit is to have the Apache request-processing process in which you are
originally:
- either itself start a totally independent secondary process that will go off and
fulfill the long-running part of the request. Tricky to do right, and easy to overwhelm
your server.
- or (probably simpler), just pass this request to an already-running independent server
process which will do this long-running part. This is what Perrin refers to as a "job
queue" system.

You can develop such a "job queue" system yourself, or you can use an already-made one.
There are such things within the Apache projects, or if you want Perl, you may find some
on CPAN (see POE, for example).

I would guess that this is all a bit more complicated than what you envisioned initially,
but that's the case with many such things.
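[Editor's note: the "start a totally independent secondary process" option André calls tricky can be sketched in plain Perl with the classic double-fork, so the worker is detached from the parent and the parent can answer at once. No Apache involved here; the marker file path and the sleep standing in for the long-running back request are placeholders.]

```perl
#!/usr/bin/perl
# Double-fork sketch: detach a worker, answer the "client" immediately.
use strict;
use warnings;
use POSIX qw(setsid _exit);

my $marker = "/tmp/job-$$.done";     # hypothetical result location

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {                   # first child: detach, then fork the worker
    setsid();
    my $worker = fork();
    defined $worker or _exit(1);
    _exit(0) if $worker;             # first child exits; worker is re-parented

    sleep 2;                         # stand-in for the long-running back request
    open my $fh, '>', $marker or _exit(1);
    print {$fh} "done\n";
    close $fh;
    _exit(0);
}

waitpid( $pid, 0 );                  # reaps immediately; worker runs on alone
print "Thank you for your request. It is being handled by our back-office.\n";
```

Because the intermediate child exits right away, the worker cannot become a zombie of the web server process, which is one of the pitfalls of forking directly from a request handler.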
Re: close connection for request, but continue
A job queue is also better because it stops uncontrolled forking or excessive numbers of
"dead" web connections hanging around. It will just queue requests until resources are
available. You may find that handling many of these jobs in parallel eats up all your
processor/memory resources, whereas with queuing you can limit the number of processes
you have running in parallel. (And if your site gets bigger, you may be able to hand off
some of this to a cluster of machines to handle the long-running processes.)

On 4/21/2016 3:25 PM, Perrin Harkins wrote:
> On Thu, Apr 21, 2016 at 9:48 AM, Iosif Fettich wrote:
>> I'm afraid that won't fit, actually. It's not a typical Cleanup I'm after - I
>> actually want to not abandon the request I've started, just for closing the incoming
>> original request. The cleanup handler could relaunch the slow back request - but
>> doing so I'd pay twice for it.
>
> You don't have to. You can just return immediately, and do all the work in the cleanup
> (or a job queue) while you let the client poll for status. It's a little extra work
> for simple requests, but it means all requests are handled the same and you never make
> extra requests to your expensive backend.
>
> If you're determined not to do polling from the client, your best bet is probably to
> fork immediately and do the work in the fork, while you poll to check if it's done in
> your original process. You'd have to write the response to a database or something
> that the original process can pick it up from. But forking from mod_perl is a pain and
> easy to mess up, so I recommend doing one of the other approaches.
>
> - Perrin

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity
registered in England with number 1021457 and a company registered in England with
number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
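[Editor's note: the parallelism cap described in the message above - queue everything, but never run more than N jobs at once - can be sketched as a small fork-based consumer loop. The job list, the cap of 2, and the sub-second sleep standing in for real work are all illustrative.]

```perl
#!/usr/bin/perl
# Queue consumer that limits how many jobs run in parallel, instead of
# forking once per incoming request.
use strict;
use warnings;
use POSIX ':sys_wait_h';

my $MAX_PARALLEL = 2;
my @queue   = ( 1 .. 5 );    # stand-ins for queued job IDs
my %running;                 # pid => job ID of jobs in flight

while ( @queue or %running ) {

    # Top up to the limit while there is queued work.
    while ( @queue and keys %running < $MAX_PARALLEL ) {
        my $job = shift @queue;
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ( $pid == 0 ) {
            # Child: "run" the job (placeholder for the real work).
            select undef, undef, undef, 0.1;
            exit 0;
        }
        $running{$pid} = $job;
    }

    # Block until some worker finishes, then free its slot.
    my $pid = waitpid( -1, 0 );
    delete $running{$pid} if $pid > 0;
}

print "all jobs drained\n";
```

At no point do more than $MAX_PARALLEL children exist, so a burst of expensive requests degrades into a longer queue rather than a resource spike.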
Re: close connection for request, but continue
On Thu, Apr 21, 2016 at 9:48 AM, Iosif Fettich wrote:
> I'm afraid that won't fit, actually. It's not a typical Cleanup I'm after
> - I actually want to not abandon the request I've started, just for closing
> the incoming original request. The cleanup handler could relaunch the slow
> back request - but doing so I'd pay twice for it.

You don't have to. You can just return immediately, and do all the work in the cleanup
(or a job queue) while you let the client poll for status. It's a little extra work for
simple requests, but it means all requests are handled the same and you never make extra
requests to your expensive backend.

If you're determined not to do polling from the client, your best bet is probably to
fork immediately and do the work in the fork, while you poll to check if it's done in
your original process. You'd have to write the response to a database or something that
the original process can pick it up from. But forking from mod_perl is a pain and easy
to mess up, so I recommend doing one of the other approaches.

- Perrin
Re: close connection for request, but continue
Hi Perrin,

>> I'm trying to achieve the following: when there is an incoming request, I want to set
>> a time limit in which an answer should be delivered to the client, no matter what.
>>
>> However, since the work triggered by the initial request (there is another request to
>> other site involved) might take much longer than that time limit, I want that work to
>> properly finish, despite the fact that the initial request was 'served' already.
>
> TMTOWTDI, but the common way to do this is to add the long-running job to a job queue,
> and then redirect the user to a page that periodically checks if the job is done by
> using JavaScript requests.

It's not such a typical long-running job that I'm doing. It rather goes like this: while
I can most of the time answer with what I have within the acceptable answer time, I
sometimes have to make another request in the background. That too is most of the time
served within acceptable time; _sometimes_ it isn't, so only occasionally it takes more.

The catch: let's say the backend service is pay-per-use, so I definitely don't want to
throw away a started request. If I have launched a request in the back, I want to get
the results, even if the initial requester was turned down in the meantime.

> If you don't have a job queue and don't want to add one just for this, you could use a
> cleanup handler to run the slow stuff after disconnecting:
>
> http://perl.apache.org/docs/2.0/user/handlers/http.html#PerlCleanupHandler

I'm afraid that won't fit, actually. It's not a typical Cleanup I'm after - I actually
want to not abandon the request I've started, just for closing the incoming original
request. The cleanup handler could relaunch the slow back request - but doing so I'd pay
twice for it.

> That will tie up a mod_perl process though, so it's not a good way to go for large
> sites.

I'm aware of that, but that's less of a concern for now.

Many thanks,

Iosif Fettich
Re: close connection for request, but continue
On Thu, Apr 21, 2016 at 5:20 AM, Iosif Fettich wrote:
>
> I'm trying to achieve the following: when there is an incoming request, I
> want to set a time limit in which an answer should be delivered to the
> client, no matter what.
>
> However, since the work triggered by the initial request (there is another
> request to other site involved) might take much longer than that time
> limit, I want that work to properly finish, despite the fact that the
> initial request was 'served' already.

TMTOWTDI, but the common way to do this is to add the long-running job to a job queue,
and then redirect the user to a page that periodically checks if the job is done by
using JavaScript requests.

If you don't have a job queue and don't want to add one just for this, you could use a
cleanup handler to run the slow stuff after disconnecting:

http://perl.apache.org/docs/2.0/user/handlers/http.html#PerlCleanupHandler

That will tie up a mod_perl process though, so it's not a good way to go for large
sites.

- Perrin
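[Editor's note: for readers landing here later, the cleanup-handler wiring Perrin links to looks roughly like the fragment below in mod_perl 2. Package and sub names are hypothetical, and as Perrin says, the slow work still occupies the Apache child for its duration; this is a sketch of the pattern, not a drop-in solution.]

```perl
package My::Handler;    # illustrative name

use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestUtil ();
use Apache2::Const -compile => qw(OK);

sub handler {
    my $r = shift;

    # Queue the slow work for the cleanup phase, which runs after the
    # response has been sent to the client.
    $r->push_handlers( PerlCleanupHandler => \&do_slow_work );

    $r->content_type('text/plain');
    $r->print("Request accepted; results will follow.\n");
    return Apache2::Const::OK;
}

sub do_slow_work {
    my $r = shift;
    # ... the long-running back request goes here ...
    return Apache2::Const::OK;
}

1;
```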
close connection for request, but continue
Dear mod_perl list,

please consider my gratefulness for any hints/insight :)

I'm trying to achieve the following: when there is an incoming request, I want to set a
time limit in which an answer should be delivered to the client, no matter what.

However, since the work triggered by the initial request (there is another request to
other site involved) might take much longer than that time limit, I want that work to
properly finish, despite the fact that the initial request was 'served' already.

I first thought that using alarm and closing the connection would just work, so my
initial code was somewhat like

---
#!/usr/bin/perl
use strict;
use warnings;

use Apache2::RequestUtil;
use Apache2::Connection;
use Apache2::Const -compile => qw(:common :http);

local our $r = shift;
local our $t_start = time();
local $SIG{ALRM} = sub { _force_early_response( $t_start ) };
local our $it_took_too_long = 0;

our $alarm_time = 5;    # seconds we allow to process a response
alarm $alarm_time;

# start working for the request
# ...
# work done

alarm 0;

return Apache2::Const::OK;

sub _force_early_response {
    my ($t1) = @_;
    $it_took_too_long = 1;
    my $t2 = time();
    $r->assbackwards(1);
    my $response_text = _sorry_but_that_will_be_ready_only_later();
    $r->print( "HTTP/1.1 200 OK\n"
              ."Date: Wed, 20 Apr 2016 10:55:08 GMT\n"
              ."Server: Apache/2.2.31 (Amazon)\n"
              ."Content-Type: text/plain; charset=UTF-8\n" );
    $r->print( "\n$response_text" );
    my $c = $r->connection();
    my $socket = $c->client_socket;
    $socket->close();
    return;
}
---

That didn't work - the connection just was not closed.
A second attempt succeeded, where I changed the alarm trap to be

---
sub _force_early_response {
    my ($t1) = @_;
    $it_took_too_long = 1;
    my $t2 = time();
    $r->assbackwards(1);
    my $response_text = _sorry_but_that_will_be_ready_only_later();
    my $content_length = length( $response_text );
    $r->print( "HTTP/1.1 200 OK\n"
              ."Date: Wed, 20 Apr 2016 10:55:08 GMT\n"
              ."Server: Apache/2.2.31 (Amazon)\n"
              ."Connection: close\n"
              ."Content-Type: text/plain; charset=UTF-8\n"
              ."Content-Length: $content_length\n" );
    $r->print( "\n$response_text" );
    return;
}
---

So while I now have something that seems to work OK and does exactly what I wanted, I'm
a bit unhappy with the code. I did look around, but I wasn't able to find any
mod_perl/library function that would make the 'close connection and go on' code easier.
Especially for setting the status line, which I now do with

$r->print( "HTTP/1.1 200 OK\n" )

I thought that there would be some ready-made tools available. I actually hoped to get
away without having to explicitly write out 'hand-crafted' headers, but so far I wasn't
able to find anything for that. Maybe I just didn't look into the right places?

I'm running this on an updated Amazon Linux machine, vanilla httpd and mod_perl:

$ rpm -qi httpd
Name        : httpd
Version     : 2.2.31
Release     : 1.7.amzn1

$ rpm -qi mod_perl
Name        : mod_perl
Version     : 2.0.7
Release     : 7.27.amzn1

Many thanks in advance for any ideas, hints, comments.

Iosif Fettich