Re: Connection Pooling / TP Monitor
On Mon, Nov 06, 2000 at 09:19:04PM -0500, Thomas A. Lowery wrote:
On Mon, Nov 06, 2000 at 04:19:13PM +, Tim Bunce wrote:
On Thu, Nov 02, 2000 at 10:10:09PM -0800, Perrin Harkins wrote:
Tim Bunce wrote:

You could have a set of apache servers that are 'pure' DBI proxy servers. That is, they POST requests containing SQL (for prepare_cached) plus bind parameter values and return responses containing the results. Basically I'm proposing that apache be used as an alternative framework for DBI::ProxyServer. Almost all the marshaling code and higher level logic is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do and you'd gain in all sorts of ways.

I think this is a really good idea. The thing is, any effort put into this kind of thing right now feels like a throw away, because mod_perl 2.0 will solve the problem in the right way with real pooling of database handles (and other objects) between threads. Maybe it's time for DBD:: authors to start checking their code for thread safety?

Yep.

How about an explanation of how to test a pure Perl driver for thread safety, and/or what types of code we need to check for or look into?

I'd hope that the Apache 2 docs would include a section on thread safety and how to check/change old code.

Tim.
Re: Connection Pooling / TP Monitor
What I really dislike about this discussion is that it mixes two topics that are, IMO, really different:

- Using a pool of database connections from an application, typically a threaded application.
- Accessing a database connection which really lives on another machine via some sort of remote access protocol. SOAP and CORBA have been mentioned in the discussion; the DBI Proxy uses a more lightweight protocol (from the view of Perl) implemented by RPC::PlServer and RPC::PlClient.

In the past Tim Bunce has proposed implementing the pool via a special DBI driver, DBD::Pool say. For example, this could be done via a DSN like

DBI:Pool:pool_maxwaiting=10;dsn=DBI:Oracle:trueDsn

much in the style of DBD::Proxy. I still believe that this approach has incredible advantages, in particular:

- It solves the problem that the connection obtained from the pool has to match certain settings. For example, the last user of the connection might have set RaiseError = 1 although the initial value was different. A DBD::Pool driver could quite well be aware of the initial value and of possible changes.
- It would allow the pool to be turned on or off by simply changing the DSN.
- Any portable DBI application could make use of such a pool.

What remains is the question of using a remote connection. First of all, note that DBD::Pool would allow the pool to be maintained on both sides of the remote connection: it could be on the server side (which carries the actual database connection) and on the client side. For example, the latter would make sense in the case of Apache.

Bye, Jochen
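As a rough illustration of the DSN idea in the message above, here is a minimal Python sketch. DBD::Pool was only a proposal, so everything here is an assumption: the function parse_pool_dsn, the class PooledHandle and the way options are split are hypothetical, and only the DSN string itself comes from the message. The sketch shows two things: separating pool options from the wrapped DSN, and restoring a handle's initial attributes (such as RaiseError) before it goes back to the pool.

```python
def parse_pool_dsn(dsn):
    """Split 'DBI:Pool:opt=val;...;dsn=<real dsn>' into (options, real_dsn).

    Hypothetical parser: the proposal did not specify how the DSN is parsed.
    """
    prefix = "DBI:Pool:"
    if not dsn.startswith(prefix):
        raise ValueError("not a pool DSN: %r" % dsn)
    rest = dsn[len(prefix):]
    # Everything after the first 'dsn=' is the wrapped DSN; it may itself
    # contain ':' and ';', so it is not split any further.
    head, sep, real_dsn = rest.partition("dsn=")
    if not sep or not real_dsn:
        raise ValueError("pool DSN has no wrapped dsn= part")
    options = {}
    for item in head.split(";"):
        if item:
            key, _, value = item.partition("=")
            options[key] = value
    return options, real_dsn


class PooledHandle:
    """Stand-in for a pooled handle that remembers its initial attributes."""

    def __init__(self, real_dsn, initial_attrs):
        self.real_dsn = real_dsn
        self._initial = dict(initial_attrs)
        self.attrs = dict(initial_attrs)

    def reset(self):
        # Undo whatever the last user changed, e.g. RaiseError = 1.
        self.attrs = dict(self._initial)


options, real_dsn = parse_pool_dsn(
    "DBI:Pool:pool_maxwaiting=10;dsn=DBI:Oracle:trueDsn")
handle = PooledHandle(real_dsn, {"RaiseError": 0})
handle.attrs["RaiseError"] = 1   # the last user changes a setting
handle.reset()                   # the pool restores the initial value
print(options, real_dsn, handle.attrs)
```

Because the pool is selected purely by the DSN prefix, an application would switch pooling off by passing the wrapped DSN directly, which is the portability point the message makes.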
Re: Connection Pooling / TP Monitor
Tim Bunce wrote: You could have a set of apache servers that are 'pure' DBI proxy servers. That is, they POST requests containing SQL (for prepare_cached) plus bind parameter values and return responses containing the results. Basically I'm proposing that apache be used as an alternative framework for DBI::ProxyServer. Almost all the marshaling code and higher level logic is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do and you'd gain in all sorts of ways. I think this is a really good idea. The thing is, any effort put into this kind of thing right now feels like a throw away, because mod_perl 2.0 will solve the problem in the right way with real pooling of database handles (and other objects) between threads. Maybe it's time for DBD:: authors to start checking their code for thread safety? - Perrin
Re: Connection Pooling / TP Monitor
At 01:01 PM 10/30/2000 -0600, Leslie Mikesell wrote: According to Gunther Birznieks: I guess part of the question is what is meant by "balanced" with regard to the non-apache back-end servers that was mentioned? I'd be very happy with either a weighted round-robin or a least-connections choice. When the numbers get to the point where it matters, pure statistics is good enough for me. But, I love what you can do with mod_rewrite and would like an easy way to point the target of a match at an arbitrary set of back end servers. Mod_jserv has a nice configuration setting for multiple back ends where you name the set and weight each member. If mod_proxy and/or mod_backhand had a similar concept with the group name being usable as a target for mod_rewrite and ProxyPass it would be easy to use. I think Matt's idea of creating a Location handler and rewriting to the location would work as long as the modules are loaded in the right order, but it would make the configuration somewhat confusing. Mod_Rewrite supports this by writing a custom script to do it (not easy). Mod_Backhand does it natively but is a bit harder than you describe for mod_jserv. With mod_backhand, you basically can set up IP Addresses/Hostnames which can be weeded out for different things. eg some servers only get load balanced during the day because at night they are sending SPAM mail. (Theo's example not mine!) I am also concerned that the original question brings up the notion of failover. mod_backhand is not a failover solution. Backhand does have some facilities to do some failover (eg ByAge weeding) but it's not failover in the traditional sense. Backhand is for load balance not failover. Does it do something sensible if one of the targets does not accept the connection, or does it start sending them all to that one because it isn't busy? Mod_jserv claims to mark that connection dead for a while and moves on to another backend so you have a small delay, not a failure. 
After a configurable timeout it will try the failing one again.

There are no connections ever marked as dead. They are only marked as not having checked in with their status. If they don't check in with the status within 20-30 seconds, mod_backhand takes them off the list using the ByAge candidacy function. If a server comes back up again and reports its status, then it will come back into the list.

While Matt is correct that you could probably write your own load balance function, the main interesting function in mod_backhand is ByLoad which as far as I know is Apache specific and relies on the Apache scoreboard (or a patched version of this)

The problem of writing your own is that it needs to be in the lightweight server - thus all in C.

My understanding is that Apache::Backhand for mod_perl exists. So you could theoretically reserve mod_perl on your front end server just for writing backhand logic. Since you can preload the handlers and they would presumably be small, tight code, it wouldn't have the same effect as running an application on the front-end. But you're right. C is the preferred method.

Non apache servers won't have this scoreboard file although perhaps you could program your own server(s) to emulate one if it's not mod_backhand. The other requirement that non-apache servers may have for optimal use with mod_backhand is that the load balanced servers may need to report themselves to the main backhand server as one of the important functions is ByAge to weed out downed servers (and servers too heavily loaded to report their latest stats).

If a failed connection would set the status as 'down' and periodic retries checked again, this would take care of itself.

Yes, but I don't think mod_backhand does this. At least it's not described in the docs. That's why it's considered load balancing not a failover solution. However, it is my belief that some simple things like this do belong in backhand.
My caveat to this opinion is that I do not use backhand and am not an expert on the reasons for its design decisions.

Otherwise, if you need to load balance a set of non-apache servers evenly and don't need ByLoad, you could always just use mod_rewrite with the reverse_proxy/load balancing recipe from Ralf's guide. This solution would get you up and running fast. But the main immediate downside (other than no true *load* balancing) is the lack of keep-alive upgrading.

I'll accept randomizing as reasonable balancing as long as I have fine grained control of the URLs I send to each destination. The real problem with the rewrite randomizer is the complete lack of knowledge about dead backend servers. I want something that will transparently deal with machines that fail.

Yes. mod_rewrite won't do that. But neither will backhand I believe.

I am also not sure if mod_log_spread has hooks to work with mod_backhand in particular which would make mod_rewrite load balancing (poor man's load
RE: Connection Pooling / TP Monitor
-----Original Message-----
From: G.W. Haywood [mailto:[EMAIL PROTECTED]]
Sent: Sunday, October 29, 2000 6:37 AM
To: Gunther Birznieks
Cc: [EMAIL PROTECTED]
Subject: Re: Connection Pooling / TP Monitor

Hi guys,

On Mon, 30 Oct 2000, Gunther Birznieks wrote:
At 09:24 AM 10/29/00 +, Matt Sergeant wrote:
On Sat, 28 Oct 2000, Les Mikesell wrote:

Load balancing, failover, etc.

Really useful stuff guys, how about when you write messages like this putting in some (full) URIs for reference? Most of the time it isn't immediately necessary, I know, but I'm thinking that it would make it so very easy for Geoff Y to cut and paste into the DIGEST. People who are floundering around looking for the stuff might get a flying start.

thanks for thinking of me... already on top of it, though ;)

www.backhand.org

at the bottom of the page Theo has his presentations from both ApacheCons... wicked cool stuff...

HTH

--Geoff

73, Ged.
Re: Connection Pooling / TP Monitor
According to Gunther Birznieks: I guess part of the question is what is meant by "balanced" with regard to the non-apache back-end servers that was mentioned? I'd be very happy with either a weighted round-robin or a least-connections choice. When the numbers get to the point where it matters, pure statistics is good enough for me. But, I love what you can do with mod_rewrite and would like an easy way to point the target of a match at an arbitrary set of back end servers. Mod_jserv has a nice configuration setting for multiple back ends where you name the set and weight each member. If mod_proxy and/or mod_backhand had a similar concept with the group name being usable as a target for mod_rewrite and ProxyPass it would be easy to use. I think Matt's idea of creating a Location handler and rewriting to the location would work as long as the modules are loaded in the right order, but it would make the configuration somewhat confusing. I am also concerned that the original question brings up the notion of failover. mod_backhand is not a failover solution. Backhand does have some facilities to do some failover (eg ByAge weeding) but it's not failover in the traditional sense. Backhand is for load balance not failover. Does it do something sensible if one of the targets does not accept the connection, or does it start sending them all to that one because it isn't busy? Mod_jserv claims to mark that connection dead for a while and moves on to another backend so you have a small delay, not a failure. After a configurable timeout it will try the failing one again. While Matt is correct that you could probably write your own load balance function, the main interesting function in mod_backhand is ByLoad which as far as I know is Apache specific and relies on the Apache scoreboard (or a patched version of this) The problem of writing your own is that it needs to be in the lightweight server - thus all in C. 
Non apache servers won't have this scoreboard file although perhaps you could program your own server(s) to emulate one if it's not mod_backhand. The other requirement that non-apache servers may have for optimal use with mod_backhand is that the load balanced servers may need to report themselves to the main backhand server as one of the important functions is ByAge to weed out downed servers (and servers too heavily loaded to report their latest stats).

If a failed connection would set the status as 'down' and periodic retries checked again, this would take care of itself.

Otherwise, if you need to load balance a set of non-apache servers evenly and don't need ByLoad, you could always just use mod_rewrite with the reverse_proxy/load balancing recipe from Ralf's guide. This solution would get you up and running fast. But the main immediate downside (other than no true *load* balancing) is the lack of keep-alive upgrading.

I'll accept randomizing as reasonable balancing as long as I have fine grained control of the URLs I send to each destination. The real problem with the rewrite randomizer is the complete lack of knowledge about dead backend servers. I want something that will transparently deal with machines that fail.

I am also not sure if mod_log_spread has hooks to work with mod_backhand in particular which would make mod_rewrite load balancing (poor man's load balancing) less desirable. I suspect mod_log_spread is not backhand-specific although made by the same group but having not played with this module yet, I couldn't say for sure.

If you can run everything through a single front end apache you can use that as the 'real' log. There is some point where this scheme would not handle the load and you would need one of the connection oriented balancers instead of a proxy, but a fairly ordinary pentium should be able to saturate an ethernet or two if it is just fielding static files and proxying the rest.
You would also need a fail-over mechanism for the front end box, but this could be a simple IP takeover and there are some programs available for that. Les Mikesell [EMAIL PROTECTED]
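The behaviour this message asks for (a weighted round-robin over named back ends, with a failed back end marked dead for a while and retried after a configurable timeout) can be sketched in a few lines of Python. This is only an illustration of the idea as described in the message; it is not how mod_jserv, mod_proxy or mod_backhand are implemented, and the class Balancer, its methods and the back-end names are all hypothetical.

```python
class Balancer:
    """Weighted round-robin that skips back ends marked dead until a retry time."""

    def __init__(self, weights, retry_after=30.0):
        # A back end with weight w appears w times in the rotation.
        self._rotation = [name for name, w in weights.items() for _ in range(w)]
        self._next = 0
        self._dead_until = {}
        self._retry_after = retry_after

    def mark_dead(self, name, now):
        # Called when a connection attempt to `name` fails.
        self._dead_until[name] = now + self._retry_after

    def choose(self, now):
        # Walk the rotation at most once, skipping back ends still marked dead.
        for _ in range(len(self._rotation)):
            name = self._rotation[self._next]
            self._next = (self._next + 1) % len(self._rotation)
            if self._dead_until.get(name, 0.0) <= now:
                return name
        return None  # every back end is currently marked dead


balancer = Balancer({"app1": 2, "app2": 1}, retry_after=30.0)
print([balancer.choose(now=0) for _ in range(3)])  # ['app1', 'app1', 'app2']
balancer.mark_dead("app1", now=0)
print(balancer.choose(now=1))    # app2: app1 is skipped while marked dead
print(balancer.choose(now=31))   # app1: the retry timeout has passed
```

The caller is expected to invoke mark_dead when a connection is refused and then call choose again, which gives the "small delay, not a failure" behaviour the message describes.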
Re: Connection Pooling / TP Monitor
I guess part of the question is what is meant by "balanced" with regard to the non-apache back-end servers that was mentioned? I am also concerned that the original question brings up the notion of failover. mod_backhand is not a failover solution. Backhand does have some facilities to do some failover (eg ByAge weeding) but it's not failover in the traditional sense. Backhand is for load balance not failover. While Matt is correct that you could probably write your own load balance function, the main interesting function in mod_backhand is ByLoad which as far as I know is Apache specific and relies on the Apache scoreboard (or a patched version of this) Non apache servers won't have this scoreboard file although perhaps you could program your own server(s) to emulate one if it's not mod_backhand. The other requirement that non-apache servers may have for optimal use with mod_backhand is that the load balanced servers may need to report themselves to the main backhand server as one of the important functions is ByAge to weed out downed servers (and servers too heavily loaded to report their latest stats). Otherwise, if you need to load balance a set of non-apache servers evenly and don't need ByLoad, you could always just use mod_rewrite with the reverse_proxy/load balancing recipe from Ralf's guide. This solution would get you up and running fast. But the main immediate downside (other than no true *load* balancing) is the lack of keep-alive upgrading. I am also not sure if mod_log_spread has hooks to work with mod_backhand in particular which would make mod_rewrite load balancing (poor man's load balancing) less desirable. I suspect mod_log_spread is not backhand-specific although made by the same group but having not played with this module yet, I couldn't say for sure. 
At 09:24 AM 10/29/00 +, Matt Sergeant wrote:
On Sat, 28 Oct 2000, Les Mikesell wrote:

Is there any way to tie proxy requests mapped by mod_rewrite to a balanced set of servers through mod_backhand (or anything similar)? Also, can mod_backhand (or any alternative) work with non-apache back end servers? I'm really looking for a way to let mod_rewrite do the first cut at deciding where (or whether) to send a request, but then be able to send to a load balanced, fail over set, preferably without having to interpose another physical proxy.

From what I could make out, I think you should be able to use backhand only within certain Location sections, and therefore have a request come in outside of that, rewrite it to a backhand enabled location and have backhand do its thing. Should work, but you'd have to try it. Alternatively write your own decision module for backhand. There's even a mod_perl module to do so (although apparently it needs patching for the latest version of mod_backhand).

--
Matt
Director and CTO, AxKit.com Ltd
XML Application Serving: http://axkit.org (XSLT, XPathScript, XSP)
Personal Web Site: http://sergeant.org/

__
Gunther Birznieks ([EMAIL PROTECTED])
eXtropia - The Web Technology Company
http://www.extropia.com/
Re: Connection Pooling / TP Monitor
Hi guys,

On Mon, 30 Oct 2000, Gunther Birznieks wrote:
At 09:24 AM 10/29/00 +, Matt Sergeant wrote:
On Sat, 28 Oct 2000, Les Mikesell wrote:

Load balancing, failover, etc.

Really useful stuff guys, how about when you write messages like this putting in some (full) URIs for reference? Most of the time it isn't immediately necessary, I know, but I'm thinking that it would make it so very easy for Geoff Y to cut and paste into the DIGEST. People who are floundering around looking for the stuff might get a flying start.

73, Ged.
Re: Connection Pooling / TP Monitor
Gunther Birznieks [EMAIL PROTECTED] writes:

I am also concerned that the original question brings up the notion of failover. mod_backhand is not a failover solution. Backhand does have some facilities to do some failover (eg ByAge weeding) but it's not failover in the traditional sense. Backhand is for load balance not failover.

Are we talking about failing "out" a server that's lost the plot, or bringing a new server "in" as well? Isn't it just a case of defaulting the apparent load of a failed machine up really high (like infinite)?

--
Dave Hodgkinson, http://www.hodgkinson.org
Editor-in-chief, The Highway Star http://www.deep-purple.com
Apache, mod_perl, MySQL, Sybase hired gun for, well, hire
Re: Connection Pooling / TP Monitor
At 12:21 PM 10/29/00 +, David Hodgkinson wrote:
Gunther Birznieks [EMAIL PROTECTED] writes:

I am also concerned that the original question brings up the notion of failover. mod_backhand is not a failover solution. Backhand does have some facilities to do some failover (eg ByAge weeding) but it's not failover in the traditional sense. Backhand is for load balance not failover.

Are we talking about failing "out" a server that's lost the plot, or bringing a new server "in" as well? Isn't it just a case of defaulting the apparent load of a failed machine up really high (like infinite)?

This question gets into the realm of stuff that I am not really well qualified to answer. However, I think the way it works is that there are several candidacy functions that slowly whittle down the list of servers to direct a given request to. The simulation of a failed machine defaulting to infinite load is a bit odd in mod_backhand for a couple of reasons.

1) The ByLoad candidacy function relies on resource information having been broadcast by potential backhand destinations, not on something that is collected by the backhand origin. Should a backhand destination server go down, it will not broadcast itself and ByLoad will not see a resource update. In my experience, few servers ever know they are going down before something catastrophic happens. They may complain about something but they don't know they are going down. Of course, there are cases when a machine knows it is on the doomed list, but I would argue that this is a rare case unfortunately. In other words, the way mod_backhand's ByLoad function works would require mod_psychic to be compiled as well. :)

2) This then leads to the natural thing that you were probably thinking (?)... which is that ByLoad might end up pinging the destination server to make sure it is up before distributing the load to it. Unfortunately, I don't think that this is in ByLoad.
Or at least it's not documented at http://www.backhand.org/mod_backhand/

Also, Theo's slide http://www.backhand.org/ApacheCon2000/EU/img4.htm explicitly x'ed out the fail-over part of mod_backhand as a solution.

However, the question is at what point the ping candidacy function would be written. If you write it too early, you waste time pinging all the servers. If you write it too late, you might have too few machines to test.

Let's go through an example of this. Destination servers 1,2,3,4,5,6,7,8 are mod_backhand'ed. Let's assume that the load is lowest on the lowest numbered server and highest on the highest numbered server. A reasonable example set of candidacy functions is the following: ByAge, ByRandom, ByLog, ByLoad.

Let's assume that servers 5-8 have just gone down because someone decided to purchase one big UPS for all 4 servers instead of separate ones, and the UPS just burned out and also shorted out the power when this happened, causing all 4 servers to go down. So let's say 5 seconds have gone by with requests.

1. ByAge says that they have all responded within the last 20 seconds (this is the default). Now, this provides some fail over but 20 seconds can be a long time for a server to be down and not weeded out. In this case, 5 seconds have gone by and all 8 are seen by backhand as being up.
2. ByRandom randomizes the list (1,2,3,4,5,6,7,8). Let's say this becomes 8,6,5,1,3,2,7,4.
3. ByLog strips everything but the first log2(n) servers (where n is the number of elements in the list). Thus, for 8 elements, we get 3 now: 8,6,5.
4. ByLoad checks out the load and then distributes the request to 5, which has the lowest load. But whoops, 5 is down. Remember 5-8 went down.

Now it would be smart to build a ping into ByLoad but that still wouldn't help because actually 8,6,5 that are left after step 3 are all down too.
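The walk-through above can be replayed with a small Python simulation. This is a sketch of the candidacy chain as the message describes it, not of mod_backhand's actual source: the function names by_age, by_log and by_load, the dictionary layout and the helper status_at are all assumptions made for illustration. The randomised order is passed in explicitly (starting 8, 6, 5 as in the message) so that the result is reproducible.

```python
def by_age(candidates, now, max_age=20):
    """Keep servers whose last status report is at most max_age seconds old."""
    return [s for s in candidates if now - s["last_seen"] <= max_age]


def by_log(candidates):
    """Keep the first floor(log2(n)) candidates (at least one if any remain)."""
    if not candidates:
        return []
    keep = max(1, len(candidates).bit_length() - 1)
    return candidates[:keep]


def by_load(candidates):
    """Keep only the candidate that reported the lowest load."""
    return sorted(candidates, key=lambda s: s["load"])[:1]


def pick(order, status, now):
    """Run ByAge, ByLog and ByLoad over an already randomised server order."""
    candidates = [status[i] for i in order]
    candidates = by_age(candidates, now)
    candidates = by_log(candidates)
    candidates = by_load(candidates)
    return candidates[0]["id"] if candidates else None


def status_at(now, failed=(5, 6, 7, 8), failed_at=0):
    """Servers 1-8 with load equal to their number; failed ones stop reporting."""
    return {i: {"id": i, "load": i,
                "last_seen": failed_at if i in failed else now}
            for i in range(1, 9)}


ORDER = [8, 6, 5, 1, 3, 2, 7, 4]   # stands in for the output of ByRandom

print(pick(ORDER, status_at(5), now=5))     # 5: a dead server is chosen
print(pick(ORDER, status_at(25), now=25))   # 1: ByAge has weeded out 5-8
```

Five seconds after the failure the chain still picks server 5, exactly as in the message; only once the 20 second ByAge window has passed do the dead servers drop out of the candidate list.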
You also can't write a ByPing candidacy function that runs first, because it basically means every request generates pings to every server asking if they are working, which would be quite intense and would defeat the performance advantage of the multicast broadcast of status data.

The moral is that to be more accurate mod_backhand would actually have to build something into the candidacy function to tell it to start all candidacy functions over from scratch and wipe that server off the list. If all the servers are down, then there's nothing to be done. But at least one will be up and this should be the chosen one.

However, my understanding from the mod_backhand talk and the documentation is that fail over is not an issue that is discussed as a goal of mod_backhand and that there are other products to recommend such as Alteon/BIGip/whatever switches or other such fail over products.

Anyway, I think that to some degree it does make sense that within the context of the original mod_backhand server distributing the connections, there should be some fail over for the destinations to back up the ByAge function at the very end of all the candidacy function
Re: Connection Pooling / TP Monitor
On Sat, 28 Oct 2000, Les Mikesell wrote:

Is there any way to tie proxy requests mapped by mod_rewrite to a balanced set of servers through mod_backhand (or anything similar)? Also, can mod_backhand (or any alternative) work with non-apache back end servers? I'm really looking for a way to let mod_rewrite do the first cut at deciding where (or whether) to send a request, but then be able to send to a load balanced, fail over set, preferably without having to interpose another physical proxy.

From what I could make out, I think you should be able to use backhand only within certain Location sections, and therefore have a request come in outside of that, rewrite it to a backhand enabled location and have backhand do its thing. Should work, but you'd have to try it. Alternatively write your own decision module for backhand. There's even a mod_perl module to do so (although apparently it needs patching for the latest version of mod_backhand).

--
Matt
Re: Connection Pooling / TP Monitor
----- Original Message -----
From: "Matt Sergeant" [EMAIL PROTECTED]

To redirect incoming URLs that require database work to mod_perl 'heavy' servers? Just like a smarter and more dynamic mod_rewrite? Yes?

Yes basically, except it's not a redirect. mod_backhand can use keep-alives to ensure that it never has to recreate a new connection to the heavy backend servers, unlike mod_rewrite or mod_proxy. And it can do it in a smart way so that remote connections don't use keepalives (because they are evil for mod_perl servers - see the mod_perl guide), but backhand connections do. Very very cool technology.

Is there any way to tie proxy requests mapped by mod_rewrite to a balanced set of servers through mod_backhand (or anything similar)? Also, can mod_backhand (or any alternative) work with non-apache back end servers? I'm really looking for a way to let mod_rewrite do the first cut at deciding where (or whether) to send a request, but then be able to send to a load balanced, fail over set, preferably without having to interpose another physical proxy.

Les Mikesell [EMAIL PROTECTED]
Re: Connection Pooling / TP Monitor
"Tim" == Tim Bunce [EMAIL PROTECTED] writes: Tim You could have a set of apache servers that are 'pure' DBI proxy Tim servers. That is, they POST requests containing SQL (for Tim prepare_cached) plus bind parameter values and return responses Tim containing the results. Tim Basically I'm proposing that apache be used as an alternative Tim framework for DBI::ProxyServer. Almost all the marshaling code Tim and higher level logic is already in DBI::ProxyServer and Tim DBD::Proxy. Shouldn't be too hard to do and you'd gain in all Tim sorts of ways. You could also use SOAP or SOAP::Lite as the interface. Most of that code seems ready for this kind of application already. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 [EMAIL PROTECTED] URL:http://www.stonehenge.com/merlyn/ Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
Re: Connection Pooling / TP Monitor
On 27 Oct 2000, (Randal L. Schwartz) wrote: "Tim" == Tim Bunce [EMAIL PROTECTED] writes: Tim You could have a set of apache servers that are 'pure' DBI proxy Tim servers. That is, they POST requests containing SQL (for Tim prepare_cached) plus bind parameter values and return responses Tim containing the results. Tim Basically I'm proposing that apache be used as an alternative Tim framework for DBI::ProxyServer. Almost all the marshaling code Tim and higher level logic is already in DBI::ProxyServer and Tim DBD::Proxy. Shouldn't be too hard to do and you'd gain in all Tim sorts of ways. You could also use SOAP or SOAP::Lite as the interface. Most of that code seems ready for this kind of application already. There are some issues still with this architecture, the primary one is that SOAP is too heavy weight for anything that seriously needs connection pooling for speed issues, especially in Perl (due to the XML parsing speed issues). -- Matt/ /||** Director and CTO ** //||** AxKit.com Ltd ** ** XML Application Serving ** // ||** http://axkit.org ** ** XSLT, XPathScript, XSP ** // \\| // ** Personal Web Site: http://sergeant.org/ ** \\// //\\ // \\
Re: Connection Pooling / TP Monitor
The only way I really see this working is in a threading environment. First of all, for some databases database connections don't survive forking (Oracle is the notable example here). Also, even if we could get forking to work, we would still get the scaling problem we are trying to avoid. Instead of Oracle keeping a huge list of persistent connections, the Connection Pool would keep a huge list of persistent connections. In both cases each connection would map to a Unix process and all these processes would chew up OS resources big time!

----- Original Message -----
From: "Matt Sergeant" [EMAIL PROTECTED]
To: "Tim Bunce" [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; "Jeff Horn" [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Friday, October 27, 2000 7:02 AM
Subject: Re: Connection Pooling / TP Monitor

On Fri, 27 Oct 2000, Tim Bunce wrote:

Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using CORBA::ORBit?

Maybe. I dunno. I don't actually need this stuff, I just want there to be a solution out there for those that do. I'm waving my hands around and pointing in various directions hoping someone will _do_ something!

Hehe... OK, let's think about exactly what is needed here then. I figure Doug's Apache::DBIPool module (for mod_perl 2.0) is exactly the right architecture:

- 2 pools of connections (Busy and Waiting)
- New connections always taken from the head of Waiting
- Finished connections always replaced on the head of Waiting
- Threaded architecture (DBD::Oracle handles don't survive a fork)
- One thread for management
- One thread per connection once a handle has been supplied
- Some sort of timeout mechanism for connections if the pool is fully allocated

Anything I've missed?

If we don't go the threaded route, we can't easily expand and contract the connection pool I don't think - but I'd love to be proved wrong. Also an entire Apache server for the connection pool would be too much - the pre-forking server from the cookbook would be better.
And it should even work on Win32 now...

--
Matt
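The Busy/Waiting design listed in the quoted message can be sketched as a small thread-safe pool in Python. This is an illustration of the listed requirements only, not Apache::DBIPool itself: the class ConnectionPool, its method names and the connect factory are hypothetical, and the management thread and per-connection threads from the list are not modelled.

```python
import threading
from collections import deque


class ConnectionPool:
    """Two lists, Waiting and Busy, with a timeout when fully allocated."""

    def __init__(self, connect, max_size):
        self._connect = connect        # factory that opens a new connection
        self._max_size = max_size
        self._waiting = deque()        # idle connections, head = most recent
        self._busy = {}                # id(conn) -> conn for handed-out handles
        self._cond = threading.Condition()

    def _available(self):
        return bool(self._waiting) or len(self._busy) < self._max_size

    def acquire(self, timeout=None):
        with self._cond:
            if not self._cond.wait_for(self._available, timeout):
                raise TimeoutError("connection pool is fully allocated")
            if self._waiting:
                conn = self._waiting.popleft()   # take from the head of Waiting
            else:
                # Simplification: connecting while holding the lock blocks
                # other callers for the duration of the connect.
                conn = self._connect()
            self._busy[id(conn)] = conn
            return conn

    def release(self, conn):
        with self._cond:
            del self._busy[id(conn)]
            self._waiting.appendleft(conn)       # put back on the head of Waiting
            self._cond.notify()


pool = ConnectionPool(connect=object, max_size=2)   # object() stands in for a handle
first = pool.acquire()
second = pool.acquire()
try:
    pool.acquire(timeout=0.01)
except TimeoutError as exc:
    print("timed out:", exc)
pool.release(first)
pool.release(second)
print(pool.acquire() is second)   # True: the most recently released handle is reused
```

Taking from and returning to the head of Waiting means the most recently used connections are reused first, so connections at the tail stay idle and become natural candidates for closing when the pool contracts.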
Re: Connection Pooling / TP Monitor
On Fri, 27 Oct 2000, Jeff Horn wrote:

The only way I really see this working is in a threading environment. First of all, for some databases database connections don't survive forking (Oracle is the notable example here). Also, even if we could get forking to work, we would still get the scaling problem we are trying to avoid. Instead of Oracle keeping a huge list of persistent connections, the Connection Pool would keep a huge list of persistent connections. In both cases each connection would map to a Unix process and all these processes would chew up OS resources big time!

I don't think that's a 100% fair analysis. First off, forking doesn't use any more RAM than threading until you start copying data. And I don't see that much of a scaling issue - we'd have a pool of connections available, each as a child. Since you could control the number of connections produced you wouldn't have to worry about the scalability issues too much - the problem with things like mod_perl is that there is one connection per child, regardless of what that child is doing. A pool would allow you to make that a lot cleaner, and less resource intensive. At least I think so...

Unfortunately I don't have a need for this right now, so I'm not willing to put the hacking tuits into it. Sorry :-(

--
Matt
Re: Connection Pooling / TP Monitor
On Fri, 27 Oct 2000, Tim Bunce wrote:

> On Thu, Oct 26, 2000 at 08:47:20PM +0100, Matt Sergeant wrote:
> > On Tue, 24 Oct 2000, Jeff Horn wrote:
> > > However, I am also aware of a _major_ ISP that implements their email system using a _major_ RDBMS that has had problems that are best solved via connection pooling. Essentially, the time it takes them to search through all the cached connections is nearly as long as the time it is taking to read/write to the database. Although I'm not implementing email as this ISP is, I think that scalability in my case may definitely run into similar roadblocks.
> > >
> > > I am interested in hearing from anyone that has tried to implement true connection pooling either within Apache or as an external process. I'm particularly interested in hearing about implementations that could be made to work or are done using Perl and DBI/DBD. I am mostly interested in things that are Open Source or licensed like Perl itself.
> >
> > Having just returned from ApacheCon, I can honestly recommend looking at mod_backhand to simply have a few servers that run the DBI pool, and have database intensive requests go to those servers. It is a *very* cool solution to just these sorts of scalability problems.
>
> To redirect incoming URLs that require database work to mod_perl 'heavy' servers? Just like a smarter and more dynamic mod_rewrite? Yes?

Yes basically, except it's not a redirect. mod_backhand can use keep-alives to ensure that it never has to recreate a connection to the heavy backend servers, unlike mod_rewrite or mod_proxy. And it can do it in a smart way so that remote connections don't use keepalives (because they are evil for mod_perl servers - see the mod_perl guide), but backhand connections do. Very very cool technology.

> Or, here's an odd thought that just crossed my mind...
>
> You could have a set of apache servers that are 'pure' DBI proxy servers. That is, they POST requests containing SQL (for prepare_cached) plus bind parameter values and return responses containing the results.
>
> Basically I'm proposing that apache be used as an alternative framework for DBI::ProxyServer. Almost all the marshaling code and higher level logic is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do and you'd gain in all sorts of ways.
>
> Anyone fancy having a go? Let me know so we can discuss it in more detail.

Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using CORBA::ORBit?

--
Matt
Director and CTO, AxKit.com Ltd
XML Application Serving (XSLT, XPathScript, XSP): http://axkit.org
Personal Web Site: http://sergeant.org/
Re: Connection Pooling / TP Monitor
On Fri, Oct 27, 2000 at 12:26:44PM +0100, Matt Sergeant wrote:

> > Or, here's an odd thought that just crossed my mind...
> >
> > You could have a set of apache servers that are 'pure' DBI proxy servers. That is, they POST requests containing SQL (for prepare_cached) plus bind parameter values and return responses containing the results.
> >
> > Basically I'm proposing that apache be used as an alternative framework for DBI::ProxyServer. Almost all the marshaling code and higher level logic is already in DBI::ProxyServer and DBD::Proxy. Shouldn't be too hard to do and you'd gain in all sorts of ways.
> >
> > Anyone fancy having a go? Let me know so we can discuss it in more detail.
>
> Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using CORBA::ORBit?

Maybe. I dunno. I don't actually need this stuff, I just want there to be a solution out there for those that do. I'm waving my hands around and pointing in various directions hoping someone will _do_ something! :-)

Tim.
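For readers trying to picture the proxy idea quoted above (POST the SQL plus bind values, get the results back), here is a minimal sketch of just the request/response marshaling. It is not how DBI::ProxyServer or DBD::Proxy encode things: the JSON wire format and the names `encode_request` and `handle_request` are assumptions made for illustration, Python's `sqlite3` stands in for the real database driver, and its per-connection statement cache stands in for prepare_cached. The HTTP layer itself is left out.

```python
import json
import sqlite3


def encode_request(sql, binds):
    """Client side: pack SQL text and bind values into a POST body.

    The JSON layout is a hypothetical wire format, not the one used by
    DBI::ProxyServer.
    """
    return json.dumps({"sql": sql, "binds": list(binds)}).encode("utf-8")


def handle_request(conn, body):
    """Server side: run the statement on a long-lived connection.

    `conn` is a connection the proxy server keeps open between requests.
    sqlite3 caches compiled statements per connection, so repeating the
    same SQL text reuses the prepared statement.
    """
    request = json.loads(body.decode("utf-8"))
    cursor = conn.execute(request["sql"], request["binds"])
    rows = cursor.fetchall()  # empty list for statements that return no rows
    return json.dumps({"rows": rows}).encode("utf-8")
```

Because the SQL and the bind values travel separately, the server never has to interpolate values into the statement text, and identical SQL strings from different clients can share one prepared statement.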
Re: Connection Pooling / TP Monitor
At 03:41 PM 10/27/00 +0100, Matt Sergeant wrote:

> On 27 Oct 2000, (Randal L. Schwartz) wrote:
> > "Tim" == Tim Bunce [EMAIL PROTECTED] writes:
> >
> > Tim> You could have a set of apache servers that are 'pure' DBI proxy
> > Tim> servers. That is, they POST requests containing SQL (for
> > Tim> prepare_cached) plus bind parameter values and return responses
> > Tim> containing the results.
> >
> > Tim> Basically I'm proposing that apache be used as an alternative
> > Tim> framework for DBI::ProxyServer. Almost all the marshaling code
> > Tim> and higher level logic is already in DBI::ProxyServer and
> > Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all
> > Tim> sorts of ways.
> >
> > You could also use SOAP or SOAP::Lite as the interface. Most of that code seems ready for this kind of application already.
>
> There are some issues still with this architecture; the primary one is that SOAP is too heavyweight for anything that seriously needs connection pooling for speed issues, especially in Perl (due to the XML parsing speed issues).

What we did for our SOAP objects is that we don't use XML parsing exactly. The reality is that SOAP parsing with a generic object library is heavyweight. But if you are supporting only 7-8 method calls, it's really not bad to write regexes that can do all the appropriate parsing with very little code.

Our SOAP drivers have well-defined interfaces and are extremely fast, using IO::Socket and regex-based SOAP utility methods (to construct and strip things like envelope headers).

You might argue that there are more things to go wrong this way, and you would be right. However, method calls are usually quite well defined and always have the same basic parameter definitions. So as long as you stick to the well-defined interfaces, it's not bad in practice.
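To illustrate the regex approach described above, here is a small sketch of building and stripping a SOAP envelope for a fixed set of calls without a general XML parser. It is not the poster's code: the method table `KNOWN_METHODS`, the `urn:example` namespace and the function names are hypothetical, and the sketch only works because both ends agree on one rigid message layout.

```python
import re
from xml.sax.saxutils import escape, unescape

# Hypothetical fixed interface: method name -> ordered parameter names.
KNOWN_METHODS = {"execute": ("sql",), "ping": ()}

ENVELOPE = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    "<SOAP-ENV:Body>{body}</SOAP-ENV:Body></SOAP-ENV:Envelope>"
)

# Matches the single call element inside the Body and captures its name.
BODY_RE = re.compile(
    r"<SOAP-ENV:Body>\s*<m:(\w+)[^>]*>(.*?)</m:\1>\s*</SOAP-ENV:Body>", re.S
)


def build_call(method, **params):
    """Construct the envelope for one of the known methods."""
    inner = "".join(
        "<{0}>{1}</{0}>".format(name, escape(str(params[name])))
        for name in KNOWN_METHODS[method]
    )
    body = '<m:{0} xmlns:m="urn:example">{1}</m:{0}>'.format(method, inner)
    return ENVELOPE.format(body=body)


def parse_call(xml):
    """Strip the envelope and pull out the method name and its parameters."""
    match = BODY_RE.search(xml)
    if match is None or match.group(1) not in KNOWN_METHODS:
        raise ValueError("unsupported or malformed call")
    method, inner = match.group(1), match.group(2)
    params = {}
    for name in KNOWN_METHODS[method]:
        found = re.search(r"<{0}>(.*?)</{0}>".format(re.escape(name)), inner, re.S)
        if found is None:
            raise ValueError("missing parameter: " + name)
        params[name] = unescape(found.group(1))
    return method, params
```

The trade-off is the one the message concedes: anything outside the agreed layout (different namespace prefixes, attributes on parameters, nested values) is rejected or misread, so this only suits a small, well-defined set of calls.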
Re: Connection Pooling / TP Monitor
On Fri, 27 Oct 2000, Tim Bunce wrote:

> > Sounds like just a CORBA/RPC type thing. Wouldn't you be better off using CORBA::ORBit?
>
> Maybe. I dunno. I don't actually need this stuff, I just want there to be a solution out there for those that do. I'm waving my hands around and pointing in various directions hoping someone will _do_ something!

Hehe... OK, let's think about exactly what is needed here then. I figure Doug's Apache::DBIPool module (for mod_perl 2.0) is exactly the right architecture:

- 2 pools of connections (Busy and Waiting)
- New connections always taken from the head of Waiting
- Finished connections always replaced on the head of Waiting
- Threaded architecture (DBD::Oracle handles don't survive a fork)
- One thread for management
- One thread per connection once a handle has been supplied
- Some sort of timeout mechanism for connections if the pool is fully allocated

Anything I've missed?

If we don't go the threaded route, I don't think we can easily expand and contract the connection pool - but I'd love to be proved wrong. Also, an entire Apache server for the connection pool would be too much - the pre-forking server from the cookbook would be better. And it should even work on Win32 now...

--
Matt
Director and CTO, AxKit.com Ltd
XML Application Serving (XSLT, XPathScript, XSP): http://axkit.org
Personal Web Site: http://sergeant.org/
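The pool layout listed in the message above can be sketched roughly as follows. This is an illustration of the Busy/Waiting idea only, not Apache::DBIPool: the class and method names are made up, `connect` is a hypothetical factory standing in for a DBI connect call, and the separate management thread and per-connection threads from the list are not modelled.

```python
import threading
import time
from collections import deque


class PoolTimeout(Exception):
    """Raised when the pool is fully allocated and nothing frees up in time."""


class ConnectionPool:
    def __init__(self, connect, max_size):
        self._connect = connect      # hypothetical factory returning a new handle
        self._max_size = max_size    # upper bound on open connections
        self._waiting = deque()      # idle handles; the left end is the head
        self._busy = set()           # handles currently handed out
        self._cond = threading.Condition()

    def acquire(self, timeout=None):
        """Hand out a connection, waiting up to `timeout` seconds if full."""
        deadline = None if timeout is None else time.monotonic() + timeout
        with self._cond:
            while True:
                if self._waiting:
                    conn = self._waiting.popleft()   # take from head of Waiting
                    break
                if len(self._busy) < self._max_size:
                    conn = self._connect()           # pool may still grow
                    break
                remaining = None
                if deadline is not None:
                    remaining = deadline - time.monotonic()
                    if remaining <= 0:
                        raise PoolTimeout("pool fully allocated")
                self._cond.wait(remaining)
            self._busy.add(conn)
            return conn

    def release(self, conn):
        """Return a finished connection to the head of Waiting."""
        with self._cond:
            self._busy.remove(conn)
            self._waiting.appendleft(conn)
            self._cond.notify()
```

Putting finished connections back on the head means the most recently used handle is reused first, so under light load the handles at the tail stay idle and are the natural candidates for a management thread to close when contracting the pool.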
Re: Connection Pooling / TP Monitor
I would second that. We've done this using SOAP. We have a DataSource::SOAP driver that acts as a lightweight interface to a Jakarta Tomcat server for the DB stuff. We get the benefits of Perl on the front end and Java DB connection pooling logic/proxying on the middle tier.

Of course I guess you could do the SOAP server in Perl too, but Java was a bit easier because we also get built-in shared memory caching for frequently issued queries with the way our particular interfaces work. Anyway, with SOAP it doesn't matter what language you use for what -- in theory.

Later,
Gunther

At 07:38 AM 10/27/00 -0700, Randal L. Schwartz wrote:

> "Tim" == Tim Bunce [EMAIL PROTECTED] writes:
>
> Tim> You could have a set of apache servers that are 'pure' DBI proxy
> Tim> servers. That is, they POST requests containing SQL (for
> Tim> prepare_cached) plus bind parameter values and return responses
> Tim> containing the results.
>
> Tim> Basically I'm proposing that apache be used as an alternative
> Tim> framework for DBI::ProxyServer. Almost all the marshaling code
> Tim> and higher level logic is already in DBI::ProxyServer and
> Tim> DBD::Proxy. Shouldn't be too hard to do and you'd gain in all
> Tim> sorts of ways.
>
> You could also use SOAP or SOAP::Lite as the interface. Most of that code seems ready for this kind of application already.
>
> --
> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
> [EMAIL PROTECTED] URL:http://www.stonehenge.com/merlyn/
> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
> See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

__
Gunther Birznieks ([EMAIL PROTECTED])
eXtropia - The Web Technology Company
http://www.extropia.com/
Re: Connection Pooling / TP Monitor
On Tue, 24 Oct 2000, Jeff Horn wrote:

> However, I am also aware of a _major_ ISP that implements their email system using a _major_ RDBMS that has had problems that are best solved via connection pooling. Essentially, the time it takes them to search through all the cached connections is nearly as long as the time it is taking to read/write to the database. Although I'm not implementing email as this ISP is, I think that scalability in my case may definitely run into similar roadblocks.
>
> I am interested in hearing from anyone that has tried to implement true connection pooling either within Apache or as an external process. I'm particularly interested in hearing about implementations that could be made to work or are done using Perl and DBI/DBD. I am mostly interested in things that are Open Source or licensed like Perl itself.

Having just returned from ApacheCon, I can honestly recommend looking at mod_backhand to simply have a few servers that run the DBI pool, and have database intensive requests go to those servers. It is a *very* cool solution to just these sorts of scalability problems.

PS: I'll have an ApacheCon report "coming soon".

--
Matt
Director and CTO, AxKit.com Ltd
XML Application Serving (XSLT, XPathScript, XSP): http://axkit.org
Personal Web Site: http://sergeant.org/
Connection Pooling / TP Monitor
First let me say that I'm aware that this topic comes up with some frequency on the mod_perl and DBI-users lists. I am aware of posts like this one:

http:[EMAIL PROTECTED]

which argue against the necessity of pooling.

However, I am also aware of a _major_ ISP that implements their email system using a _major_ RDBMS that has had problems that are best solved via connection pooling. Essentially, the time it takes them to search through all the cached connections is nearly as long as the time it is taking to read/write to the database. Although I'm not implementing email as this ISP is, I think that scalability in my case may definitely run into similar roadblocks.

I am interested in hearing from anyone that has tried to implement true connection pooling either within Apache or as an external process. I'm particularly interested in hearing about implementations that could be made to work or are done using Perl and DBI/DBD. I am mostly interested in things that are Open Source or licensed like Perl itself.

I am aware of a project called GNU Transaction Server (GTS), but it doesn't seem like this is quite ready for prime time at the moment or is even under active development. I've seen posts that hint at using shared memory and IPC to implement this within Apache, as well as posts that hint at possibilities of implementing this using DBI::Proxy.

I basically want to do what the big TP monitors (Tuxedo/Encina/CICS) do with respect to condensing connections to a database, but I'm not in need of the features like two-phase commit, cross database joins, heterogeneous database environments, etc. incorporated in these products.

Even if you'd simply be interested in working on such a project, I'd like to hear from you. If you think such a project is plain stupid, I'd also be interested in hearing from you (but be gentle!). If you already have something sort of working along these lines, I'd DEFINITELY be interested in hearing from you!

-- Jeff Horn
Re: Connection Pooling / TP Monitor
On Tue, Oct 24, 2000 at 03:09:47PM -0500, Jeff Horn wrote:

> I basically want to do what the big TP monitors (Tuxedo/Encina/CICS) do with respect to condensing connections to a database, but I'm not in need of features like two-phase commit, cross database joins, heterogeneous database environment, etc. incorporated in these products.

I think there's lots more mileage to be had from developing DBI::ProxyServer further.

Tim.

p.s. Jeff, please keep me CC'd on any dialogue. Thanks.