Re: [RFC] Do Not Run Everything on One mod_perl Server
> > Actually in my experience the sharing of memory doesn't work as well
> > as one would hope. While compiling perl allocates memory for code
> > and data (variables) from the same memory pools, so code and
> > variables are interlaced. Over the lifetime of a apache/mod_perl
> > child a lot of memory pages are promoted from shared to unshared that
> > contain mostly code and one or two variables... If someone with more
> > knowledge of the perl internals were to change that, this would make
> > a huge difference for mod_perl users and everybody else writing
> > daemons in perl that spawn many children.

Well... the structs for the "Code Values" are mixed with both code and
variables in the case of lexicals. However, during runtime these are
not altered. There are structs within the main code value struct that
will be altered at run time, namely the recursive lexical array.
(That's what I call it, but I'm sure Malcolm uses a more correct
word :->) However, the actual code (a series of highly optimized
opcodes... instruction-set workalike stuff that takes a few clock
cycles each, depending on the op and the architecture) is not inline
in memory with the structs for the recursive lexical array.

What I'm saying is that if you include all your code at the very
beginning, the design of perl will not alter that code, so it should
be able to stay fixed and shared. Basically it just holds an opcode
pointer to whichever opcode it's working at within the CV. The
recursive lexical array itself just has a pointer within the "code
value" struct to itself. So basically that main code struct should
never need to be realloc'd, so it's fairly unlikely that it would need
to be non-shared.

However, maybe someone who understands how something is "unshared"
within the kernel could be quite helpful. If you were to change where
something was pointing within a struct, would that cause it to be
unshared? I think that it's fairly unlikely, but I suppose it's
possible. If that's the case then it's quite likely that code pieces
could become unshared, I suppose. However, the main hunk of actual
function opcodes would remain fixed; only the execution pointer (where
it's pointing within the present program) would change.

So, finally (!), the code should always be shared. However, if you
change the file and it checks the date on it and reloads it, obviously
it won't be shared :-).

> The Perl is a language that uses weak data types, i.e. you don't specify
> variable size (type) like you do in the strong typed languages like C.
> Therefore Perl uses heap memory by allocating memory on demand, rather
> (unmodifiable) text and (modifiable) data memory pages, used by C. The
> latter get allocated when the program is loaded into memory before it
> starts to run, the former is allocated at the run-time.

Yes, that's true. There is some compile-time stuff where it organizes
the variable names within the lexical array, but I'm not sure whether
it actually reserves space for those things at that time. I'm really
not sure about that item.

> On heap there is no separation for data and text pages, when you call
> malloc it just allocates you a chunk you have asked for, this leads to the
> case where the code (static in most cases) and the variables (dynamic)
> land on the same memory pages.

This is a weak area in my knowledge. I'm not certain how the kernel
actually marks segments as shared or not... so I'll refrain from
commenting.

> I'm not sure this a very good explanation. Perl gurus are very welcome to
> correct/improve my attempt to explain this.

I've tried to explain what I can. The best book for this is "Advanced
Perl Programming", published by O'Reilly (of course). There is a
chapter in there written by Malcolm Beattie (well, pieces of the
chapter) that is pretty good, but I'm afraid it might not go into
enough depth on these exact issues. Not only that, but these issues
are also very kernel related; you have to understand how both pieces
fit together, and frankly I couldn't answer that, and I don't know a
person alive that could :-). (I'm sure there are some, but who?)

Thanks,
Shane.

>
> But the main point is that that's how Perl is written, and I don't know
> whether it can be changed.
>
> __
> Stas Bekman | JAm_pH--Just Another mod_perl Hacker
> http://stason.org/ | mod_perl Guide http://perl.apache.org/guide
> mailto:[EMAIL PROTECTED] | http://perl.org http://stason.org/TULARC/
> http://singlesheaven.com| http://perlmonth.com http://sourcegarden.org
> --
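On the kernel question raised above: with the usual copy-on-write fork() on Unix-like systems, sharing is tracked per memory page, so writing even a single pointer gives the writing process a private copy of the whole page that holds it. The sketch below is only a toy model of that behaviour in Python, not a measurement of perl itself. The 4096-byte page size and the `CowChild` class are assumptions invented for the illustration.

```python
PAGE_SIZE = 4096  # assumed page size; real systems vary


class CowChild:
    """Toy model of a forked child that starts out sharing its parent's pages."""

    def __init__(self, parent_pages):
        self.shared = parent_pages   # pages still shared with the parent
        self.private = {}            # page number -> this child's private copy

    def write(self, addr, data):
        """Write bytes at addr, copying each touched page on its first write."""
        for i, byte in enumerate(data):
            page, offset = divmod(addr + i, PAGE_SIZE)
            if page not in self.private:
                # First write to this page: the whole page becomes unshared.
                self.private[page] = bytearray(self.shared[page])
            self.private[page][offset] = byte

    def unshared_bytes(self):
        return len(self.private) * PAGE_SIZE


parent = [bytearray(PAGE_SIZE) for _ in range(4)]  # 16 KB "preloaded" in the parent
child = CowChild(parent)
child.write(10, b"\x01" * 8)                       # change one 8-byte pointer
print(child.unshared_bytes())                      # 4096
```

In this model an 8-byte update costs the child a full page of private memory, while the parent's copy stays untouched. That is why a pointer that changes at run time can unshare whatever else happens to live on the same page.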
Re: [RFC] Do Not Run Everything on One mod_perl Server
At 01:44 AM 4/20/00 +0300, Stas Bekman wrote:
>On Wed, 19 Apr 2000, Gerd Knops wrote:
>
> > Stas Bekman wrote:
> > > On Wed, 19 Apr 2000, Matt Carothers wrote:
> > > > On Tue, 18 Apr 2000, Stas Bekman wrote:
> > > >
> > > > > Let's assume that you have two different sets of scripts/code
> > > > > which have a little or nothing in common at all (different
> > > > > modules, no base code sharing), the basic mod_perl process
> > > > > before the code have been loaded of three Mbytes and each code
> > > > > base adds ten Mbytes when loaded. Which makes each process 23Mb
> > > > > in size when all the code gets loaded.
> > > >
> > > > Can't you share most of that 23mb between the processes by
> > > > pre-loading the scripts/modules in your startup.pl? I'd say the
> > > > main advantage of engineering dissimilar services as if they were
> > > > on separate servers is scalability rather than memory use. When a
> > > > site outgrows the hardware it's on, spreading it out to multiple
> > > > machines requires a lot less ankle grabbing if it was designed
> > > > that way to begin with. :)
> > >
> > > Geez, I always forget something :(
> > >
> > > You are right. I forgot to mention that this was a scenario for the
> > > 23 Mb of unshared memory. I just wanted to give an example. Still
> > > somehow I'm almost sure that there are servers where even with
> > > sharing in place, the hypothetical scenario I've presented is quite
> > > possible.
> > >
> > > Anyway, it's just another patent for squeezing some more juice from
> > > your hardware without upgrading it.
> > >
> > > But, sure I'll add the correction about the sharing memory which
> > > drastically changes the story :)
> > >
> > Actually in my experience the sharing of memory doesn't work as well
> > as one would hope. While compiling perl allocates memory for code
> > and data (variables) from the same memory pools, so code and
> > variables are interlaced. Over the lifetime of a apache/mod_perl
> > child a lot of memory pages are promoted from shared to unshared that
> > contain mostly code and one or two variables... If someone with more
> > knowledge of the perl internals were to change that, this would make
> > a huge difference for mod_perl users and everybody else writing
> > daemons in perl that spawn many children.
>
>The Perl is a language that uses weak data types, i.e. you don't specify
>variable size (type) like you do in the strong typed languages like C.
>Therefore Perl uses heap memory by allocating memory on demand, rather
>(unmodifiable) text and (modifiable) data memory pages, used by C. The
>latter get allocated when the program is loaded into memory before it
>starts to run, the former is allocated at the run-time.
>
>On heap there is no separation for data and text pages, when you call
>malloc it just allocates you a chunk you have asked for, this leads to the
>case where the code (static in most cases) and the variables (dynamic)
>land on the same memory pages.
>
>I'm not sure this a very good explanation. Perl gurus are very welcome to
>correct/improve my attempt to explain this.
>
>But the main point is that that's how Perl is written, and I don't know
>whether it can be changed.

That's how I understand it to be. But I could be wrong as well. :)

By the way, I think your section here is actually forward-thinking.
Although Matt and others have pointed out that this scenario may not
be the biggest memory saver in the world in some cases, there is
another consideration that I think should be mentioned: reliability
and troubleshooting.

Right now, I dare say that most of you are probably either doing
custom mod_perl coding or using an infrastructural tool such as
EmbPerl, ASP etc. In these scenarios, it is relatively easy to
troubleshoot your code because you either [a] wrote it all or [b] are
running code on top of a well-tested Apache::Mod_perl infrastructural
application toolkit.

This is, I suspect, because as far as real-world open source
applications are concerned, mod_perl is a bit behind. Thousands of
open source CGI scripts exist in Perl for plain CGI. Few (if any) are
in a repository to work with mod_perl off the bat.

However, I see this changing. Efforts such as mine, SmartWorker's etc.
will eventually lead to a proliferation of another layer of
infrastructural component which is above the tool level
(EmbPerl/Apache::Session/etc). In other words, a component level that
is at the application level -- plug and play calendars, BBSes, web
shopping carts, etc.

If people reach this point on mod_perl, you may find that some modules
are not written as well as others, and so subtle side effects may be
introduced when you start throwing everyone's code together in one
huge vat of mod_perl.

If this happens, I suspect it will be a lot easier to troubleshoot
problems that occur if you keep major application suites separate from
each other, e.g. don't run SmartWorker on the same server as EmbPerl
or the same ser
Re: [RFC] Do Not Run Everything on One mod_perl Server
My apache processes are typically 18MB-20MB in size, with all but 500K
to 1MB of that shared. We restart our servers in the middle of the
night as part of planned maintenance, of course, but even before we
did that, and even after weeks of uptime, the percentages did not
change.

We do not use Apache::Registry at all; everything is a pure handler.
We cache all data structures (lots of Storable things) in the parent
process by thawing refs to the data structures into package variables.
We use no globals, only a few package variables (4) that we access by
fully qualified package name, and they get reset on each request.

We use Apache::DBI and MySQL, and it works perfectly other than a few
segfaults that occur once in a while. Having all of the data
structures cached (and shared!) allows us to do some neat things
without having to rely solely on SQL.

On Thu, 20 Apr 2000, Stas Bekman wrote:
> On Wed, 19 Apr 2000, Joshua Chamas wrote:
>
> > Stas Bekman wrote:
> > >
> > > Geez, I always forget something :(
> > >
> > > You are right. I forgot to mention that this was a scenario for the 23 Mb
> > > of unshared memory. I just wanted to give an example. Still somehow I'm
> > > almost sure that there are servers where even with sharing in place, the
> > > hypothetical scenario I've presented is quite possible.
> > >
> > > Anyway, it's just another patent for squeezing some more juice from your
> > > hardware without upgrading it.
> > >
> >
> > Your scenario would be more believable with 5M unshared, even
> > after doing ones best to share everything. This is pretty typical
> > when connecting to databases, as the database connections cannot
> > be shared, and especially DB's like Oracle take lots of RAM
> > per connection.
>
> Good idea. 5MB sounds closer to the real case than 10Mb. I'll make the
> correction. Thanks!!!
>
> > I'm not sure that your scenario is worthwhile if someone does
> > a good job preloading / sharing code across the forks, and
> > the difference will really be how much of the code gets dirty
> > while you run things, which can be neatly tuned with MaxRequests.
>
> Agree. But not everybody knows to do that well. So the presented idea
> might still find a good use at some web shops.
>
> > Interesting & novel approach though. I would bet that if people
> > went down this path, they would really end up on different machines
> > per web application, or even different web clusters per application ;)
>
> :)
>
> __
> Stas Bekman | JAm_pH--Just Another mod_perl Hacker
> http://stason.org/ | mod_perl Guide http://perl.apache.org/guide
> mailto:[EMAIL PROTECTED] | http://perl.org http://stason.org/TULARC/
> http://singlesheaven.com| http://perlmonth.com http://sourcegarden.org
> --
Re: [RFC] Do Not Run Everything on One mod_perl Server
On Wed, 19 Apr 2000, Gerd Knops wrote:

> Stas Bekman wrote:
> > On Wed, 19 Apr 2000, Matt Carothers wrote:
> > > On Tue, 18 Apr 2000, Stas Bekman wrote:
> > >
> > > > Let's assume that you have two different sets of scripts/code
> > > > which have a little or nothing in common at all (different
> > > > modules, no base code sharing), the basic mod_perl process
> > > > before the code have been loaded of three Mbytes and each code
> > > > base adds ten Mbytes when loaded. Which makes each process 23Mb
> > > > in size when all the code gets loaded.
> > >
> > > Can't you share most of that 23mb between the processes by
> > > pre-loading the scripts/modules in your startup.pl? I'd say the
> > > main advantage of engineering dissimilar services as if they were
> > > on separate servers is scalability rather than memory use. When a
> > > site outgrows the hardware it's on, spreading it out to multiple
> > > machines requires a lot less ankle grabbing if it was designed
> > > that way to begin with. :)
> >
> > Geez, I always forget something :(
> >
> > You are right. I forgot to mention that this was a scenario for the
> > 23 Mb of unshared memory. I just wanted to give an example. Still
> > somehow I'm almost sure that there are servers where even with
> > sharing in place, the hypothetical scenario I've presented is quite
> > possible.
> >
> > Anyway, it's just another patent for squeezing some more juice from
> > your hardware without upgrading it.
> >
> > But, sure I'll add the correction about the sharing memory which
> > drastically changes the story :)
> >
> Actually in my experience the sharing of memory doesn't work as well
> as one would hope. While compiling perl allocates memory for code
> and data (variables) from the same memory pools, so code and
> variables are interlaced. Over the lifetime of a apache/mod_perl
> child a lot of memory pages are promoted from shared to unshared that
> contain mostly code and one or two variables... If someone with more
> knowledge of the perl internals were to change that, this would make
> a huge difference for mod_perl users and everybody else writing
> daemons in perl that spawn many children.

Perl is a language that uses weak data types, i.e. you don't specify
variable size (type) like you do in strongly typed languages like C.
Therefore Perl uses heap memory, allocating memory on demand, rather
than the (unmodifiable) text and (modifiable) data memory pages used
by C. The latter get allocated when the program is loaded into memory,
before it starts to run; the former is allocated at run time.

On the heap there is no separation between data and text pages: when
you call malloc it just allocates the chunk you have asked for. This
leads to the case where the code (static in most cases) and the
variables (dynamic) land on the same memory pages.

I'm not sure this is a very good explanation. Perl gurus are very
welcome to correct/improve my attempt to explain this.

But the main point is that that's how Perl is written, and I don't
know whether it can be changed.

__
Stas Bekman | JAm_pH--Just Another mod_perl Hacker
http://stason.org/ | mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] | http://perl.org http://stason.org/TULARC/
http://singlesheaven.com| http://perlmonth.com http://sourcegarden.org
--
Re: [RFC] Do Not Run Everything on One mod_perl Server
Stas Bekman wrote:
>
> Geez, I always forget something :(
>
> You are right. I forgot to mention that this was a scenario for the 23 Mb
> of unshared memory. I just wanted to give an example. Still somehow I'm
> almost sure that there are servers where even with sharing in place, the
> hypothetical scenario I've presented is quite possible.
>
> Anyway, it's just another patent for squeezing some more juice from your
> hardware without upgrading it.
>

Your scenario would be more believable with 5M unshared, even
after doing one's best to share everything. This is pretty typical
when connecting to databases, as the database connections cannot
be shared, and especially DBs like Oracle take lots of RAM
per connection.

I'm not sure that your scenario is worthwhile if someone does
a good job preloading / sharing code across the forks, and
the difference will really be how much of the code gets dirty
while you run things, which can be neatly tuned with MaxRequests.

Interesting & novel approach though. I would bet that if people
went down this path, they would really end up on different machines
per web application, or even different web clusters per application ;)

-- Joshua
_
Joshua Chamas                           Chamas Enterprises Inc.
NodeWorks >> free web link monitoring   Huntington Beach, CA USA
http://www.nodeworks.com                1-714-625-4051
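To make the tuning remark above concrete, here is a rough model of how a per-child request limit (presumably Apache's MaxRequestsPerChild directive) caps the unshared memory of a child. The 5 MB starting figure is the one suggested above. The linear growth, the 10 KB per request, and the function name are assumptions made up for this sketch, so treat the numbers as illustrative only.

```python
def peak_unshared_mb(start_mb, growth_kb_per_request, max_requests):
    """Unshared memory of a child just before it is recycled.

    Assumes, for illustration only, that each request dirties a fixed
    amount of previously shared memory.
    """
    return start_mb + growth_kb_per_request * max_requests / 1024


# 5 MB unshared at startup, 10 KB of newly dirtied memory per request (assumed)
for limit in (256, 1024, 4096):
    print(limit, peak_unshared_mb(5, 10, limit))
```

Under these assumptions a lower limit trades more frequent child restarts for a smaller worst-case footprint per child. Real growth is rarely linear, so the limit is better chosen from measurements of the running server.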
Re: [RFC] Do Not Run Everything on One mod_perl Server
On Wed, 19 Apr 2000, Joshua Chamas wrote:

> Stas Bekman wrote:
> >
> > Geez, I always forget something :(
> >
> > You are right. I forgot to mention that this was a scenario for the 23 Mb
> > of unshared memory. I just wanted to give an example. Still somehow I'm
> > almost sure that there are servers where even with sharing in place, the
> > hypothetical scenario I've presented is quite possible.
> >
> > Anyway, it's just another patent for squeezing some more juice from your
> > hardware without upgrading it.
> >
>
> Your scenario would be more believable with 5M unshared, even
> after doing ones best to share everything. This is pretty typical
> when connecting to databases, as the database connections cannot
> be shared, and especially DB's like Oracle take lots of RAM
> per connection.

Good idea. 5MB sounds closer to the real case than 10Mb. I'll make the
correction. Thanks!!!

> I'm not sure that your scenario is worthwhile if someone does
> a good job preloading / sharing code across the forks, and
> the difference will really be how much of the code gets dirty
> while you run things, which can be neatly tuned with MaxRequests.

Agree. But not everybody knows to do that well. So the presented idea
might still find a good use at some web shops.

> Interesting & novel approach though. I would bet that if people
> went down this path, they would really end up on different machines
> per web application, or even different web clusters per application ;)

:)

__
Stas Bekman | JAm_pH--Just Another mod_perl Hacker
http://stason.org/ | mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] | http://perl.org http://stason.org/TULARC/
http://singlesheaven.com| http://perlmonth.com http://sourcegarden.org
--
Re: [RFC] Do Not Run Everything on One mod_perl Server
Stas Bekman wrote:
> On Wed, 19 Apr 2000, Matt Carothers wrote:
> > On Tue, 18 Apr 2000, Stas Bekman wrote:
> >
> > > Let's assume that you have two different sets of scripts/code
> > > which have a little or nothing in common at all (different
> > > modules, no base code sharing), the basic mod_perl process
> > > before the code have been loaded of three Mbytes and each code
> > > base adds ten Mbytes when loaded. Which makes each process 23Mb
> > > in size when all the code gets loaded.
> >
> > Can't you share most of that 23mb between the processes by
> > pre-loading the scripts/modules in your startup.pl? I'd say the
> > main advantage of engineering dissimilar services as if they were
> > on separate servers is scalability rather than memory use. When a
> > site outgrows the hardware it's on, spreading it out to multiple
> > machines requires a lot less ankle grabbing if it was designed
> > that way to begin with. :)
>
> Geez, I always forget something :(
>
> You are right. I forgot to mention that this was a scenario for the
> 23 Mb of unshared memory. I just wanted to give an example. Still
> somehow I'm almost sure that there are servers where even with
> sharing in place, the hypothetical scenario I've presented is quite
> possible.
>
> Anyway, it's just another patent for squeezing some more juice from
> your hardware without upgrading it.
>
> But, sure I'll add the correction about the sharing memory which
> drastically changes the story :)
>
Actually, in my experience the sharing of memory doesn't work as well
as one would hope. While compiling, perl allocates memory for code
and data (variables) from the same memory pools, so code and
variables are interlaced. Over the lifetime of an apache/mod_perl
child, a lot of memory pages that contain mostly code and one or two
variables are promoted from shared to unshared... If someone with more
knowledge of the perl internals were to change that, this would make
a huge difference for mod_perl users and everybody else writing
daemons in perl that spawn many children.

Gerd
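The interlacing effect described above can be illustrated with a small page-counting model. This is a sketch under stated assumptions, not a description of perl's real allocator: the 4096-byte page size, the allocation sizes, and the `pages_with_variables` helper are all invented for the example. It counts how many pages hold at least one variable, i.e. how many pages could end up unshared once every variable has been written.

```python
PAGE_SIZE = 4096  # assumed page size


def pages_with_variables(layout, page_size=PAGE_SIZE):
    """Count the pages that hold at least one byte of variable data.

    layout is a list of (kind, size) pairs in allocation order, where
    kind is "code" (never written) or "var" (written at run time).
    """
    touched = set()
    offset = 0
    for kind, size in layout:
        if kind == "var" and size > 0:
            first = offset // page_size
            last = (offset + size - 1) // page_size
            touched.update(range(first, last + 1))
        offset += size
    return len(touched)


# 100 chunks of code (4000 bytes each) and 100 small variables (96 bytes each)
interleaved = [("code", 4000), ("var", 96)] * 100
segregated = [("code", 4000)] * 100 + [("var", 96)] * 100

print(pages_with_variables(interleaved))  # 100
print(pages_with_variables(segregated))   # 3
```

Both layouts occupy the same 100 pages. In the interleaved one every page carries a variable, so all 100 can become unshared, while keeping the variables together confines the writes to 3 pages.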
Re: [RFC] Do Not Run Everything on One mod_perl Server
On Wed, 19 Apr 2000, Matt Carothers wrote:

> On Tue, 18 Apr 2000, Stas Bekman wrote:
>
> > Let's assume that you have two different sets of scripts/code which
> > have a little or nothing in common at all (different modules, no base
> > code sharing), the basic mod_perl process before the code have been
> > loaded of three Mbytes and each code base adds ten Mbytes when
> > loaded. Which makes each process 23Mb in size when all the code gets
> > loaded.
>
> Can't you share most of that 23mb between the processes by pre-loading
> the scripts/modules in your startup.pl? I'd say the main advantage of
> engineering dissimilar services as if they were on separate servers is
> scalability rather than memory use. When a site outgrows the hardware
> it's on, spreading it out to multiple machines requires a lot less ankle
> grabbing if it was designed that way to begin with. :)

Geez, I always forget something :(

You are right. I forgot to mention that this was a scenario for the
23 Mb of unshared memory. I just wanted to give an example. Still
somehow I'm almost sure that there are servers where even with sharing
in place, the hypothetical scenario I've presented is quite possible.

Anyway, it's just another patent for squeezing some more juice from
your hardware without upgrading it.

But, sure I'll add the correction about the sharing memory which
drastically changes the story :)

Thanks, Matt!

__
Stas Bekman | JAm_pH--Just Another mod_perl Hacker
http://stason.org/ | mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] | http://perl.org http://stason.org/TULARC/
http://singlesheaven.com| http://perlmonth.com http://sourcegarden.org
--
Re: [RFC] Do Not Run Everything on One mod_perl Server
On Tue, 18 Apr 2000, Stas Bekman wrote:

> Let's assume that you have two different sets of scripts/code which
> have a little or nothing in common at all (different modules, no base
> code sharing), the basic mod_perl process before the code have been
> loaded of three Mbytes and each code base adds ten Mbytes when
> loaded. Which makes each process 23Mb in size when all the code gets
> loaded.

Can't you share most of that 23mb between the processes by pre-loading
the scripts/modules in your startup.pl? I'd say the main advantage of
engineering dissimilar services as if they were on separate servers is
scalability rather than memory use. When a site outgrows the hardware
it's on, spreading it out to multiple machines requires a lot less
ankle grabbing if it was designed that way to begin with. :)

- Matt
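To put numbers on the scenario and the pre-loading objection above, here is a back-of-the-envelope calculation. The 3 MB base and 10 MB per code set come from the quoted scenario. The count of 10 children, the even split across two servers, and the 18 MB shared figure are assumptions chosen only for illustration, as is the helper name.

```python
def total_memory_mb(children, process_mb, shared_mb=0):
    """RAM used by `children` processes when `shared_mb` of each one is shared."""
    return shared_mb + children * (process_mb - shared_mb)


BASE_MB, CODE_SET_MB = 3, 10   # figures from the scenario above
CHILDREN = 10                  # assumed number of child processes

# One server, both code sets loaded in every child, nothing shared.
one_server = total_memory_mb(CHILDREN, BASE_MB + 2 * CODE_SET_MB)

# Two servers, half the children each, one code set per server, nothing shared.
two_servers = 2 * total_memory_mb(CHILDREN // 2, BASE_MB + CODE_SET_MB)

# One server with the code preloaded; assume 18 of the 23 MB stay shared.
preloaded = total_memory_mb(CHILDREN, BASE_MB + 2 * CODE_SET_MB, shared_mb=18)

print(one_server, two_servers, preloaded)  # 230 130 68
```

With these assumed figures, splitting the code sets helps when nothing is shared, but pre-loading helps more, so the comparison depends almost entirely on how much memory actually stays shared.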
Re: [RFC] Do Not Run Everything on One mod_perl Server
> "ELB" == Eric L Brine <[EMAIL PROTECTED]> writes:

ELB> It used to be one process for everything, or at least one application for
ELB> everything. Then mod_perl comes in and people have started using a tiered
ELB> system (plain server + mod_perl server). Now you're talking about
ELB> individual application servers. Someday, maybe the script will load the
ELB> server instead of the other way around!

I think the word "FastCGI" comes to mind... ;-)
Re: [RFC] Do Not Run Everything on One mod_perl Server
It used to be one process for everything, or at least one application
for everything. Then mod_perl came in and people started using a
tiered system (plain server + mod_perl server). Now you're talking
about individual application servers. Someday, maybe the script will
load the server instead of the other way around!

  use mod_perl (8080, 3);   # (port, #processes)

Sorry, feeling philosophical this morning. Thanks for the report.

ELB
--
Eric L. Brine | Chicken: The egg's way of making more eggs.
[EMAIL PROTECTED] | Do you always hit the nail on the thumb?
ICQ# 4629314 | An optimist thinks thorn bushes have roses.