RE: Clustered Maven?
This approach would work and would provide failover. It wouldn't handle the load balancing, but it would be very fast to roll out. If we used only one or two build machines, I think I'd take this approach. Round robin would be good enough to keep both machines busy.

From: Rick Mangi [mailto:[EMAIL PROTECTED]
Sent: Wed 2/23/2005 5:07 PM
To: Maven Users List
Subject: Re: Clustered Maven?

Jared,

Didn't think you were being dismissive at all.

> Our existing solution has a cluster of build machines that provide
> very nice failover, so the feature is "expected". Suggesting we look
> at a build system with a dozen boxes, with each one being a point of
> failure, wouldn't go over well. Given cascading build failure issues,
> the wrong box dying could take out (literally) hundreds of builds.

I would approach this the same way I approach a web server farm. Primary/Secondary. The odds of a build machine blowing up are pretty low. Just assign each build a secondary failover and if you can't ping the machine, send the job to the secondary machine. A more robust environment would run some sanity check on the box before assigning the build task.

Rick

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: Clustered Maven?
Jared,

Didn't think you were being dismissive at all.

> Our existing solution has a cluster of build machines that provide
> very nice failover, so the feature is "expected". Suggesting we look
> at a build system with a dozen boxes, with each one being a point of
> failure, wouldn't go over well. Given cascading build failure issues,
> the wrong box dying could take out (literally) hundreds of builds.

I would approach this the same way I approach a web server farm. Primary/Secondary. The odds of a build machine blowing up are pretty low. Just assign each build a secondary failover and if you can't ping the machine, send the job to the secondary machine. A more robust environment would run some sanity check on the box before assigning the build task.

Rick
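A minimal sketch of the primary/secondary idea above, assuming a hypothetical assignment table and a caller-supplied health check (a ping, or the more thorough sanity check Rick mentions). The names `ASSIGNMENTS`, `pick_machine` and `route` are illustrative only and do not come from Maven or CruiseControl.

```python
from typing import Callable, Optional

# Hypothetical assignment table: build name -> (primary host, secondary host).
ASSIGNMENTS = {
    "core": ("build01", "build02"),
    "installer": ("build02", "build01"),
}


def pick_machine(primary: str, secondary: str,
                 is_alive: Callable[[str], bool]) -> Optional[str]:
    """Return the primary if it passes the check, else the secondary, else None."""
    for host in (primary, secondary):
        if is_alive(host):
            return host
    return None


def route(build: str, is_alive: Callable[[str], bool]) -> Optional[str]:
    """Look up the build's two machines and choose the first healthy one."""
    primary, secondary = ASSIGNMENTS[build]
    return pick_machine(primary, secondary, is_alive)
```

Passing the health check in as a function keeps the routing logic the same whether the check is a simple ping or something that actually exercises the build environment on the box.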
RE: Clustered Maven?
> True, but does it matter that much for a build? I mean the build is taking
> source and producing something that can be scratched and rebuilt any time.
> Even if a machine fails you haven't lost anything.

True, but... :)

If a machine fails at 2 am, you haven't lost code but you've lost (potentially) a weekly build that a small army of developers and testers are planning to use at 7am. Is getting the weekly build (or the nightly build) out the door reliably worth having enough people on call with enough expertise to replace the box, replicate the environment and get the builds running again? I'd rather address the problem in software (if possible).

For me, it's an issue of robustness. Does it work every time? If you are depending on a commodity PC hard drive or power supply, the answer is no. Then you have to face the issue of whether or not your customers (in this case, the development community) are losing faith in your ability to deliver product... But I digress.

Our existing solution has a cluster of build machines that provide very nice failover, so the feature is "expected". Suggesting we look at a build system with a dozen boxes, with each one being a point of failure, wouldn't go over well. Given cascading build failure issues, the wrong box dying could take out (literally) hundreds of builds.

> > To give you a little more background, I'm thinking about several hundred
> > developers, 300 projects (plus installers) and about 5 million lines of
> > code. (SAS is big).
> > http://www.pragmaticautomation.com/cgi-bin/pragauto.cgi/Build/CCOnALargeScale.rdoc
>
> Note: I can't access the doc (access is forbidden).

I can access it. H... Try just hitting http://www.pragmaticautomation.com and then search on Jared or SAS. The blog entry is titled "CruiseControl on a Large Scale".

> I think that with such a system having a perfect load-balancer (i.e.
> automatic load balancing) would be more important than the fact that a
> machine could crash (which would lead to nothing being lost as everything
> can always be rebuilt - You could even save logs for whatever is put in the
> build log queue and the result of the build so that you know which builds
> have not been processed. You could also have the machine that takes a build
> start by logging the build in a log file so that you could replay the build
> if it crashes, etc).

I think we agree on this point. Do you agree? :)

> > We could put builds 1 through 17 on a single machine, 18 through 30 on
> > another,
>
> I think having a "reverse" load balancer would greatly improve the overall
> efficiency of the build farm. If you put builds 1 through 17 on one machine
> and those builds happen to be simple builds or with not a lot of developers
> or whatever the machine will be under-used. If these builds are heavy then
> the projects won't get built as often as they could, etc.
>
> That said, I'm not an expert in this domain so I'd love to be proved wrong
> and learn something in the process! :-)

I'm not following you. What do you mean by reverse load balancer?

> > etc, but if/when a single machine crashes, the recovery time
> > would become a real issue.
>
> It is if you have a controller that decides which machine gets each build,
> because it'll have to understand that the machine should be removed from the
> farm and not be given any job, whereas with the "reverse" load balancer
> there's no logic to implement for this.
>
> > Also, you can't load balance this way. Someone
> > would end up "tuning" the load to get projects that run in parallel off of
> > the same machine.
>
> I don't understand this point.

:) If you have builds assigned to a given machine (and only that machine) then a person will have to move builds from machine to machine if you want to efficiently use the hardware. Random (or alphabetical) assignments will not be an effective load spreading mechanism.

> > As to "knowing the state", if the proxy/manager issues the job and is
> > notified (return code? Good log file?) of the job's completion, it's
> > pretty easy to keep track of state. CPU, ram, etc becomes a side issue if
> > you just issue one build at a time to each box in the cluster. Once a box
> > is finished, you send another job. The faster boxes process more builds
> > and the slow ones process fewer. Automatic load balancing.
>
> Yes, true. The hard part is knowing to which machine to send the next build
> job so you need to get some answer from the build machines. And you need to
> modify the scheduler if a new machine is added.
>
> But yes, I think both solutions have pros and cons. I haven't really
> implemented the "reverse" load balancer solution but I have always found it
> extremely elegant in terms of architecture. It seems to me it has more pros
> than cons but maybe the devil is in the implementation details... :-)
RE: Clustered Maven?
> -----Original Message-----
> From: Rick Mangi [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 23, 2005 4:26 PM
> To: Maven Users List
> Subject: Re: Clustered Maven?
>
> FYI - the build system I described was for PV-WAVE (www.vni.com) circa
> 1996. The builds ran on a dozen flavors of Unix, VAX and OpenVMS. We
> did partial builds (certain components and not necessarily automated)
> on QNX, Win NT, NextStep, Linux (Slackware 0.x, I think) and MacOS 8
> (IIRC). It worked great. No single failure would take down the rest of
> the builds (why would it?). This code was easily 1,000,000 lines. C,
> C++, Fortran. A similar process was used to build IMSL back then as
> well.
>
> Shell scripts, rsh, expect, etc. aren't sexy. But sometimes they're the
> best tool for the job. I'm not knocking any of the other tools
> mentioned, I haven't used them. They might be great.
>
> Rick

True. Sorry if I sounded dismissive.
RE: Clustered Maven?
> -----Original Message-----
> From: Jared Richardson [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 23, 2005 22:01
> To: Maven Users List
> Subject: RE: Clustered Maven?

[snip]

> > From an architecture standpoint, I would prefer to let the different build
> > machines grab a build job. It's hard for a central point to know the state
> > of the different build machines whereas the machine itself is best placed
> > to know its state. This allows mixing machines with different CPUs, RAM,
> > processors, etc.
>
> You get no failover this way. If machine X dies, the builds can't run
> until someone restores the machine. That is not acceptable in our
> environment.

True, but does it matter that much for a build? I mean the build is taking source and producing something that can be scratched and rebuilt any time. Even if a machine fails you haven't lost anything.

> To give you a little more background, I'm thinking about several hundred
> developers, 300 projects (plus installers) and about 5 million lines of
> code. (SAS is big).
> http://www.pragmaticautomation.com/cgi-bin/pragauto.cgi/Build/CCOnALargeScale.rdoc

Note: I can't access the doc (access is forbidden).

I think that with such a system having a perfect load-balancer (i.e. automatic load balancing) would be more important than the fact that a machine could crash (which would lead to nothing being lost as everything can always be rebuilt - You could even save logs for whatever is put in the build log queue and the result of the build so that you know which builds have not been processed. You could also have the machine that takes a build start by logging the build in a log file so that you could replay the build if it crashes, etc).

> We could put builds 1 through 17 on a single machine, 18 through 30 on
> another,

I think having a "reverse" load balancer would greatly improve the overall efficiency of the build farm. If you put builds 1 through 17 on one machine and those builds happen to be simple builds or with not a lot of developers or whatever the machine will be under-used. If these builds are heavy then the projects won't get built as often as they could, etc.

That said, I'm not an expert in this domain so I'd love to be proved wrong and learn something in the process! :-)

> etc, but if/when a single machine crashes, the recovery time
> would become a real issue.

It is if you have a controller that decides which machine gets each build, because it'll have to understand that the machine should be removed from the farm and not be given any job, whereas with the "reverse" load balancer there's no logic to implement for this.

> Also, you can't load balance this way. Someone
> would end up "tuning" the load to get projects that run in parallel off of
> the same machine.

I don't understand this point.

> As to "knowing the state", if the proxy/manager issues the job and is
> notified (return code? Good log file?) of the job's completion, it's
> pretty easy to keep track of state. CPU, ram, etc becomes a side issue if
> you just issue one build at a time to each box in the cluster. Once a box
> is finished, you send another job. The faster boxes process more builds
> and the slow ones process fewer. Automatic load balancing.

Yes, true. The hard part is knowing to which machine to send the next build job so you need to get some answer from the build machines. And you need to modify the scheduler if a new machine is added.

But yes, I think both solutions have pros and cons. I haven't really implemented the "reverse" load balancer solution but I have always found it extremely elegant in terms of architecture. It seems to me it has more pros than cons but maybe the devil is in the implementation details... :-)

> > See
> > http://blogs.codehaus.org/people/vmassol/archives/000937_unbreakable_builds.html
> > to see what I mean (forget the unbreakable part as this is not our
> > topic here - just look at the build queue and the machines).
>
> Thanks. I had actually read that before. Your Binary Dependency Builds
> entry
> (http://blogs.codehaus.org/people/vmassol/archives/000953_binary_dependency_builds.html)
> is what convinced me that Maven might be able to handle a setup this big.

Hey that's cool! I hope you'll like it... At my last project we had quite a big team too (a hundred developers). The only thing that needs to be stressed is that the build has to have strong automated unit and functional tests.

Thanks
-Vincent
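A minimal sketch of the "reverse" load balancer described in this message, assuming it means that each build machine pulls its next job from a shared build queue instead of having a controller push jobs to it. Threads stand in for build machines here, and `run_farm` and `run_build` are hypothetical names; a real farm would need a queue that lives outside any single process.

```python
import queue
import threading
from collections import defaultdict
from typing import Callable, Dict, List


def run_farm(builds: List[str], machines: List[str],
             run_build: Callable[[str, str], None]) -> Dict[str, List[str]]:
    """Let every machine pull builds from one shared queue until it is empty."""
    pending: "queue.Queue[str]" = queue.Queue()
    for build in builds:
        pending.put(build)

    done: Dict[str, List[str]] = defaultdict(list)
    lock = threading.Lock()

    def worker(machine: str) -> None:
        while True:
            try:
                build = pending.get_nowait()  # the machine asks for work
            except queue.Empty:
                return  # nothing left to pull
            run_build(machine, build)
            with lock:
                done[machine].append(build)

    threads = [threading.Thread(target=worker, args=(m,)) for m in machines]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(done)
```

No scheduler has to know which machines exist: adding a machine means starting one more worker, and a dead machine simply stops pulling. A build that was already taken by a machine that then dies is lost in this sketch, which is where the log-and-replay idea from the message would come in.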
Re: Clustered Maven?
FYI - the build system I described was for PV-WAVE (www.vni.com) circa 1996. The builds ran on a dozen flavors of Unix, VAX and OpenVMS. We did partial builds (certain components and not necessarily automated) on QNX, Win NT, NextStep, Linux (Slackware 0.x, I think) and MacOS 8 (IIRC). It worked great. No single failure would take down the rest of the builds (why would it?). This code was easily 1,000,000 lines. C, C++, Fortran. A similar process was used to build IMSL back then as well.

Shell scripts, rsh, expect, etc. aren't sexy. But sometimes they're the best tool for the job. I'm not knocking any of the other tools mentioned, I haven't used them. They might be great.

Rick

On Feb 23, 2005, at 4:01 PM, Jared Richardson wrote:

> > -----Original Message-----
> > From: Rick Mangi [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, February 23, 2005 3:23 PM
> > To: Maven Users List
> > Subject: Re: Clustered Maven?
> >
> > Well no, not necessarily. You could just put the project descriptors on
> > each machine so maven would know which SCM to connect to for each
> > project. It would then download src just for the projects you are
> > building on each machine. I'm not sure what the minimum requirements
> > are for this in terms of which files you need, but I imagine just
> > project.xml, maven.xml and project.properties. This is where
> > build.properties comes in very handy. You can specify machine specific
> > overrides there.
> >
> > I'd look at the tools Vincent mentions as well. I haven't used any of
> > them myself.
>
> Thanks. This looks like the best solution so far.
RE: Clustered Maven?
> -----Original Message-----
> From: Rick Mangi [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 23, 2005 3:23 PM
> To: Maven Users List
> Subject: Re: Clustered Maven?
>
> Well no, not necessarily. You could just put the project descriptors on
> each machine so maven would know which SCM to connect to for each
> project. It would then download src just for the projects you are
> building on each machine. I'm not sure what the minimum requirements
> are for this in terms of which files you need, but I imagine just
> project.xml, maven.xml and project.properties. This is where
> build.properties comes in very handy. You can specify machine specific
> overrides there.
>
> I'd look at the tools Vincent mentions as well. I haven't used any of
> them myself.

Thanks. This looks like the best solution so far.
RE: Clustered Maven?
Replied inline...

> -----Original Message-----
> From: Vincent Massol [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 23, 2005 3:16 PM
> To: 'Maven Users List'
> Subject: RE: Clustered Maven?
>
> Hi Jared,
>
> No this is something that would sit on top of Maven. Currently Maven does
> not provide the continuous build loop. For this you can use CruiseControl,
> Gump, DamageControl, etc. Some of these tools support the build queue
> concept. In the future Maven will have its own continuous build (it's in
> development and called Continuum).

We are using CruiseControl already. Works great... I'm hoping to drive Maven from CC.

> From an architecture standpoint, I would prefer to let the different build
> machines grab a build job. It's hard for a central point to know the state
> of the different build machines whereas the machine itself is best placed
> to know its state. This allows mixing machines with different CPUs, RAM,
> processors, etc.

You get no failover this way. If machine X dies, the builds can't run until someone restores the machine. That is not acceptable in our environment.

To give you a little more background, I'm thinking about several hundred developers, 300 projects (plus installers) and about 5 million lines of code. (SAS is big). http://www.pragmaticautomation.com/cgi-bin/pragauto.cgi/Build/CCOnALargeScale.rdoc

We could put builds 1 through 17 on a single machine, 18 through 30 on another, etc, but if/when a single machine crashes, the recovery time would become a real issue. Also, you can't load balance this way. Someone would end up "tuning" the load to get projects that run in parallel off of the same machine.

As to "knowing the state", if the proxy/manager issues the job and is notified (return code? Good log file?) of the job's completion, it's pretty easy to keep track of state. CPU, ram, etc becomes a side issue if you just issue one build at a time to each box in the cluster. Once a box is finished, you send another job. The faster boxes process more builds and the slow ones process fewer. Automatic load balancing.

> See
> http://blogs.codehaus.org/people/vmassol/archives/000937_unbreakable_builds.html
> to see what I mean (forget the unbreakable part as this is not our
> topic here - just look at the build queue and the machines).

Thanks. I had actually read that before. Your Binary Dependency Builds entry (http://blogs.codehaus.org/people/vmassol/archives/000953_binary_dependency_builds.html) is what convinced me that Maven might be able to handle a setup this big.

Thanks!

-Jared
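A minimal sketch of the "one build at a time per box" scheme described in this message, written as a small simulation instead of a real dispatcher. The function name `dispatch`, the job costs and the box speeds are all made up for illustration; the only thing taken from the message is the rule that a box gets its next job when it reports the previous one finished.

```python
import heapq
from typing import Dict, List, Tuple


def dispatch(job_costs: List[float], box_speeds: Dict[str, float]) -> Dict[str, int]:
    """Simulate a manager that hands each idle box exactly one build at a time.

    Returns how many builds each box ended up processing.
    """
    # Heap of (time at which the box becomes free, box name); all start idle.
    free_at: List[Tuple[float, str]] = [(0.0, name) for name in sorted(box_speeds)]
    heapq.heapify(free_at)
    counts = {name: 0 for name in box_speeds}

    for cost in job_costs:
        now, box = heapq.heappop(free_at)      # next box to report completion
        finish = now + cost / box_speeds[box]  # a faster box finishes sooner
        counts[box] += 1
        heapq.heappush(free_at, (finish, box))
    return counts
```

With equal-sized jobs and one box twice as fast as the other, the faster box ends up with more of the builds without the manager knowing anything about CPU or RAM, which is the "automatic load balancing" claimed above. The manager is still a single point that must notice a dead box, which is the concern raised in the reply.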
Re: Clustered Maven?
Well no, not necessarily. You could just put the project descriptors on each machine so maven would know which SCM to connect to for each project. It would then download src just for the projects you are building on each machine. I'm not sure what the minimum requirements are for this in terms of which files you need, but I imagine just project.xml, maven.xml and project.properties. This is where build.properties comes in very handy. You can specify machine specific overrides there.

I'd look at the tools Vincent mentions as well. I haven't used any of them myself.

On Feb 23, 2005, at 2:52 PM, Jared Richardson wrote:

> In this scenario, all the code for every project would exist on every machine?
>
> > -----Original Message-----
> > From: Rick Mangi [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, February 23, 2005 2:07 PM
> > To: Maven Users List
> > Subject: Re: Clustered Maven?
> >
> > Not really a newbie question at all... actually a very interesting
> > question.
> >
> > I don't believe there is any built in functionality for this but you
> > could definitely do it with a set of shell scripts and rsh. Many many
> > years ago I did a similar thing with make files, rsh to spawn the
> > builds onto the remote machines and expect to analyze the output. It
> > worked amazingly well spawning nightly builds on a dozen different OS
> > flavors, correlating the output into a master build report.
> >
> > Rick
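A minimal sketch of what "machine specific overrides" in build.properties amounts to. It assumes, as I understand the Maven 1.x documentation, that project.properties is read first and build.properties afterwards, with the last definition winning; treat that ordering as an assumption to verify. The parser below is a simplified stand-in for real Java properties parsing, and the property values are hypothetical.

```python
from typing import Dict, Iterable


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines; blank lines and '#' comments are skipped."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def effective_properties(layers: Iterable[str]) -> Dict[str, str]:
    """Merge property files in order; a later layer overrides an earlier one."""
    merged: Dict[str, str] = {}
    for text in layers:
        merged.update(parse_properties(text))
    return merged


# Shared with every machine (checked in next to project.xml).
PROJECT_PROPERTIES = """
maven.junit.fork=true
maven.repo.local=/home/build/.maven/repository
"""

# Lives only on one build box; the path is a made-up example.
BUILD_PROPERTIES = """
# this box keeps its repository on a bigger disk
maven.repo.local=/bigdisk/maven/repository
"""
```

Calling `effective_properties([PROJECT_PROPERTIES, BUILD_PROPERTIES])` keeps the shared fork setting and picks up the box-specific repository path, so the same project descriptors can be dropped on every machine unchanged.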
RE: Clustered Maven?
Hi Jared,

No this is something that would sit on top of Maven. Currently Maven does not provide the continuous build loop. For this you can use CruiseControl, Gump, DamageControl, etc. Some of these tools support the build queue concept. In the future Maven will have its own continuous build (it's in development and called Continuum).

From an architecture standpoint, I would prefer to let the different build machines grab a build job. It's hard for a central point to know the state of the different build machines whereas the machine itself is best placed to know its state. This allows mixing machines with different CPUs, RAM, processors, etc.

See http://blogs.codehaus.org/people/vmassol/archives/000937_unbreakable_builds.html to see what I mean (forget the unbreakable part as this is not our topic here - just look at the build queue and the machines).

Thanks
-Vincent

> -----Original Message-----
> From: Jared Richardson [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 23, 2005 19:49
> To: Maven Users List
> Subject: Clustered Maven?
>
> Hi all,
>
> Sorry if this is a complete newbie question. I've perused some of the docs
> and Googled and I'm not seeing an answer.
>
> I am interested in setting up a group of Maven boxes that can all build a
> set of projects. I'd like to have a front-end proxy/manager accept build
> requests and farm them out to one of the Maven boxes.
>
> This type of configuration gives you a level of failover. Does Maven have
> a similar capability?
>
> If it doesn't, could you trick it by using a common network share to hold
> the local workspaces? This would make all the local files available to
> both machines? If the proxy/manager were smart enough to not issue build
> requests for the same project to multiple machines, would Maven stomp on
> itself?
>
> Thanks!
>
> Jared
>
> -
> Jared Richardson
> [EMAIL PROTECTED]
> 919-531-9136
> http://www.sas.com
> SAS... The Power to Know(r)
> -
>
> "The plan is nothing; the planning is everything."
> Dwight Eisenhower
RE: Clustered Maven?
In this scenario, all the code for every project would exist on every machine?

> -----Original Message-----
> From: Rick Mangi [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 23, 2005 2:07 PM
> To: Maven Users List
> Subject: Re: Clustered Maven?
>
> Not really a newbie question at all... actually a very interesting
> question.
>
> I don't believe there is any built in functionality for this but you
> could definitely do it with a set of shell scripts and rsh. Many many
> years ago I did a similar thing with make files, rsh to spawn the
> builds onto the remote machines and expect to analyze the output. It
> worked amazingly well spawning nightly builds on a dozen different OS
> flavors, correlating the output into a master build report.
>
> Rick
Re: Clustered Maven?
Not really a newbie question at all... actually a very interesting question.

I don't believe there is any built in functionality for this but you could definitely do it with a set of shell scripts and rsh. Many many years ago I did a similar thing with make files, rsh to spawn the builds onto the remote machines and expect to analyze the output. It worked amazingly well spawning nightly builds on a dozen different OS flavors, correlating the output into a master build report.

Rick

On Feb 23, 2005, at 1:48 PM, Jared Richardson wrote:

> Hi all,
>
> Sorry if this is a complete newbie question. I've perused some of the
> docs and Googled and I'm not seeing an answer.
>
> I am interested in setting up a group of Maven boxes that can all
> build a set of projects. I'd like to have a front-end proxy/manager
> accept build requests and farm them out to one of the Maven boxes.
>
> This type of configuration gives you a level of failover. Does Maven
> have a similar capability?
>
> If it doesn't, could you trick it by using a common network share to
> hold the local workspaces? This would make all the local files
> available to both machines? If the proxy/manager were smart enough to
> not issue build requests for the same project to multiple machines,
> would Maven stomp on itself?
>
> Thanks!
>
> Jared
>
> -
> Jared Richardson
> [EMAIL PROTECTED]
> 919-531-9136
> http://www.sas.com
> SAS... The Power to Know(r)
> -
>
> "The plan is nothing; the planning is everything."
> Dwight Eisenhower
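A minimal sketch of the fan-out-and-report approach Rick describes, assuming a caller-supplied runner in place of rsh and a plain string check in place of expect. The names `nightly`, `format_report` and `Runner` are hypothetical, and treating "BUILD FAILED" in the output as a failure marker is an assumption about the build tool's log format, not something stated in the thread.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# A runner takes a host name and returns (exit_code, captured_output).
# In a real setup it would wrap a remote shell command; here it is injected.
Runner = Callable[[str], Tuple[int, str]]


def nightly(hosts: List[str], run_remote_build: Runner) -> Dict[str, str]:
    """Start one build per host in parallel and classify each result."""
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        results = list(pool.map(run_remote_build, hosts))
    report: Dict[str, str] = {}
    for host, (code, output) in zip(hosts, results):
        # Assumed failure markers: non-zero exit code or a failure line in the log.
        ok = code == 0 and "BUILD FAILED" not in output
        report[host] = "ok" if ok else "FAILED"
    return report


def format_report(report: Dict[str, str]) -> str:
    """Render the master build report, one host per line, sorted by host name."""
    return "\n".join(f"{host:<12} {status}" for host, status in sorted(report.items()))
```

Because each host is built independently, one failing box shows up as a single FAILED line in the report and does not stop the other builds from completing.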