Re: server uptime
I guess a better question at this point is: how much uptime does it take before a server begins to ask, "What am I?", and wishes to meet its creator?

Thanks for all the discussion about server uptime. The novelty of having a server with 2 years of uptime is much less significant now, and I think I can bring myself to power it down. ;-)

- Original Message -
From: "Dave Johnson" <[EMAIL PROTECTED]>
To: gnhlug-discuss@mail.gnhlug.org
Sent: Saturday, March 22, 2008 12:04:40 PM (GMT-0500) America/New_York
Subject: re: server uptime

Warren Luebkeman writes:
> I am curious how common it is for people's servers to go extremely
> long periods of time without crashing/rebooting. Our server, running
> Debian Sarge, which serves our email/web/backups/dns/etc., has been
> running 733 days (two years) without a reboot. It's in a 4U IBM
> chassis with dual power supplies, which was old when we fired it up
> (PIII server).
>
> Does anyone have similar uptime on their mission-critical servers?
> What's the longest uptime someone has had with Windows?

I have a Sharp Zaurus SL-5500 PDA that's been accumulating a rather impressive uptime sitting unused in its cradle. Just checked, and it's now up to 1594 days; however, the OpenZaurus kernel it's running has a 32-bit jiffies counter, so it's wrapped its uptime 3 times so far and only shows 103 days. The other 3*497 days are there but hidden. :(

It's survived many power outages by simply auto-suspending itself if power is lost; just resume it after the outage and uptime picks up where it left off. I think there's probably another month of missed uptime due to forgetting to resume it after power outages. If only I could find a more useful purpose for it besides accumulating uptime.
--
Dave

___
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/

--
Warren Luebkeman
Founder, COO
Resara LLC
888.357.9195
www.resara.com
re: server uptime
Warren Luebkeman writes:
> I am curious how common it is for people's servers to go extremely
> long periods of time without crashing/rebooting. Our server, running
> Debian Sarge, which serves our email/web/backups/dns/etc., has been
> running 733 days (two years) without a reboot. It's in a 4U IBM
> chassis with dual power supplies, which was old when we fired it up
> (PIII server).
>
> Does anyone have similar uptime on their mission-critical servers?
> What's the longest uptime someone has had with Windows?

I have a Sharp Zaurus SL-5500 PDA that's been accumulating a rather impressive uptime sitting unused in its cradle. Just checked, and it's now up to 1594 days; however, the OpenZaurus kernel it's running has a 32-bit jiffies counter, so it's wrapped its uptime 3 times so far and only shows 103 days. The other 3*497 days are there but hidden. :(

It's survived many power outages by simply auto-suspending itself if power is lost; just resume it after the outage and uptime picks up where it left off. I think there's probably another month of missed uptime due to forgetting to resume it after power outages. If only I could find a more useful purpose for it besides accumulating uptime.

--
Dave
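That 497-day figure falls straight out of the arithmetic: a 32-bit counter at the old default of 100 timer ticks (jiffies) per second overflows after 2^32/100 seconds. A quick sketch (HZ=100 is an assumption based on kernels of that era; other configurations use 250 or 1000):

```python
# Days until a 32-bit jiffies counter wraps, and recovering the
# "hidden" uptime from the displayed value plus the wrap count.
# HZ=100 is assumed; it was the common default on kernels of that era.
HZ = 100  # timer ticks per second

def wrap_period_days(hz: int) -> float:
    """Days until a 32-bit jiffies counter overflows."""
    return 2**32 / hz / 86400

def true_uptime_days(shown_days: float, wraps: int, hz: int = HZ) -> float:
    """Actual uptime given the displayed days and number of wraps."""
    return shown_days + wraps * wrap_period_days(hz)

print(round(wrap_period_days(100), 1))   # ~497.1 days per wrap
print(round(true_uptime_days(103, 3)))   # ~1594 days, matching the post
```

With 103 days shown and 3 wraps, that recovers the 1594 days Dave reports.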
Re: server uptime
On Fri, 21 Mar 2008 20:08:44 -0400, Bill McGonigle <[EMAIL PROTECTED]> wrote:
> In the building I'm in, the heat was over 100 degrees this past
> Monday morning.

The company I work for is in a fairly large office complex (Riverside Center in Newton) where they turn the A/C off on weekends and holidays. A coworker and I each have an HP Integrity system as our workstations, and on Monday mornings it is quite warm in the office. However, I have yet to have either system complain, though I can see that the iLO logs report the temperature going out of range.

--
Jerry Feldman <[EMAIL PROTECTED]>
Boston Linux and Unix
PGP key id: 537C5846
PGP Key fingerprint: 3D1B 8377 A3C0 A5F2 ECBB CA3B 4607 4319 537C 5846
Re: server uptime
On Mar 21, 2008, at 21:33, Paul Lussier wrote:
> Nope, and I didn't say the 2.4 kernel wasn't vulnerable, just that
> it's possible to have a stable-running kernel old enough to not have
> the vmslice problem... :)

Hey, if you read the _rest_ of the message you quoted originally, you can even find out about 2.6 kernels that are old enough not to have the vmsplice problem. ;)

-Bill

-
Bill McGonigle, Owner                   Work: 603.448.4440
BFC Computing, LLC                      Home: 603.448.1668
[EMAIL PROTECTED]                       Cell: 603.252.2606
http://www.bfccomputing.com/            Page: 603.442.1833
Blog: http://blog.bfccomputing.com/
VCard: http://bfccomputing.com/vcard/bill.vcf
Re: server uptime
On Fri, 2008-03-21 at 20:26 -0400, Mark Komarinski wrote:
> Bill McGonigle wrote:
> > It's also connected naked to the Internet for remote monitoring.
> >
> > For some strange reason, you'd just accept this.
>
> Venturing even further off-topic, I have two different labs that wrote
> code without really consulting anyone else. One thought it would save a
> lot of time if their fat client application connected directly to Oracle
> from anywhere on the Internet. The other thought it would be a good
> idea to do the same with MySQL. The Oracle group we didn't have much
> luck with, and they're stuck using SSH port forwarding. The MySQL group
> seems to be a bit more receptive, and they may change their application
> to go through a web service instead.

I'm not arguing against the web service approach. However, you can use SSL certificates to control (and encrypt) MySQL access. That offers reasonable security at the cost of yet another thing to configure and worry about.

> I keep hoping that bioinformatics courses include even a week or two on
> 'sane coding practices', but I doubt it will happen...
>
> -Mark

--
Lloyd Kvam
Venix Corp
DLSLUG/GNHLUG library
http://www.librarything.com/catalog/dlslug
http://www.librarything.com/profile/dlslug
http://www.librarything.com/rsshtml/recent/dlslug
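For readers unfamiliar with the SSL-certificate approach Lloyd mentions, a minimal sketch in MySQL 5.x-era syntax follows. The account name, database, and certificate paths are hypothetical; the point is that the server can be told to refuse non-SSL connections per account:

```sql
-- Server side (my.cnf), pointing at the CA and server certificate
-- (paths are illustrative):
--   ssl-ca=/etc/mysql/ca-cert.pem
--   ssl-cert=/etc/mysql/server-cert.pem
--   ssl-key=/etc/mysql/server-key.pem

-- Account side: grant access from anywhere, but only over SSL.
GRANT SELECT, INSERT ON labdata.* TO 'labuser'@'%'
  IDENTIFIED BY 'changeme'
  REQUIRE SSL;   -- or REQUIRE X509 to also demand a valid client cert
```

REQUIRE X509 (or REQUIRE SUBJECT/ISSUER) is the stricter variant: the client must present its own certificate, which gets you closer to the "control" part rather than just encryption.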
Re: server uptime
"Ben Scott" <[EMAIL PROTECTED]> writes: > On Fri, Mar 21, 2008 at 9:20 AM, Paul Lussier <[EMAIL PROTECTED]> wrote: >>> Hey, it's possible that Warren's kernel is so old that he doesn't >>> suffer from the vmslice() exploit. :) >> >> Sure it's possible. We're not vulnerable to it anywhere, we're still >> running a 2.4 kernel :) > > Though, as I mentioned, the 2.4 kernel has also had security > advisories in the past two years. So 2.4 systems aren't immune to > reboots, either. Not that that should be news. :) Nope, and I didn't say the 2.4 kernel wasn't vulnerable, just that it's possible to have a stable-running kernel old enough to not have the vmslice problem... :) -- Seeya, Paul ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: server uptime
Bill McGonigle wrote:
> It's also connected naked to the Internet for remote monitoring.
>
> For some strange reason, you'd just accept this.

Venturing even further off-topic, I have two different labs that wrote code without really consulting anyone else. One thought it would save a lot of time if their fat client application connected directly to Oracle from anywhere on the Internet. The other thought it would be a good idea to do the same with MySQL. The Oracle group we didn't have much luck with, and they're stuck using SSH port forwarding. The MySQL group seems to be a bit more receptive, and they may change their application to go through a web service instead.

I keep hoping that bioinformatics courses include even a week or two on 'sane coding practices', but I doubt it will happen...

-Mark
Re: server uptime
On Mar 21, 2008, at 09:46, Ben Scott wrote:
> From the "If Microsoft Made Cars" list: "Occasionally your car's
> engine would just stop for no reason, and you'd have to restart it.
> For some strange reason, you'd just accept this."

In the building I'm in, the heat was over 100 degrees this past Monday morning. It seems sometime on Saturday "something happened" and the heat got stuck on full blast. Apparently, the things that 'are controlled' just stay in the state of the last command sent by the thing that 'is controlling'. In this case the last command was 'heat, 100%'.

To make a long story short, after chatting up the tech, it seems the control systems run WinCE and have a habit of corrupting their filesystems. The tech knows how to reflash the image and restore the config to get it working again.

It's also connected naked to the Internet for remote monitoring.

For some strange reason, you'd just accept this.

"Jane, get me off this crazy thing!"-ly,

-Bill

-
Bill McGonigle, Owner
BFC Computing, LLC
Re: server uptime
On Fri, 21 Mar 2008 09:46:03 -0400, "Ben Scott" <[EMAIL PROTECTED]> wrote:
> From the "If Microsoft Made Cars" list: "Occasionally your car's
> engine would just stop for no reason, and you'd have to restart it.
> For some strange reason, you'd just accept this."

I'm a pilot, and fortunately Microsoft did not make the turbine in my Huey in Viet Nam. Never had an engine failure.

--
Jerry Feldman <[EMAIL PROTECTED]>
Boston Linux and Unix
Re: server uptime
On Thu, Mar 20, 2008 at 5:42 PM, Mark E. Mallett <[EMAIL PROTECTED]> wrote:
> On Thu, Mar 20, 2008 at 09:46:04AM -0400, Jerry Feldman wrote:
> > On Wed, 19 Mar 2008 21:38:52 -0400, "Mark E. Mallett" <[EMAIL PROTECTED]> wrote:
> > > sometimes it's good to reboot a system just to make sure you can.
> >
> > That's very old school :-)
>
> But all of that is completely different from what I said. I agree that
> software can keep running without a reboot. But as I mentioned,
> sometimes a reboot will find something that you can't possibly find by
> keeping a system running. Like some of the things I listed. My point
> is that a planned reboot can help protect you from surprises that you
> might learn only from an unplanned reboot.

I was at one place that used OpenBSD for its firewall systems, and had several throughout its network to isolate potential security problems (the printers were firewalled off on their own subnet, for example). Once a week, *all* the firewalls were rebooted. This primarily served to disconnect any lingering SSH connections, and I think it was a good thing for that environment.

FWIW, the systems almost never needed patches because only needed services and programs were installed. No compilers, editors, shells, etc. A firewall doesn't need email, so it's not installed; if there's a hole in email, it doesn't exist to be exploited. While I was there, a Cisco vulnerability involving network logins came out. We had deleted the network logins and could only admin/access via a serial cable from another system. Therefore, no patch needed.
Re: server uptime
On Fri, Mar 21, 2008 at 9:16 AM, Paul Lussier <[EMAIL PROTECTED]> wrote:
> Our Windows servers, from what I'm told, get rebooted once a week whether
> they need it or not, in the name of "Preventative Maintenance" :)

Sadly, that's an attitude that's quite prevalent in the Windows world, even though the OS itself has come a long way[1]. This is unfortunate, as it creates a self-reinforcing loop where operators are used to rebooting their servers all the time, and thus third-party vendors think it is okay for their drivers/applications to need restarting on a regular basis, thus keeping operators in the habit. As Tom Buskey and Paul Lussier have observed, this attitude has become so prevalent that we now have to reboot our cell phones and printers.

From the "If Microsoft Made Cars" list: "Occasionally your car's engine would just stop for no reason, and you'd have to restart it. For some strange reason, you'd just accept this."

=== Footnotes ===
[1] It used to be that one often needed regular reboots to keep Windows healthy.[2]
[2] I'm talking Windows NT[3], here. Even people who like Windows[4] don't consider classic Windows[5] to be a real operating system.
[3] Later releases of Windows NT include Windows 2000, XP, 2003, Vista, and 2008.
[4] Not me. I have to know how to manage Windows for professional reasons, and I can work in the environment, but saying I "like Windows" would be going too far.
[5] "Classic Windows" being my term for what began as Windows 1.0, went through 3.x, then 95 and 98, and finally ended with Me.

--
Ben
Re: server uptime
On Fri, Mar 21, 2008 at 9:20 AM, Paul Lussier <[EMAIL PROTECTED]> wrote:
>> Hey, it's possible that Warren's kernel is so old that he doesn't
>> suffer from the vmslice() exploit. :)
>
> Sure it's possible. We're not vulnerable to it anywhere, we're still
> running a 2.4 kernel :)

Though, as I mentioned, the 2.4 kernel has also had security advisories in the past two years. So 2.4 systems aren't immune to reboots, either. Not that that should be news. :)

--
Ben
Re: server uptime
Bill McGonigle <[EMAIL PROTECTED]> writes:
> On Mar 19, 2008, at 15:36, Ben Scott wrote:
>> You're obviously not installing all your security updates, then.
>> Both the 2.4 and 2.6 Debian kernels have had security advisories
>> posted within the past two years.
>
> Hey, it's possible that Warren's kernel is so old that he doesn't
> suffer from the vmslice() exploit. :)

Sure it's possible. We're not vulnerable to it anywhere, we're still running a 2.4 kernel :)

--
Seeya,
Paul
Re: server uptime
"Tom Buskey" <[EMAIL PROTECTED]> writes: > /me is thankful he doesn't have to reboot his laser printer yet. We do :( -- Seeya, Paul ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: server uptime
Warren Luebkeman <[EMAIL PROTECTED]> writes:
> I am curious how common it is for people's servers to go extremely long
> periods of time without crashing/rebooting. Our server, running Debian
> Sarge, which serves our email/web/backups/dns/etc., has been running 733
> days (two years) without a reboot. It's in a 4U IBM chassis with dual
> power supplies, which was old when we fired it up (PIII server).

My desktop at work went close to 600 days without a reboot, but a blown transformer at the substation forced the issue :( We regularly see uptimes over a year, but we tend to take machines down for maintenance more often than that if we can.

Our Windows servers, from what I'm told, get rebooted once a week whether they need it or not, in the name of "Preventative Maintenance" :)

--
Seeya,
Paul
Re: server uptime
On Thu, 20 Mar 2008 17:42:57 -0400, "Mark E. Mallett" <[EMAIL PROTECTED]> wrote:
> But all of that is completely different from what I said. I agree that
> software can keep running without a reboot. But as I mentioned,
> sometimes a reboot will find something that you can't possibly find by
> keeping a system running. Like some of the things I listed. My point
> is that a planned reboot can help protect you from surprises that you
> might learn only from an unplanned reboot.

Most of the newer servers today are equipped with remote console services, such as HP's iLO. Essentially, a planned periodic reboot is certainly a decent idea, but in the case where the servers are in a server farm where access is inconvenient, it may not be all that frequent.

--
Jerry Feldman <[EMAIL PROTECTED]>
Boston Linux and Unix
Re: server uptime
> "Mark E. Mallett" <[EMAIL PROTECTED]> wrote: > Relatedly, a group of systems can get into a state where it's hard to > reboot the whole group back into that state. This can happen when you > build up a collection of services and servers over time, but never from > scratch. e.g. you might have a system A that uses a service on system B > during its boot process, and vice versa (although bigger trouble can > come with harder to find dependency loops that creep in through some > crack in the plans). > > mm Even if there are no dependency loops, a missing init script can cause serious problems if you don't catch it right away - particularly if the person who was originally responsible for installing said service is no longer working with you. (Of course, this is at least as much a problem of organization as it is technology: you should have access to documentation - or at least human expertise - on all of the pieces of your production system, and doubly so for the custom ones.) Take it easy, David Berube Berube Consulting [EMAIL PROTECTED] (603)-485-9622 http://www.berubeconsulting.com/ ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: server uptime
On Thu, Mar 20, 2008 at 09:46:04AM -0400, Jerry Feldman wrote:
> On Wed, 19 Mar 2008 21:38:52 -0400, "Mark E. Mallett" <[EMAIL PROTECTED]> wrote:
> > sometimes it's good to reboot a system just to make sure you can.
>
> That's very old school :-)

thank you :)

> Back in the days when mainframes had the power of my PDA, operating
> systems were somewhat unsophisticated. I ran a data center in San
> Antonio where we ran VM/370 - IBM's virtualization - with the batch OS
> (OS/VS1) in one VM, and CMS for online users - data control and
> programmers. The thinking back then was that if we shut down, we might
> never get the system back up, but this was 1950s mentality. But memory
> leaks and things could cause the OS to degrade over time.
>
> Today, for the most part, the Linux and Unix kernels really do not need
> periodic reboots unless there is a problem. On our BLU mail server
> we've seen that the routing table gets screwed up and is difficult to
> fix. In any case, since nearly every service and driver can be stopped
> and started remotely, the only reason I might want to reboot, other than
> a kernel upgrade, is that it might be faster to reboot than to try to
> fix an issue. That tends to be more of a Microsoft mentality, though
> Windows Server has become more stable also.

But all of that is completely different from what I said. I agree that software can keep running without a reboot. But as I mentioned, sometimes a reboot will find something that you can't possibly find by keeping a system running. Like some of the things I listed. My point is that a planned reboot can help protect you from surprises that you might learn only from an unplanned reboot.

Note that this is a "do as I say, not as I do" kind of remark. I never actually reboot anything for that reason; I just think it's a good reason :)

Relatedly, a group of systems can get into a state where it's hard to reboot the whole group back into that state. This can happen when you build up a collection of services and servers over time, but never from scratch. E.g., you might have a system A that uses a service on system B during its boot process, and vice versa (although bigger trouble can come with harder-to-find dependency loops that creep in through some crack in the plans).

mm
Re: server uptime
On Thu, 2008-03-20 at 13:48 -0400, Warren Luebkeman wrote:
> Nah, we are not vulnerable to that exploit. We do keep tabs on important
> security issues when they come up. We plan to retire that server pretty
> soon, although I may leave it running behind the firewall, just to see how
> long it goes... ;-)

Come to think of it, isn't drywall just another name for firewall?

-Alex
Re: server uptime
Nah, we are not vulnerable to that exploit. We do keep tabs on important security issues when they come up. We plan to retire that server pretty soon, although I may leave it running behind the firewall, just to see how long it goes... ;-)

- Original Message -
From: "Bill McGonigle" <[EMAIL PROTECTED]>
To: "Warren Luebkeman" <[EMAIL PROTECTED]>, "Benjamin Scott" <[EMAIL PROTECTED]>
Cc: "Greater NH Linux User Group"
Sent: Thursday, March 20, 2008 1:41:25 PM (GMT-0500) America/New_York
Subject: Re: server uptime

On Mar 19, 2008, at 15:36, Ben Scott wrote:
> You're obviously not installing all your security updates, then.
> Both the 2.4 and 2.6 Debian kernels have had security advisories
> posted within the past two years.

Hey, it's possible that Warren's kernel is so old that he doesn't suffer from the vmslice() exploit. :)

Seriously, though - check. If `uname -r` >= 2.6.17, vmsplice() plus one (e.g.) PHP bug = remote root exploit. That's bad, mmmkay?

Perhaps more importantly, you're not picking up ext3 bugfixes, the CFQ elevator, etc.

And somebody around here actually found an old Netware box running in a closet that had been drywalled over 5 years before. It was apparently still serving files and print jobs (they traced the ethernet cable).

-Bill

--
Warren Luebkeman
Founder, COO
Resara LLC
Re: server uptime
On Thu, 2008-03-20 at 13:41 -0400, Bill McGonigle wrote:
> And somebody around here actually found an old Netware box running in
> a closet that had been drywalled over 5 years before. It was
> apparently still serving files and print jobs (they traced the
> ethernet cable).

Maybe instead of uptime it should be renamed to closettime ;^)

-Alex
Re: server uptime
On Mar 19, 2008, at 15:36, Ben Scott wrote:
> You're obviously not installing all your security updates, then.
> Both the 2.4 and 2.6 Debian kernels have had security advisories
> posted within the past two years.

Hey, it's possible that Warren's kernel is so old that he doesn't suffer from the vmslice() exploit. :)

Seriously, though - check. If `uname -r` >= 2.6.17, vmsplice() plus one (e.g.) PHP bug = remote root exploit. That's bad, mmmkay?

Perhaps more importantly, you're not picking up ext3 bugfixes, the CFQ elevator, etc.

And somebody around here actually found an old Netware box running in a closet that had been drywalled over 5 years before. It was apparently still serving files and print jobs (they traced the ethernet cable).

-Bill

-
Bill McGonigle, Owner
BFC Computing, LLC
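The version test Bill describes is easy to script. A minimal sketch: vmsplice() first appeared in kernel 2.6.17, so a kernel at or above that version (and unpatched) contains the syscall at all. The parsing here is deliberately simplified; real `uname -r` strings carry suffixes like "-5-686" or "-amd64", and being at 2.6.17+ only means the syscall is present, not that the box is still vulnerable, since patched 2.6 kernels exist.

```python
# Parse a `uname -r`-style release string and compare against 2.6.17,
# the kernel that introduced vmsplice().
def kernel_tuple(release: str) -> tuple:
    """Turn '2.6.18-5-686' into (2, 6, 18)."""
    base = release.split('-')[0]
    return tuple(int(p) for p in base.split('.')[:3])

def has_vmsplice(release: str) -> bool:
    """True if this kernel is new enough to contain vmsplice() at all."""
    return kernel_tuple(release) >= (2, 6, 17)

print(has_vmsplice('2.4.27'))        # False: 2.4 never had vmsplice()
print(has_vmsplice('2.6.18-5-686'))  # True: present; patch level matters
```

On a live system you would feed it `os.uname().release` (or the output of `uname -r`) instead of a literal string.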
Re: server uptime
On Wed, 19 Mar 2008 21:38:52 -0400, "Mark E. Mallett" <[EMAIL PROTECTED]> wrote:
> sometimes it's good to reboot a system just to make sure you can.

That's very old school :-)

Back in the days when mainframes had the power of my PDA, operating systems were somewhat unsophisticated. I ran a data center in San Antonio where we ran VM/370 - IBM's virtualization - with the batch OS (OS/VS1) in one VM, and CMS for online users - data control and programmers. The thinking back then was that if we shut down, we might never get the system back up, but this was 1950s mentality. But memory leaks and things could cause the OS to degrade over time.

Today, for the most part, the Linux and Unix kernels really do not need periodic reboots unless there is a problem. On our BLU mail server we've seen that the routing table gets screwed up and is difficult to fix. In any case, since nearly every service and driver can be stopped and started remotely, the only reason I might want to reboot, other than a kernel upgrade, is that it might be faster to reboot than to try to fix an issue. That tends to be more of a Microsoft mentality, though Windows Server has become more stable also.

--
Jerry Feldman <[EMAIL PROTECTED]>
Boston Linux and Unix
Re: server uptime
On Wed, Mar 19, 2008 at 9:38 PM, Mark E. Mallett <[EMAIL PROTECTED]> wrote:
> On Wed, Mar 19, 2008 at 08:23:14PM -0400, Ben Scott wrote:
> > And let's not forget that Linux isn't immune to restart-the-world
> > issues, either. For example, on a Linux server, if you update glibc
> > to patch a security bug, you pretty much need to restart *everything*.
>
> sometimes it's good to reboot a system just to make sure you can.
> i.e., that you haven't introduced deadlocks or dependencies or
> gremlins or changed some externality that the boot process depends
> on or that some flash memory hasn't rotted or ...

While I'm not denying the wisdom of this, I hate how that has become the standard answer. I reboot my Windows box, I reboot my FiOS router, I reboot my Blackberry, I reboot my TiVos, etc. My Linux and Solaris boxes only get rebooted when there's a major system upgrade (driver, kernel, etc.). I suspect I'll have to get used to rebooting my coffee machine, clock, microwave, car, answering machine, and TV as computers control more parts. When did rebooting on a regular basis become acceptable?

/me is thankful he doesn't have to reboot his laser printer yet.
Re: server uptime
Mark E. Mallett wrote:
> On Wed, Mar 19, 2008 at 08:23:14PM -0400, Ben Scott wrote:
> > And let's not forget that Linux isn't immune to restart-the-world
> > issues, either. For example, on a Linux server, if you update glibc
> > to patch a security bug, you pretty much need to restart *everything*.
>
> sometimes it's good to reboot a system just to make sure you can.
> i.e., that you haven't introduced deadlocks or dependencies or
> gremlins or changed some externality that the boot process depends
> on or that some flash memory hasn't rotted or ...

... fsck, clean out the dust bunnies and dead mice (we actually did find those), make sure the fans (and hard drives) still turn when it comes back on, ...

We reboot our servers every two years, whether they need it or not. :-)

--
Dan Jenkins, Rastech Inc.
Re: server uptime
On Wed, Mar 19, 2008 at 08:23:14PM -0400, Ben Scott wrote:
> And let's not forget that Linux isn't immune to restart-the-world
> issues, either. For example, on a Linux server, if you update glibc
> to patch a security bug, you pretty much need to restart *everything*.

sometimes it's good to reboot a system just to make sure you can. i.e., that you haven't introduced deadlocks or dependencies or gremlins or changed some externality that the boot process depends on or that some flash memory hasn't rotted or ...

mm
Re: server uptime
On Wed, Mar 19, 2008 at 5:01 PM, Warren Luebkeman <[EMAIL PROTECTED]> wrote: > I'm impressed that a system could run for two years straight without failing > ... Ah. Well... that gets old after awhile. :) At the extreme end of the scale, old school IBM mainframe systems can measure service availability in decades. When everything is done via batch transaction processing in virtual machines, it's easy to run redundant systems in geographically separate areas. If a nuclear bomb vaporizes the East coast data center, the pending transactions get committed at the West coast data center instead. Back in the days of NetWare 3.12, the only time I had servers go down was on hardware or power failure. With the right equipment (UPS, generator, and good server hardware), uptimes measured in the hundreds or thousands of days were the norm. NetWare 3.x didn't do very much -- LAN file and print were really about it -- so it wasn't hard to keep it stable. And there wasn't anything like a public IPX network (in contrast to the public IP network we call "the Internet"), so exploitation of software flaws was rarer (inside jobs usually find easier attacks). Even an unpatched Windows NT 4.0 box can stay up for years. Don't connect it to a network, or install any software, or log-in, or use it. Strangely enough, nobody is very impressed by that scenario. ;-) At work, we've only got batteries for about 10-15 minutes of runtime, and we average about one prolonged power outage a year, so that limits our uptime. (During a prolonged outage, they send everybody home, so buying more batteries wouldn't pay off.) At home, my Linux desktop PC rarely gets more than a few weeks of uptime. I don't have it on a UPS, I like to experiment with different distros (lots of reboots for that), and occasionally I play Wintendo. In fact, my Windows XP PC at work probably has better uptime numbers than my Linux PC at home (I've got a UPS at work). 
At work, our Linux servers could probably go for years, if it wasn't for power failures and kernel security holes. But even the Windows servers usually get at least a few months of uptime before some update we need to install also needs a reboot. Windows has a lot of stuff that can't be updated without rebooting the whole system. On Linux, similar updates just mean doing things like restarting Samba. But in both cases, the service is unavailable during the update, so the distinction is largely academic. It still means the users can't use the server, and so it still means I'm there after hours to do the update. I don't really care if the uptime counter gets reset or not. Linux is easier, and cheaper, but it's a convenience more than life-changing.

And let's not forget that Linux isn't immune to restart-the-world issues, either. For example, on a Linux server, if you update glibc to patch a security bug, you pretty much need to restart *everything*.

Sorry to be a stick-in-the-mud, but I've been doing IT for 15+ years. There are people on this list who have been doing IT for longer than I've been alive. After a while, you start seeing the big picture.

-- Ben
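[Editor's note] Ben's glibc example can be made concrete: on Linux, a process that still maps a shared library whose file has since been replaced shows the old mapping marked "(deleted)" in /proc/<pid>/maps, which is one way to spot what needs restarting after a libc update. A minimal sketch of the parsing, using made-up sample data (the paths and addresses below are illustrative, not from a real system):

```python
# After a glibc upgrade, long-running processes keep the old, now-deleted
# library mapped until they restart. /proc/<pid>/maps marks such mappings
# with a "(deleted)" suffix; this parses maps-format text to find them.

def stale_libs(maps_text):
    """Return the set of deleted shared-object paths still mapped."""
    stale = set()
    for line in maps_text.splitlines():
        if not (line.endswith("(deleted)") and ".so" in line):
            continue
        parts = line.split(maxsplit=5)  # addr perms offset dev inode path
        if len(parts) == 6:
            stale.add(parts[5].rsplit(" (deleted)", 1)[0])
    return stale

# Illustrative sample; a real scan would loop over /proc/[0-9]*/maps.
sample = """\
7f1c2e000000-7f1c2e1c0000 r-xp 00000000 08:01 131338 /lib/libc-2.3.6.so (deleted)
7f1c2e3c0000-7f1c2e3c4000 r--p 001c0000 08:01 131338 /lib/libc-2.3.6.so (deleted)
7f1c2e500000-7f1c2e520000 r-xp 00000000 08:01 200001 /usr/lib/libz.so.1.2.3
"""

print(stale_libs(sample))  # only the replaced libc shows up
```

Tools like debian-goodies' `checkrestart` automate the same idea.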
Re: server uptime
There is no question continuity of service is more important than "uptime" alone. I guess I'm just being a rube, admittedly so, because I'm impressed that a system could run for two years straight without failing, notwithstanding the "big picture" of service availability. I guess my only point is, I just think it's cool...

----- Original Message -----
From: "David J Berube" <[EMAIL PROTECTED]>
To: "GNHLUG mailing list"
Sent: Wednesday, March 19, 2008 4:46:24 PM (GMT-0500) America/New_York
Subject: Re: server uptime

Got to agree with Ben here. While it's bad if a server can't go 24 hours due to an OS-level problem, it's also inaccurate to say that a long uptime implies high service availability. This is doubly so if you are hosting software: not only does your service need to be available, but it needs to respond to changing business demands and other technical issues - including OS and application level security threats - and you need to be able to change it to respond quickly. If you cannot do that, then you have a technical failure resulting in what is effectively downtime for your service: if your users can't use your service in a way that works for them, then you have an outage.

There are other issues as well, of course: one of my clients recently had severe trouble with upstream providers of bandwidth; on a 100 Mbps connection, we were getting under 1 Mbps throughput. While this wasn't a hardware problem, and it wasn't a software problem, and it wasn't even a network problem at the host level, it nonetheless resulted in a substandard level of service, which was, in effect, an outage for affected users.

In short, uptimes of individual components are not especially relevant; if a machine can be occasionally brought down for repair or maintenance without resulting in an effective lack of availability for end users, then an extremely long uptime figure is meaningless - an extremely short uptime figure, of course, still has relevance.

If an individual component cannot at any time afford downtime, then the problem is not with the component: the problem is with your architecture. All components fail occasionally, and if it is truly important that that component never goes down, you need more redundancy - enough, again, to allow a brief period of maintenance for any given component.

Take it easy,

David Berube
Berube Consulting
[EMAIL PROTECTED]
(603)-485-9622
http://www.berubeconsulting.com/

Ben Scott wrote:
> On Wed, Mar 19, 2008 at 1:50 PM, Warren Luebkeman <[EMAIL PROTECTED]> wrote:
>> Our server, running Debian Sarge, which serves our email/web/backups/dns/etc
>> has been running 733 days (two years) without a reboot.
>
> You're obviously not installing all your security updates, then.
> Both the 2.4 and 2.6 Debian kernels have had security advisories
> posted within the past two years.
>
> In my experience, discussions about uptime typically involve
> approximately the same mentality as a penis-length competition.
> Especially since nobody really cares about what uptime(1) shows --
> it's service level availability that counts. Who cares if your kernel
> hasn't been restarted but the email service was down for a month, or
> slow, or if your company's data is being harvested by a cracker who
> used some unpatched security holes to break in.
>
> -- Ben

--
Warren Luebkeman
Founder, COO
Resara LLC
888.357.9195
www.resara.com
Re: server uptime
Got to agree with Ben here. While it's bad if a server can't go 24 hours due to an OS-level problem, it's also inaccurate to say that a long uptime implies high service availability. This is doubly so if you are hosting software: not only does your service need to be available, but it needs to respond to changing business demands and other technical issues - including OS and application level security threats - and you need to be able to change it to respond quickly. If you cannot do that, then you have a technical failure resulting in what is effectively downtime for your service: if your users can't use your service in a way that works for them, then you have an outage.

There are other issues as well, of course: one of my clients recently had severe trouble with upstream providers of bandwidth; on a 100 Mbps connection, we were getting under 1 Mbps throughput. While this wasn't a hardware problem, and it wasn't a software problem, and it wasn't even a network problem at the host level, it nonetheless resulted in a substandard level of service, which was, in effect, an outage for affected users.

In short, uptimes of individual components are not especially relevant; if a machine can be occasionally brought down for repair or maintenance without resulting in an effective lack of availability for end users, then an extremely long uptime figure is meaningless - an extremely short uptime figure, of course, still has relevance.

If an individual component cannot at any time afford downtime, then the problem is not with the component: the problem is with your architecture. All components fail occasionally, and if it is truly important that that component never goes down, you need more redundancy - enough, again, to allow a brief period of maintenance for any given component.

Take it easy,

David Berube
Berube Consulting
[EMAIL PROTECTED]
(603)-485-9622
http://www.berubeconsulting.com/

Ben Scott wrote:
> On Wed, Mar 19, 2008 at 1:50 PM, Warren Luebkeman <[EMAIL PROTECTED]> wrote:
>> Our server, running Debian Sarge, which serves our email/web/backups/dns/etc
>> has been running 733 days (two years) without a reboot.
>
> You're obviously not installing all your security updates, then.
> Both the 2.4 and 2.6 Debian kernels have had security advisories
> posted within the past two years.
>
> In my experience, discussions about uptime typically involve
> approximately the same mentality as a penis-length competition.
> Especially since nobody really cares about what uptime(1) shows --
> it's service level availability that counts. Who cares if your kernel
> hasn't been restarted but the email service was down for a month, or
> slow, or if your company's data is being harvested by a cracker who
> used some unpatched security holes to break in.
>
> -- Ben
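[Editor's note] The uptime-versus-availability distinction being drawn here is easy to put in numbers: an availability target implies a yearly downtime budget, independent of how any one box's uptime counter reads. A small sketch (the targets chosen are just illustrative):

```python
# Convert an availability percentage into the downtime budget it allows
# per year. A server with a 733-day uptime counter can still miss even a
# modest target if its *services* were down, slow, or compromised.

def downtime_per_year(availability_pct):
    """Minutes of allowed downtime in a 365-day year."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.99):
    print(f"{pct}% available -> {downtime_per_year(pct):.1f} min/yr of downtime")
```

Roughly: two nines allow about 3.7 days a year, three nines under 9 hours, four nines under an hour.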
Re: server uptime
Yes, Ben is trying to say that it's not the length of your uptime, but how you use it. No one is buying it, though.

Warren Luebkeman wrote:
> Sounds like someone is insecure about their uptime... ;-)
>
> I do understand your point, though.
Re: server uptime
Sounds like someone is insecure about their uptime... ;-)

I do understand your point, though.

----- Original Message -----
From: "Ben Scott" <[EMAIL PROTECTED]>
To: "Greater NH Linux User Group"
Sent: Wednesday, March 19, 2008 3:36:59 PM (GMT-0500) America/New_York
Subject: Re: server uptime

On Wed, Mar 19, 2008 at 1:50 PM, Warren Luebkeman <[EMAIL PROTECTED]> wrote:
> Our server, running Debian Sarge, which serves our email/web/backups/dns/etc
> has been running 733 days (two years) without a reboot.

You're obviously not installing all your security updates, then. Both the 2.4 and 2.6 Debian kernels have had security advisories posted within the past two years.

In my experience, discussions about uptime typically involve approximately the same mentality as a penis-length competition. Especially since nobody really cares about what uptime(1) shows -- it's service level availability that counts. Who cares if your kernel hasn't been restarted but the email service was down for a month, or slow, or if your company's data is being harvested by a cracker who used some unpatched security holes to break in.

-- Ben

--
Warren Luebkeman
Founder, COO
Resara LLC
888.357.9195
www.resara.com
Re: server uptime
On Mar 19, 2008, at 3:36 PM, Ben Scott wrote:
> On Wed, Mar 19, 2008 at 1:50 PM, Warren Luebkeman <[EMAIL PROTECTED]> wrote:
>> Our server, running Debian Sarge, which serves our email/web/backups/dns/etc
>> has been running 733 days (two years) without a reboot.
>
> You're obviously not installing all your security updates, then.
> Both the 2.4 and 2.6 Debian kernels have had security advisories
> posted within the past two years.
>
> In my experience, discussions about uptime typically involve
> approximately the same mentality as a penis-length competition.
> Especially since nobody really cares about what uptime(1) shows --
> it's service level availability that counts. Who cares if your kernel
> hasn't been restarted but the email service was down for a month, or
> slow, or if your company's data is being harvested by a cracker who
> used some unpatched security holes to break in.

+1

--
Jarod Wilson
[EMAIL PROTECTED]
Re: server uptime
On Wed, Mar 19, 2008 at 1:50 PM, Warren Luebkeman <[EMAIL PROTECTED]> wrote:
> Our server, running Debian Sarge, which serves our email/web/backups/dns/etc
> has been running 733 days (two years) without a reboot.

You're obviously not installing all your security updates, then. Both the 2.4 and 2.6 Debian kernels have had security advisories posted within the past two years.

In my experience, discussions about uptime typically involve approximately the same mentality as a penis-length competition. Especially since nobody really cares about what uptime(1) shows -- it's service level availability that counts. Who cares if your kernel hasn't been restarted but the email service was down for a month, or slow, or if your company's data is being harvested by a cracker who used some unpatched security holes to break in.

-- Ben
Re: server uptime
On Wed, 19 Mar 2008 14:32:54 -0400, Alex Hewitt <[EMAIL PROTECTED]> wrote:
> In my experience the stability of any system has to do with its usage.
> With servers running programs that are reasonably stable, uptime will
> certainly be many months and can stretch into years. Any system that,
> for example, is running unpredictable loads, such as one might find in a
> time-sharing university setting, is less likely to have long uptimes.
> The bane of server operations is applications with memory leaks. If
> these apps aren't restricted, they will consume all available memory and
> eventually cause the system to swap its brains out. User-space apps can
> usually be prevented from taking the system down, but a memory leak in a
> service can easily make the system crash or become unavailable.

We had an old ProLiant server running at the BLU with a 2+ year uptime. The system finally died and we cannibalized it.

One of the nicer things in Unix and Linux is that when you have an application with a memory leak, you can easily cycle it down, even if it is a daemon. In Red Hat parlance, a service restart will clean up many things, especially in web servers. Even drivers can be recycled without causing a reboot. I don't recall the circumstances, but I once recycled a working network driver. In that case I think I wrote a short script to rmmod the old driver and insmod the new one. The system remained up, and I was able to reconnect seconds later.

--
Jerry Feldman <[EMAIL PROTECTED]>
Boston Linux and Unix
PGP key id: 537C5846
PGP Key fingerprint: 3D1B 8377 A3C0 A5F2 ECBB CA3B 4607 4319 537C 5846
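[Editor's note] Alex's memory-leak problem and Jerry's cycle-the-daemon fix combine naturally into a watchdog pattern: poll a daemon's resident memory and restart just that service when it crosses a cap, long before the box starts swapping its brains out. A rough sketch of the decision logic, parsing Linux /proc/<pid>/status text (the sample values, daemon name, and 256 MB cap below are placeholders):

```python
# Watchdog sketch for a leaky daemon: read VmRSS from /proc/<pid>/status
# and flag a restart when it exceeds a configured cap. Restarting only
# the service preserves system availability; uptime(1) never resets.

def vm_rss_kb(status_text):
    """Extract resident set size (kB) from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])  # "VmRSS:  524288 kB"
    return 0  # kernel threads and zombies have no VmRSS line

def needs_restart(status_text, limit_kb):
    """True when the daemon has leaked past its memory cap."""
    return vm_rss_kb(status_text) > limit_kb

# Placeholder /proc/<pid>/status excerpt for an httpd that has leaked to 512 MB.
sample = "Name:\thttpd\nVmRSS:\t 524288 kB\nThreads:\t8\n"

print(needs_restart(sample, 262144))  # over a 256 MB cap
# a real watchdog would read /proc/<pid>/status in a loop and then
# invoke something like `service httpd restart` when this returns True
```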
re: server uptime
On Wed, 2008-03-19 at 13:50 -0400, Warren Luebkeman wrote:
> I am curious how common it is for people's servers to go extremely long
> periods of time without crashing/rebooting. Our server, running Debian Sarge,
> which serves our email/web/backups/dns/etc, has been running 733 days (two
> years) without a reboot. It's in a 4U IBM chassis with dual power supplies,
> which was old when we fired it up (PIII server).
>
> Does anyone have similar uptime on their mission-critical servers? What's the
> longest uptime someone has had with Windows?

DEC had a customer who owned an AlphaServer 2100 for 7 years. In that time the server was rebooted exactly once, due to a patch kit installation (it ran VMS).

In my experience the stability of any system has to do with its usage. With servers running programs that are reasonably stable, uptime will certainly be many months and can stretch into years. Any system that, for example, is running unpredictable loads, such as one might find in a time-sharing university setting, is less likely to have long uptimes. The bane of server operations is applications with memory leaks. If these apps aren't restricted, they will consume all available memory and eventually cause the system to swap its brains out. User-space apps can usually be prevented from taking the system down, but a memory leak in a service can easily make the system crash or become unavailable.

-Alex

P.S. Interesting stats to collect from a system that has a long uptime are the load averages for CPU, memory, and I/O.
re: server uptime
I am curious how common it is for people's servers to go extremely long periods of time without crashing/rebooting. Our server, running Debian Sarge, which serves our email/web/backups/dns/etc, has been running 733 days (two years) without a reboot. It's in a 4U IBM chassis with dual power supplies, which was old when we fired it up (PIII server).

Does anyone have similar uptime on their mission-critical servers? What's the longest uptime someone has had with Windows?