Re: [Users] Shutdown problems
To clarify further on versions:

The HN is 2.6.24 ovz009.1 on Fedora 9.

We must use 2.6.24 despite its "development" status because 2.6.18 lacks support for AMCC/3ware RAID controllers. Aside from this shutdown issue, we have used it for 14 months under high loads without problems, and we consider it stable and production-grade.

--
HostGIS, Open Source solutions for the global GIS community
Greg Allensworth - SysAdmin, Programmer, GIS Person, Security
Network+ Server+ A+ Security+
"No one cares if you can back up -- only if you can recover."

___
Users mailing list
Users@openvz.org
https://openvz.org/mailman/listinfo/users
Re: [Users] Shutdown problems
> have you searched for a bug report or checked the forums? Has this
> issue been reported?

I'll take the "if you don't have anything nice to say, don't say anything" clause for my previous experience with forums. Maybe the OpenVZ forum is unlike any other, but honestly the thought of posting to a forum didn't occur to me with any degree of seriousness. I did post to this list several months ago, and got no replies.

But I am surprised that the bug I filed last year is not visible. Boy, do I feel like a luser; I bet I closed the wrong tab or something and never finished filing it -- no wonder nobody replied. Per Thorsten's request I will indeed file the bug report; maybe the kernel folks can verify that it's been solved, or maybe something really weird is going on.

> Red Hat updates their 2.6.18 kernel [...]
> in the process they backport a lot of drivers
> [...] current OpenVZ RHEL5-based kernel to see if it
> is compatible with your hardware?

According to the kernel config file inside the RPM, RHEL5-2.6.18 now has the 3W-9000 driver. Good to know; that may indeed be a realistic option and an easy fix.

So, can you please clarify: is the RHEL5 kernel usable and "recommended" if I'm NOT running RHEL5?

> You said you are using Fedora 9 and that was EOLed a while ago.

Yeah, and it was the latest thing 14 months ago when we deployed this thing. I can't say I'm completely pleased with that fast a retirement cycle.

You say that CentOS is a fine base OS? I have worked with it on a few occasions, and find it similar enough to Fedora that I like it. Changing out base OSs isn't something I'd do lightly, but it is something I would consider as a longer-term plan if it improved long-term support or fixed this bug we're having. Beyond this one bug, things are going great.

> You mentioned you are using veth devices

Correct. In my initial setup of the pilot systems veth was the only thing that worked, so we went with it. Why do you ask, and what do you mean about it being a security issue? (I did ask about ARP and IP spoofing on the list some months back, and that one also got no replies.)

> What distro or distros are you running in your containers?

This seems to happen equally with both of our offerings currently deployed: Ubuntu 8.04 and HostGIS Linux 4.2 (for your purposes, think Slamd64 12).

--
HostGIS, Open Source solutions for the global GIS community
Greg Allensworth - SysAdmin, Programmer, GIS Person, Security
Network+ Server+ A+ Security+
"No one cares if you can back up -- only if you can recover."
Re: [Users] Shutdown problems
Hi,

can you please file a bug at bugzilla.openvz.org and add info about:
- what template OS is used
- any log entries in syslog (kern|dmesg).log
- which RAID controller

Bye,
Thorsten

HostGIS Support wrote:
> I emailed on the topic before, and have never found a solution -- nor
> indeed, more than one other corroboration of the problem's existence.
> But now, I have freed up a whole server with OpenVZ where we can
> experiment with it at will.
>
> The problem: Shutting down a VPS gives me a timeout after several
> minutes. Although all processes in the container are dead, the container
> itself will not finish shutting down. The veth device never goes down,
> the container cannot be restarted, the phantom VPS will hang around
> until I power-cycle the server. This interrupts shutdowns too: init 0
> and reboot never, ever work; they do nothing, they don't turn anything
> off; and I have to pull the plug.
>
> Worse, this happens reliably -- I don't dare shut down a VPS unless it's
> a migration, and I can manually complete the migration and startup, then
> power-cycle the origin HN.
>
> BUT... Now we have a machine and some IPs with OpenVZ, and my current
> project is to figure this thing out so we can reboot with confidence.
> Where do we start and who's with me? :)
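Most of the details requested for the bug report (kernel log entries, RAID controller, container list) can be collected in one pass on the HN. The following is only a rough sketch, not an official OpenVZ tool: it shells out to lspci, dmesg and vzlist -a if they are installed, and the keyword lists used to filter the output are guesses at what might be relevant, not known signatures of this bug. The template OS is not collected and would still have to be added by hand.

```python
import platform
import shutil
import subprocess


def run(cmd):
    """Return a command's stdout, or None if it is missing or fails to run."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return done.stdout


def filter_lines(lines, keywords):
    """Keep lines that mention any keyword, ignoring case."""
    wanted = [k.lower() for k in keywords]
    return [ln for ln in lines if any(k in ln.lower() for k in wanted)]


def section(title, cmd, keywords=None, tail=None):
    """Format one titled block of the report from a command's output."""
    out = run(cmd)
    if out is None:
        body = "(not available: " + " ".join(cmd) + ")"
    else:
        lines = out.splitlines()
        if keywords:
            lines = filter_lines(lines, keywords)
        if tail:
            lines = lines[-tail:]
        body = "\n".join(lines)
    return "== " + title + " ==\n" + body


def build_report():
    """Assemble a plain-text summary to paste into a bug report."""
    parts = [
        "== Kernel ==\n" + platform.release(),
        # Keyword "raid" is an assumption about how the controller is labelled.
        section("RAID controller", ["lspci"], keywords=["raid"]),
        # Keywords here are guesses; widen or drop the filter as needed.
        section("Kernel log", ["dmesg"], keywords=["veth", "unregister"], tail=50),
        # vzlist only exists on an OpenVZ hardware node.
        section("Containers", ["vzlist", "-a"]),
    ]
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_report())
```

Run as root on the HN right after a container stop has timed out, so that the kernel log still holds whatever was printed during the failed shutdown.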
Re: [Users] Shutdown problems
Greetings again,

"HostGIS Support" wrote:
> To clarify further on versions:
>
> The HN is 2.6.24 ovz009.1 on Fedora 9.
>
> We must use 2.6.24 despite its "development" status because 2.6.18
> lacks support for AMCC/3ware RAID controllers. Aside from this shutdown
> issue, we have used it for 14 months now under high loads without issue.
> Aside from this shutdown issue, we consider it stable and production-grade.

One other thing... it would be a good experiment to set up a CentOS 5 box with the OpenVZ RHEL5-based kernel, migrate one of your containers to it, and see if it will shut down properly there.

Another good test would be to set up a machine using the same kernel you are currently using but without the RAID hardware, and see if you still have the same problem with container shutdown. If the problem goes away, that would indicate that it is related to the RAID... although I'm guessing it is unrelated to the RAID, but who knows.

TYL,
--
Scott Dowdle
704 Church Street
Belgrade, MT 59714
(406)388-0827 [home]
(406)994-3931 [work]
Re: [Users] Shutdown problems
Greetings,

"HostGIS Support" wrote:
> To clarify further on versions:
>
> The HN is 2.6.24 ovz009.1 on Fedora 9.
>
> We must use 2.6.24 despite its "development" status because 2.6.18
> lacks support for AMCC/3ware RAID controllers. Aside from this shutdown
> issue, we have used it for 14 months now under high loads without issue.
> Aside from this shutdown issue, we consider it stable and production-grade.

As you may know, Red Hat updates their 2.6.18 kernel about every six months with their update releases (for example, going from RHEL 5.2 -> 5.3) and in the process they backport a lot of drivers... although granted, the OpenVZ Project lags behind Red Hat releases. My point is though, have you tried the current OpenVZ RHEL5-based kernel to see if it is compatible with your hardware?

I'm a Fedora fan, in fact I'm wearing a Fedora tee-shirt as I type this... and I love it on the desktop (used on the laptop I'm typing this from)... but it isn't a server OS unless you want to upgrade at least once a year. As you probably know, Fedora has a rapid 6 month release cycle and a very limited support cycle (2 months after a new release comes out, 2 releases back is EOLed). You said you are using Fedora 9 and that was EOLed a while ago.

Now to actually address your issue... have you searched for a bug report or checked the forums? Has this issue been reported? I could have looked myself but I'd rather encourage you to get familiar with the OpenVZ bugzilla system (http://bugzilla.openvz.org/).

You mentioned you are using veth devices... which I assume you had a need to use... since they are more of a security risk than venet devices and have slightly more overhead. So, what are you running in the containers that requires veth?

I'm just wondering if we can find what processes, if any, are leading to your problem... although you did mention that all of the processes in the container are stopped. What distro or distros are you running in your containers?

I believe the 2.6.24 kernel stays on the devel list (rather than being moved to retired status) because it is used in Ubuntu 8.04x LTS. 2.6.26 is available as a devel because it is used by Debian 5. 2.6.27 is there because the mainline kernel developers have stated they plan to maintain 2.6.27 for at least two more years. I don't know how much commitment from OpenVZ/Parallels exists for these devel kernels... although 2.6.27 does look like the most natural target to make it to "stable". I also assume that when RHEL6 comes out (whenever that might be), whatever kernel it is based on will also be a target for an OpenVZ stable kernel branch.

I think your best bet is to join an existing bug report (if one exists) or start a new one... and work with the kernel developers to gather information about the problem so they can solve it... assuming that the latest RHEL5-based kernel still doesn't support your RAID hardware. Of course, if the finding-a-fix process doesn't work out well for you within a reasonable amount of time, perhaps switching RAID cards would be an option... to something that is well supported in the RHEL5-based kernel.

TYL,
--
Scott Dowdle
704 Church Street
Belgrade, MT 59714
(406)388-0827 [home]
(406)994-3931 [work]
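As a quick check of the Fedora lifecycle figures quoted in the message above (a release every 6 months, with a release EOLed 2 months after the release two versions later comes out), the arithmetic can be written out as below. The function and parameter names are purely illustrative, and the numbers are the ones given in the message, not an official policy statement.

```python
def support_window_months(release_interval=6, releases_back=2, grace=2):
    """Months of updates a release gets under the figures quoted above.

    release_interval: months between releases
    releases_back:    how many newer releases must ship before EOL
    grace:            months after that newer release until EOL
    """
    return release_interval * releases_back + grace


print(support_window_months())
```

With those figures the window works out to 6 * 2 + 2 = 14 months, which lines up with a Fedora 9 host deployed 14 months ago now being past EOL.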
[Users] Shutdown problems
I emailed on the topic before, and have never found a solution -- nor indeed, more than one other corroboration of the problem's existence. But now, I have freed up a whole server with OpenVZ where we can experiment with it at will.

The problem: Shutting down a VPS gives me a timeout after several minutes. Although all processes in the container are dead, the container itself will not finish shutting down. The veth device never goes down, the container cannot be restarted, and the phantom VPS will hang around until I power-cycle the server. This interrupts shutdowns too: init 0 and reboot never, ever work; they do nothing, they don't turn anything off; and I have to pull the plug.

Worse, this happens reliably -- I don't dare shut down a VPS unless it's a migration, and I can manually complete the migration and startup, then power-cycle the origin HN.

BUT... Now we have a machine and some IPs with OpenVZ, and my current project is to figure this thing out so we can reboot with confidence. Where do we start and who's with me? :)

--
HostGIS, Open Source solutions for the global GIS community
Greg Allensworth - SysAdmin, Programmer, GIS Person, Security
Network+ Server+ A+ Security+
"No one cares if you can back up -- only if you can recover."
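One symptom described above, the veth device that never goes down, can be checked from the HN by listing the network interfaces the kernel still exposes. The sketch below is an illustration only: it assumes the host-side devices follow a "veth<CTID>.<N>" naming pattern (custom interface names would need a different match), it reads /sys/class/net, and the container IDs passed in are hypothetical examples.

```python
import os


def host_interfaces(sysfs_net="/sys/class/net"):
    """Return the names of network interfaces the kernel currently exposes."""
    try:
        return sorted(os.listdir(sysfs_net))
    except OSError:
        return []


def lingering_veth(interfaces, stopped_ctids):
    """Return veth devices that still exist for containers meant to be stopped.

    Assumes host-side names look like "veth<CTID>.<N>"; this is an
    assumption about the naming scheme, not something taken from the thread.
    """
    leftovers = []
    for name in interfaces:
        if not name.startswith("veth"):
            continue
        ctid_part = name[len("veth"):].split(".", 1)[0]
        if ctid_part.isdigit() and int(ctid_part) in stopped_ctids:
            leftovers.append(name)
    return leftovers


if __name__ == "__main__":
    # 101 is a hypothetical container ID that was just stopped.
    print(lingering_veth(host_interfaces(), {101}))
```

An empty list after a stop would suggest the device was torn down; a non-empty one matches the stuck state described in the message and is worth noting in the bug report.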