Re: [Ipmitool-devel] Is the BMC robust to recover from system hangs? ipmitool unresponsive
On Mon, Aug 30, 2010 at 7:52 AM, Jarrod B Johnson jbjoh...@us.ibm.com wrote:

> Your BMC simply isn't responding to any traffic. BMCs are supposed to be completely resilient to OS failures when done properly (not much apart from things like power failures in non-redundant systems should be capable of knocking out a quality IPMI implementation). You need to look to your system vendor's support for an explanation and/or resolution, since implementations vary greatly from one vendor to the next. Sometimes a vendor is not competent to make it work, sometimes a vendor is too cheap to make it easy, and sometimes a vendor simply hasn't covered your particular NIC driver/OS combination and the NIC vendor flubbed some register handling or some such to make the NIC shoot itself when the kernel panics.

Thanks for the tips, Jarrod! I will look into the nodes. These are Dell R410 servers with the on-board Broadcom NIC. The first thing for this Monday morning is for me to trudge down to the dark depths of the cluster room, log in manually, and see what exactly happened to these nodes. I'll post on the list if I find anything interesting.

-- Rahul
___
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
Re: [Ipmitool-devel] Is the BMC robust to recover from system hangs? ipmitool unresponsive
On Mon, Aug 30, 2010 at 7:54 AM, Andy Cress andy.cr...@us.kontron.com wrote:

Thanks very much, Andy, for taking the time for such a detailed response! That sure helps!

> Yes, that is a key function that all IPMI BMCs are supposed to provide. The BMC is generally not affected by what the OS does, unless there are IPMI-aware applications running in the OS, specifically talking to the BMC.

Nothing that I am aware of other than ipmitool. None of the vendor-specific GUIs etc.

> 1) IPMI LAN configuration. Make sure that the IPMI LAN was properly configured. It sounds like you may have tested this beforehand. Even something like the ARP configuration could cause the port to no longer be visible to the router.

Yes, I had tested extensively prior to the failure. This is an HPC cluster with about 300 identical servers, and the other servers in the group are still responding perfectly. The BMCs are all on their own dedicated IP subnet, although the IPMI physical network is the same as the normal 1 GigE eth network, i.e. IPMI traffic is piggybacking on the same eth adapter port.

> 3) Some OS-resident (custom?) IPMI-aware application that may be causing trouble/stress/configuration problems with the BMC.

Nothing that I can imagine. I'm using CentOS and fairly standard Linux tools.

> On a healthy system, the 'ps -ef' output on the target should show any ipmi-related processes that are running.

I don't see any suspicious processes (on a sister node that hasn't crashed). But here's a ps -ef in case anything out of place is evident to you.

[r...@eu001 ~]# ps -ef
UID     PID  PPID  C STIME TTY     TIME CMD
root      1     0  0 Aug29 ?   00:00:01 init [3]
root      2     1  0 Aug29 ?   00:00:00 [migration/0]
root      3     1  0 Aug29 ?   00:00:00 [ksoftirqd/0]
root      4     1  0 Aug29 ?   00:00:00 [watchdog/0]
root      5     1  0 Aug29 ?   00:00:00 [migration/1]
root      6     1  0 Aug29 ?   00:00:00 [ksoftirqd/1]
root      7     1  0 Aug29 ?   00:00:00 [watchdog/1]
root      8     1  0 Aug29 ?   00:00:00 [migration/2]
root      9     1  0 Aug29 ?   00:00:00 [ksoftirqd/2]
root     10     1  0 Aug29 ?   00:00:00 [watchdog/2]
root     11     1  0 Aug29 ?   00:00:00 [migration/3]
root     12     1  0 Aug29 ?   00:00:00 [ksoftirqd/3]
root     13     1  0 Aug29 ?   00:00:00 [watchdog/3]
root     14     1  0 Aug29 ?   00:00:00 [migration/4]
root     15     1  0 Aug29 ?   00:00:00 [ksoftirqd/4]
root     16     1  0 Aug29 ?   00:00:00 [watchdog/4]
root     17     1  0 Aug29 ?   00:00:00 [migration/5]
root     18     1  0 Aug29 ?   00:00:00 [ksoftirqd/5]
root     19     1  0 Aug29 ?   00:00:00 [watchdog/5]
root     20     1  0 Aug29 ?   00:00:00 [migration/6]
root     21     1  0 Aug29 ?   00:00:00 [ksoftirqd/6]
root     22     1  0 Aug29 ?   00:00:00 [watchdog/6]
root     23     1  0 Aug29 ?   00:00:00 [migration/7]
root     24     1  0 Aug29 ?   00:00:00 [ksoftirqd/7]
root     25     1  0 Aug29 ?   00:00:00 [watchdog/7]
root     26     1  0 Aug29 ?   00:00:00 [events/0]
root     27     1  0 Aug29 ?   00:00:00 [events/1]
root     28     1  0 Aug29 ?   00:00:00 [events/2]
root     29     1  0 Aug29 ?   00:00:00 [events/3]
root     30     1  0 Aug29 ?   00:00:00 [events/4]
root     31     1  0 Aug29 ?   00:00:00 [events/5]
root     32     1  0 Aug29 ?   00:00:00 [events/6]
root     33     1  0 Aug29 ?   00:00:00 [events/7]
root     34     1  0 Aug29 ?   00:00:00 [khelper]
root    169     1  0 Aug29 ?   00:00:00 [kthread]
root    181   169  0 Aug29 ?   00:00:00 [kblockd/0]
root    182   169  0 Aug29 ?   00:00:00 [kblockd/1]
root    183   169  0 Aug29 ?   00:00:00 [kblockd/2]
root    184   169  0 Aug29 ?   00:00:00 [kblockd/3]
root    185   169  0 Aug29 ?   00:00:00 [kblockd/4]
root    186   169  0 Aug29 ?   00:00:00 [kblockd/5]
root    187   169  0 Aug29 ?   00:00:00 [kblockd/6]
root    188   169  0 Aug29 ?   00:00:00 [kblockd/7]
root    189   169  0 Aug29 ?   00:00:00 [kacpid]
root    302   169  0 Aug29 ?   00:00:00 [cqueue/0]
root    303   169  0 Aug29 ?   00:00:00 [cqueue/1]
root    304   169  0 Aug29 ?   00:00:00 [cqueue/2]
root    305   169  0 Aug29 ?   00:00:00 [cqueue/3]
root    306   169  0 Aug29 ?   00:00:00 [cqueue/4]
root    307   169  0 Aug29 ?   00:00:00 [cqueue/5]
root    308   169  0 Aug29 ?   00:00:00 [cqueue/6]
root    309   169  0 Aug29 ?   00:00:00 [cqueue/7]
root    312   169  0 Aug29 ?   00:00:00 [khubd]
root    314   169  0 Aug29 ?   00:00:00 [kseriod]
root    437   169  0 Aug29 ?   00:00:00 [pdflush]
root    438   169  0 Aug29 ?   00:00:30 [pdflush]
root    439
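
Scanning a long 'ps -ef' listing by eye for IPMI-related processes is error-prone, so here is a minimal sketch of filtering it instead. It assumes that any process of interest has "ipmi" somewhere in its command column; the two matching rows in the sample (kipmi0 and ipmievd) are made up for illustration and do not come from the listing above.

```python
from typing import List


def ipmi_related(ps_output: str) -> List[str]:
    """Return the 'ps -ef' rows whose text mentions ipmi (case-insensitive).

    The first line is assumed to be the column header and is skipped.
    """
    rows = ps_output.splitlines()[1:]
    return [row for row in rows if "ipmi" in row.lower()]


# Hypothetical sample: only the first row resembles the listing above.
SAMPLE = """UID     PID  PPID  C STIME TTY     TIME CMD
root      1     0  0 Aug29 ?   00:00:01 init [3]
root    901   169  0 Aug29 ?   00:00:00 [kipmi0]
root   4267     1  0 Aug29 ?   00:00:02 /usr/sbin/ipmievd sel
"""

if __name__ == "__main__":
    for row in ipmi_related(SAMPLE):
        print(row)
```

On a live node the same function could be fed the real output of 'ps -ef'; an empty result would suggest that nothing in user space or the kernel is talking to the BMC through an IPMI-named process.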
Re: [Ipmitool-devel] Is the BMC robust to recover from system hangs? ipmitool unresponsive
On Mon, Aug 30, 2010 at 10:23 AM, Jarrod B Johnson jbjoh...@us.ibm.com wrote:

> Don't know much about Dell specifically, however I'll offer some guidance.

Very much appreciate your helping out! I haven't any

> If the Broadcom part has the tg3 driver, you may be out of luck depending on the failure state. For example, BCM5704 chips fundamentally cannot provide BMC access while executing PXE. On the other hand, bnx2 managed chips tend to fare better, there generally is at least one way to make it work correctly, though drivers and nic firmware matter *greatly* still. Not as resilient as I would like, but with precautions in how you manage firmware and drivers, it's workable.

Luckily no tg3 for this server. It does have the bnx2 driver. Do you have any specific comments about which driver / firmware version combinations work and which don't?

> You'll want to check your tg3/bnx2/whatever driver version and NIC firmware version, depending on your investigation.

I have version 1.9.3 of bnx2. Not sure how to get the NIC firmware version on a running system.

> Shared nics can work great, but some implementations can be picky about what drivers and firmware are in place.

This being an HPC cluster, shared NICs were the more feasible option. The cost of a dedicated out-of-band network and switches was deemed too expensive and messy. No one server is critical as such, but the utility of the BMC+IPMI is the ability to debug crashes without having to walk to the server room each time. Or so I thought! :)

> Also, newer is not always better, sometimes a developer without caring about the IPMI access provided by some nics will unwittingly break it somehow in the driver, and it won't get fixed until some server vendor or other industrious administrator stumbles across it.

Absolutely. Agreed. Unfortunately there's no easy way of knowing what works and what doesn't other than posting on a list like this and hoping someone else has been burnt before! :) Thanks again!

-- Rahul
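
On the open question of reading the NIC firmware version from a running system: on Linux, 'ethtool -i <interface>' normally reports the driver name, driver version and firmware version in "key: value" form. Below is a minimal sketch that parses that output. The interface name eth0 and the firmware and bus-info values in the sample are assumptions for illustration; only the bnx2 driver name and version 1.9.3 come from the message above.

```python
import subprocess
from typing import Dict


def parse_ethtool_info(text: str) -> Dict[str, str]:
    """Parse 'key: value' lines as printed by 'ethtool -i' into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_ethtool_info(iface: str = "eth0") -> Dict[str, str]:
    """Run 'ethtool -i' on a live system (needs ethtool installed)."""
    result = subprocess.run(["ethtool", "-i", iface],
                            capture_output=True, text=True, check=True)
    return parse_ethtool_info(result.stdout)


# Hypothetical sample output; the firmware string is invented.
SAMPLE = """driver: bnx2
version: 1.9.3
firmware-version: 0.0.0 (placeholder)
bus-info: 0000:01:00.0
"""

if __name__ == "__main__":
    info = parse_ethtool_info(SAMPLE)
    print(info["driver"], info["version"], info["firmware-version"])
```

Collecting this dictionary from every node would make it easy to spot machines whose driver and firmware combination differs from the rest of the cluster.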
Re: IPMI
On Wed, Aug 4, 2010 at 2:33 PM, Alexander Dupuy alex.du...@mac.com wrote:

> My experience with Dell server systems that have BMC or iDRAC cards standard (9th/10th/11th gen, at least) is that lm_sensors doesn't have any usable sensors to monitor, as Dell have wired them all up to the BMC instead (AMD CPU temperature sensors builtin to the CPU itself perhaps being one exception - but all my Dell servers are Intel).

I'm not sure about the hardware details, but is it difficult to allow both lm_sensors and the BMC to access the sensors? Why does Dell block lm_sensors access? Is there a reason? Maybe someone at Dell can elaborate?

-- Rahul
Recommendations for JBOD to add storage to my R410
I've used MD1000s in the past, but that involves buying a PERCe and doing hardware RAID. This time I am leaning towards a software RAID config using mdadm, and just need a minimalistic JBOD to put SATA drives into.

The R410 will only take 4 internal drives (and I am already using one drive for the OS), whereas I need 8 drives. So I do need some external storage. I was planning on using eight 1-terabyte SATA drives.

Any recommendations for a suitable JBOD model? I couldn't find anything suitable on the website. This is going to be dedicated storage for this particular server, so SAN, iSCSI, Fibre Channel etc. are overkill. Performance is not very important since this will be a backup box, so my 100 Mbps network link is probably the bottleneck.

-- Rahul
Re: IPMI
On Wed, Aug 4, 2010 at 8:20 AM, James Bensley jwbens...@gmail.com wrote:

> IPMI is great, but if every server has it it's useless unless you can monitor all your servers from a central place so how are people doing this?

What if you monitor in-band using something like pings, heartbeat, ganglia etc.? Then use IPMI (via ipmitool) only when stuff goes wrong and a machine is hung or crashed.

-- Rahul
Re: IPMI
On Wed, Aug 4, 2010 at 12:07 PM, James Bensley jwbens...@gmail.com wrote:
> On 4 August 2010 17:52, Rahul Nabar rpna...@gmail.com wrote:
>> What if you monitor in-band using something like pings, heartbeat, ganglia etc.? Then use ipmi (via ipmitool) only when stuff goes wrong and a machine is hung or crashed.
>
> Not a bad idea but with IPMI waiting till something goes POP is silly, I could have already been using it to see the temperature on my CPUs rising, or the 2nd power supply flapping etc etc...

I monitor temperatures via lm_sensors. Again, in-band. I try to keep my monitoring in-band unless there is a compelling reason to use IPMI. Maybe some sensors are not available to lm_sensors.

Of course, there has to be some aggregation tool whenever you have many servers. But that's a different issue from whether to use in-band (lm_sensors) or out-of-band (IPMI) monitoring. Personally, both nagios and ganglia have worked well for me for the aggregation and display.

-- Rahul
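
The "in-band first, IPMI only when the host stops answering" approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a tested monitoring setup: the BMC address, the user name "root" and the choice of the lanplus interface are placeholders, and the password is expected in the IPMI_PASSWORD environment variable (which is what ipmitool's -E option reads).

```python
import subprocess
from typing import List, Optional


def host_responds(host: str) -> bool:
    """In-band check: one ICMP ping with a 2 second timeout (Linux ping flags)."""
    result = subprocess.run(["ping", "-c", "1", "-W", "2", host],
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def ipmi_status_cmd(bmc_host: str, user: str = "root") -> List[str]:
    """Build (but do not run) an out-of-band power status query for a BMC."""
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E",
            "chassis", "power", "status"]


def triage(host_alive: bool, bmc_host: str) -> Optional[List[str]]:
    """Return None while the host answers in-band, else the IPMI command to try."""
    if host_alive:
        return None
    return ipmi_status_cmd(bmc_host)


if __name__ == "__main__":
    # Hypothetical BMC address; pretend the in-band ping just failed.
    print(triage(False, "10.0.4.3"))
```

In a real deployment, triage() would be driven by host_responds() or by whatever in-band monitor (nagios, ganglia) is already in place, and the returned command would be handed to subprocess.run().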
Re: IPMI
On Wed, Aug 4, 2010 at 12:19 PM, Rahul Nabar rpna...@gmail.com wrote:

> What if you monitor in-band using something like pings, heartbeat, ganglia etc.? Then use ipmi (via ipmitool) only when stuff goes wrong and a machine is hung or crashed.

I assumed you were using Linux / *nix. Are you? Not sure how good these tools are in a Windows environment.

-- Rahul
Re: OMSA does not work on PowerEdge 1435SC
On Mon, Jul 26, 2010 at 8:27 AM, J. Epperson d...@epperson.homelinux.net wrote:

> Correct. This has been discussed more than once on the list, check the archives. It's unfortunate that there's Dell doc that says the 1435SC is supported for systems management, implying that OMSA should work.

I bought a whole cluster of SC1435s 2 years ago under this same false impression. The SC1435 seems to be a black hole in the hierarchy. I often tried out a feature that didn't work, only to be eventually told by support that the SC1435 didn't do it, e.g. the large MTUs required for jumbo frames on the eth cards. I never did get that to work on the SC1435 either.

-- Rahul
Re: Serial over LAN
On Thu, May 20, 2010 at 1:01 PM, Jefferson Ogata powere...@antibozo.net wrote:

> 2. In BMC setup (control-E during POST), enable serial over LAN, set IP and password. ** First thing you should do in BMC setup is reset to default. The BMCs often ship with a weird non-default setting that will cause lots of serial port feedback if you try to run a getty on the serial console.

Is it possible to do this step in an automated fashion, i.e. without the manual Ctrl+E? Would ipmitool, syscfg or another equivalent tool have a way of setting the BMC SOL? I already set the BMC's IP and password using ipmitool. I have ~300 machines, so doing this step manually via Ctrl+E is tedious.

-- Rahul
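
For what it's worth, ipmitool does have a 'sol set' subcommand, so one possible way to script this is sketched below. It only builds the command lines. Whether these settings cover everything the Ctrl+E screen does on a Dell BMC is an assumption that has not been verified here, and the channel number 1 and the 115.2 kbps bit rate are placeholders.

```python
from typing import List


def sol_commands(channel: int = 1, bit_rate: str = "115.2") -> List[List[str]]:
    """Build ipmitool invocations that enable SOL and set its bit rate.

    The commands are meant to be run locally on each node (in-band),
    e.g. from a cluster-wide parallel shell.
    """
    ch = str(channel)
    return [
        ["ipmitool", "sol", "set", "enabled", "true", ch],
        ["ipmitool", "sol", "set", "non-volatile-bit-rate", bit_rate, ch],
        ["ipmitool", "sol", "set", "volatile-bit-rate", bit_rate, ch],
    ]


if __name__ == "__main__":
    for cmd in sol_commands():
        print(" ".join(cmd))
```

Running 'ipmitool sol info 1' on one node before and after a manual Ctrl+E change would be a reasonable way to find out which parameters actually need to be set.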
Re: syscfg bug? Does not allow me to set console redirection
On Mon, Jul 12, 2010 at 7:13 AM, vibha_g...@dell.com wrote:

> Rahul, Syscfg --conred is applicable up to 8th generation of Power Edge servers. On R300 and R410 servers, use syscfg --serialcomm along with --extserial option.

Thanks! That works. Is the fact that this option was removed documented somewhere? I don't see it in the Dell manuals for syscfg.

-- Rahul
Re: various FAN on PowerEdge R710
On Sun, Jul 11, 2010 at 5:51 PM, Zhichao Li lzcmich...@gmail.com wrote:

> Hello all: I am using ipmitool to get some useful information about the system. The outputs for FAN really confuse me as follow:

These servers have more than one fan for better cooling. The values are the RPM of each fan.

-- Rahul
syscfg bug? Does not allow me to set console redirection options from command line
In order to set the correct BIOS settings for Serial-over-LAN I was using the Dell syscfg tool (from the Deployment Toolkit). I was hoping to then automate this process on all 300 R410 servers that we have. I tried:

/opt/dell/toolkit/bin/syscfg --conred
The option 'conred' is not available or cannot be configured through software.

I am sure my system has this option, since if I physically log in to the BIOS using a console I can set the redirection. Also, the manual (*) for syscfg lists this option, so clearly it has to be software addressable.

Is this a bug in syscfg? Can someone at Dell verify please? I tried the excellent Dell helpdesk, but the guys there had never heard of syscfg nor Serial-over-LAN, so it is a losing battle.

I am using this version of syscfg:

syscfg Version 3.1. ser01 (Linux - Sep 1 2009, 00:30:05)
Copyright (c) 2002-2009 Dell Inc.

-- Rahul

* http://support.dell.com/support/edocs/software/dtk/2.4/CLI/pdf/dtk24cli.pdf
R410: Is IPMI-BMC always connected to NIC1?
Is the IPMI / BMC IP always connected to the Gig1 port? Normally I use just a single physical ethernet connection but assign a different management IP to the same port so that hung machines etc. can be remotely rebooted.

It works mostly OK, but I have a few machines where I am using the NIC2 interface, and there it doesn't seem to work. Is the IPMI somehow tied to the Gig1 interface? Any way to change it so that Gig2 also works?

[I'm using Gig2 on some servers because a few of them came with the tab on the Gig1 port damaged, so that the connection remains loose. At that time I didn't think that this was a big deal, but this might shut down my IPMI options.]

-- Rahul
set BMC to be on a different subnet on a shared ethernet card (R410)
On my R410s I have ethernet adapters that have twin MAC addresses: one for regular use and the other for a Baseboard Management Controller (IPMI) that can be used to reboot hung machines etc. All my normal eth cards are assigned to a 10.0.x.x network by DHCP. So far so good.

Now if I try to set the second IP address to something in the 10.0.x.x range then things work. Say 10.0.5.3. I can ping the same physical server (and the same physical card) on twin IP addresses from any remote machine. arp from the remote pinging machine shows:

Address       HWaddress           Flags Mask  Iface
10.0.0.3      00:26:B9:58:E6:46   C           eth0
10.0.4.3      00:26:B9:58:E6:48   C           eth0

10.0.0.3 is the normal ethernet IP assigned via DHCP and 10.0.4.3 is for the BMC.

But let's say I wanted the BMC to respond on the 172.16.x.x subnet. I can set this address fine. Say, 172.16.0.3. But if I try to ping 172.16.0.3 then there is no response. Surprisingly, arp still shows both addresses but the remote ping does not work.

Address       HWaddress           Flags Mask  Iface
10.0.0.3      00:26:B9:58:E6:46   C           eth0
172.16.0.3    00:26:B9:58:E6:48   C           eth0

Do I have to do something else to make this work? How do I have two subnets on the same adapter?

Commands I used:

ipmitool lan set 1 ipsrc static
ipmitool lan set 1 ipaddr 172.16.0.3
ipmitool lan set 1 netmask 255.255.0.0

-- Rahul
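
One thing worth checking in a setup like this is whether the pinging machine and the BMC address end up in the same IP network, since traffic between different networks needs a route or gateway on both ends. The sketch below does that arithmetic with Python's standard ipaddress module. The 255.255.0.0 netmask for the 10.0.x.x side is an assumption (only the BMC netmask is given above), and this is a diagnostic aid rather than a confirmed explanation of the failed ping.

```python
import ipaddress


def same_network(local_ip: str, netmask: str, other_ip: str) -> bool:
    """True if other_ip falls inside the network defined by local_ip/netmask."""
    network = ipaddress.ip_network(f"{local_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(other_ip) in network


if __name__ == "__main__":
    # Host NIC 10.0.0.3 versus a BMC on the same 10.0.x.x range.
    print(same_network("10.0.0.3", "255.255.0.0", "10.0.4.3"))    # True
    # Host NIC 10.0.0.3 versus the BMC moved to 172.16.0.3.
    print(same_network("10.0.0.3", "255.255.0.0", "172.16.0.3"))  # False
```

When the result is False, the remote machine needs an address or route on the 172.16.x.x network, and the BMC may need a default gateway of its own, which ipmitool can set with 'ipmitool lan set 1 defgw ipaddr <gateway>'.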
Re: Dell Power Edge R710 with CentOS
On Tue, May 18, 2010 at 6:46 AM, Stephan van Hienen stephan.van.hie...@thevalley.nl wrote:

> Just install centos 5.5 and add 'options bnx2 disable_msi=1' to /etc/modprobe.conf after the first boot.

If I do modinfo bnx2 I already see:

parm: disable_msi:Disable Message Signaled Interrupt (MSI) (int)

Do I still have to add the line to modprobe.conf? I tried adding the line and then modprobing bnx2, but the modinfo output doesn't seem to change. Is there a way of knowing whether or not the fix was successfully applied?

-- Rahul
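
A note on the question above: modinfo only lists the parameters a module accepts, not the values currently in effect, so its output would not be expected to change. A modprobe.conf option also only applies when the module is next loaded. One way to see what the NIC is actually doing is to look at the interrupt type listed for it in /proc/interrupts. The sketch below parses that file's text; the sample lines, IRQ numbers and counters are invented for illustration, and the exact type strings vary between kernel versions.

```python
from typing import Optional


def msi_in_use(interrupts_text: str, iface: str) -> Optional[bool]:
    """Report whether iface's interrupt line mentions MSI.

    Returns None if no line for the interface is found. Lines named
    '<iface>-<n>' (per-queue vectors) are treated as belonging to iface.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[-1]
        if name == iface or name.startswith(iface + "-"):
            return "MSI" in line
    return None


# Hypothetical excerpt of /proc/interrupts on a two-CPU machine.
SAMPLE = """           CPU0       CPU1
 90:     123456          0   PCI-MSI        eth0
 98:       1234          0   IO-APIC-level  eth1
"""

if __name__ == "__main__":
    for nic in ("eth0", "eth1"):
        print(nic, msi_in_use(SAMPLE, nic))
```

On a live system the text would come from open("/proc/interrupts").read(); after a reload of bnx2 with disable_msi=1, the interface would be expected to show a non-MSI interrupt type.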
Re: Dell Power Edge R710 with CentOS
On Fri, May 21, 2010 at 7:46 PM, Matt Domsch matt_dom...@dell.com wrote:

>> (eg. in solaris) Dell has been aware of the issue for months without real fix, just workarounds.
>
> I'm not sure what latest means, but we did manage to find the root cause of the failure where the MSI bit would get stuck - which also explains why disabling MSI-X worked around it. The right solution is to use code already in the driver to manage the timeout on that bit automatically, which is what we are testing with 5.5+ and expect in newer RHEL kernels ASAP.

I'm really glad I follow this mailing list and hence came to know of this problem. If Dell has been aware of this issue, isn't there some way to notify users? I have ~300 R410 systems here and have not heard a word about this. Dell, how do you expect users to find out!?

-- Rahul
Re: Dell Power Edge R710 with CentOS
On Tue, May 18, 2010 at 6:23 AM, Tom Boland t...@t0mb.net wrote:

> I haven't upgraded any of our boxes from 5.4 yet, but I have to say I've had two big problems with the stock bnx2 NIC drivers from 5.4 (for both R610s and R710s). The first problem was that the NICs would just die randomly when under heavy load, which required a network restart. This in particular can be resolved by adding options bnx2 disable_msi=1 to /etc/modprobe.conf.

I have R410s running CentOS 5.4 too. Do you know if they could be having the same issue and need the same fix? I am seeing some suspicious network problems too.

-- Rahul
mixing SAS and SATA drives in a Dell SC1435
I had a PowerEdge SC1435 with a small SAS drive connected to a PERC-i card. I wanted more storage and have bought a large SATA drive. I removed the SAS drive and connected the SATA drive to the PERC-i controller. The controller recognizes it, but the disk freezes intermittently. I can't figure out what the problem is.

On the other hand, the motherboard seems to have two SATA risers on it. Would this be a better option to connect to?

Also, I read that Dell does not recommend mixing SAS and SATA drives in the SC1435. Does that apply only to drives connected to the PERC-i? Or can I have a SAS drive connected to the PERC-i and a SATA drive connected directly to the motherboard riser?

-- Rahul
Re: Third-party drives not permitted on Gen 11 servers
On Tue, Feb 16, 2010 at 3:35 AM, Tino Schwarze linux-poweredge.li...@tisc.de wrote:

> Hi there,
>
> On Thu, Feb 11, 2010 at 11:52:04AM +0100, Tino Schwarze wrote:
>> I mailed my sales rep yesterday explaining my concerns and got a reply today that he'll ask the marketing department for an official statement.
>
> I got an answer today (German, English translation below):

If only these were home-grade devices and not enterprise. We'd have to wait on the order of 2 days for some smart kid to come along and tweak the few lines in the firmware that do the check. :)

-- Rahul
Re: Third-party drives not permitted on Gen 11 servers
On Thu, Feb 11, 2010 at 3:50 PM, Andy Krantz an...@digitalcyclone.com wrote:

> I agree with what everyone else is saying on this subject. I contacted my Dell account manager and they suggested that I post on http://www.ideastorm.com/ I didn't find an existing thread so I started one: http://dellideas.force.com/ideaView?id=0877dwTAAQ

+1 for contacting my Sales Rep. I've also posted on the Beowulf list, where a lot of us use Dell hardware.

-- Rahul
Re: Third-party drives not permitted on Gen 11 servers
On Tue, Feb 9, 2010 at 4:17 PM, howard_sho...@dell.com wrote:

> There are a number of benefits for using Dell qualified drives in particular ensuring a positive experience and protecting our data. While SAS and SATA are industry standards there are differences which occur in implementation. An analogy is that English is spoken in the UK, US and Australia. While the language is generally the same, there are subtle differences in word usage which can lead to confusion.

Sure. But I don't refuse to speak to a person from the UK or Australia, do I? Why not learn to work with all the dialects? Besides, what's next? Dell certified power cords?!? Or mouse pads?

-- Rahul
R410 stops during boot waiting for a keypress of F1 or F2
On some of my R410 servers the BIOS stops at a message:

Press F1 to proceed or F2 to enter System Setup

These are clustered machines, so such a message halts the whole automated boot process since they don't have a keyboard or console attached where one could press this key. In the BIOS there is a setting that chooses whether or not the system halts on errors. Is there a way to automate changing this via syscfg? I found no clue in the Dell Deployment Guide.

-- Rahul
Re: Dell Toolkit conflicts with Open Manage
On Thu, Feb 4, 2010 at 3:40 PM, Trond Hasle Amundsen t.h.amund...@usit.uio.no wrote:

>> either. The folder /opt/dell
>
> Try running 'rpm -qp --scripts dell-toolkit.rpm', and investigate the rpm pre-install scriptlet that complains.

Thanks Trond! --scripts is a nice rpm option; I didn't know that one. The culprit is the file /etc/omreg.cfg. I guess it is left behind by the uninstaller. I'm going to try deleting it; I doubt anything but OpenManage needs this file.

-- Rahul
Re: cannot find correct attribute for omconfig BIOS mod
On Mon, Feb 1, 2010 at 9:26 PM, mahavee...@dell.com wrote:

> Check if the services are running - srvadmin-services.sh status. The dsm_sa_datamgrd needs to be running.

Thanks Mahaveer. Everything seems to be running. I even rebooted the machine. What else do I need to check?

./opt/dell/srvadmin/sbin/srvadmin-services.sh status
dell_rbu (module) is running
ipmi driver is running
dsm_sa_datamgrd (pid 4267) is running
dsm_sa_eventmgrd (pid 4949) is running
dsm_om_shrsvcd (pid 3902) is running
dsm_om_connsvcd (pid 4975 4974) is running
Re: R410's shipped out with BIOS showing 4 cores instead of 8
On Mon, Feb 1, 2010 at 3:23 AM, Tim Small t...@buttersideup.com wrote:

> You can use dumpCmos from the smbios utilities to dump all of the BIOS settings from the Linux commandline. Do a diff of the two sets of dumps with the dual-core/quad-core setting having been manually changed... You can then use activateCmosToken to change the same setting on all your servers - which should be trivial to automate.

Thanks Tim! What's the exact command again? I see the following on my system:

smbios-get-ut-data
smbios-rbu-bios-update
smbios-sys-info-lite
smbios-wakeup-ctl
smbios-lcd-brightness
smbios-state-byte-ctl
smbios-token-ctl
smbios-wireless-ctl
smbios-passwd
smbios-sys-info
smbios-upflag-ctl

Am I missing a Dell repo that needs to be installed? I tried dmidecode but cannot figure out the exact setting line I need.

> I'd also push for financial compensation, I think!

I wish! :) Is there Dell management lurking on the list?

-- Rahul
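
The "dump, change one setting, dump again, diff" idea suggested above can be sketched with Python's standard difflib. The format of a real dumpCmos dump is not known here, so the sample lines and token names below are purely hypothetical; the function simply treats each dump as plain text.

```python
import difflib
from typing import List


def changed_lines(before: str, after: str) -> List[str]:
    """Return only the removed ('-') and added ('+') lines between two dumps."""
    diff = list(difflib.unified_diff(before.splitlines(), after.splitlines(),
                                     lineterm="", n=0))
    body = diff[2:]  # skip the '---' and '+++' file header lines
    return [line for line in body if line.startswith(("+", "-"))]


# Hypothetical dumps taken before and after changing one BIOS setting by hand.
BEFORE = """token_a = 1
cores_per_processor = dual
token_c = 0
"""
AFTER = """token_a = 1
cores_per_processor = all
token_c = 0
"""

if __name__ == "__main__":
    for line in changed_lines(BEFORE, AFTER):
        print(line)
```

The lines that differ would point at the token to change on the remaining servers; the actual command used to set it depends on which smbios utilities are installed.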
cannot find correct attribute for omconfig BIOS mod
I was told by Dell support that I could use

omconfig system biossetup attribute=cpucore setting=all

to change a BIOS setting that restricts my cores to half. Problem is, I haven't gotten this to work. I get an error message each time:

Error! Invalid command: biossetup

If I try

omconfig chassis biossetup attribute=cpucore setting=all

it still throws:

Error! This BIOS setup feature is not present on this system.

Unfortunately I cannot find the attribute cpucore anywhere in the documentation for OpenManage. Is this attribute really present and documented somewhere, or am I barking up the wrong tree?

-- Rahul
Re: cannot find correct attribute for omconfig BIOS mod
On Mon, Feb 1, 2010 at 11:39 AM, Vanush Misha mi...@cs.nuim.ie wrote:

> useful for you? another one to try would be omreport chassis biossetup and see what's showing up there. I've used omconfig chassis biossetup to enable CPU Virtualization Technology (cpuvt) and AC power recovery mode (acpwrrecovery) on PE2900 in the past.

omconfig chassis biossetup -?
Error! BIOS setup feature unavailable on this system.

I just don't get it!

-- Rahul
Re: R410's shipped out with BIOS showing 4 cores instead of 8
On Sun, Jan 31, 2010 at 1:43 PM, Big Wave Dave bigwaved...@gmail.com wrote:

>> -- Rahul
>
> We had 16 x R410s show up in mid December with the same issue.
>
> Dave

Thanks Dave! I'm feeling a little better that I'm not the only one.

-- Rahul
Re: R410's shipped out with BIOS showing 4 cores instead of 8
On Sun, Jan 31, 2010 at 1:43 PM, Big Wave Dave bigwaved...@gmail.com wrote:
> On Sun, Jan 31, 2010 at 8:05 AM, Rahul Nabar rpna...@gmail.com wrote:
>> Has anyone seen this problem before? I have dual socket Nehalems with twin quad core chips. When I booted the OS it showed only 4 cores. I went to the BIOS and found under Processor Settings the entry Cores-per-processor set to Dual. If I change it to All everything is ok again.

Thanks guys for helping out! I got the official answer from Dell just a few minutes ago. Apparently the Dell servers are doing this strange BIOS setting by design. I quote my Technical Account Manager below.

##
[snip] My apologies we were quite busy today, and I was just about to email you with the results. I have been able to check our tools, and with several technicians. The bios setting by default is set to dual. This is the same for the R710, R410, and R510 servers. As of right now it looks like the only way to change that is by either physically going to each server, and booting into the bios and changing the processor settings. [snip]
###

R410s, R710s and R510s are all doing this currently (says my TAM). I can't say why, but I was told it's a feature, not a bug. Very non-intuitive to me. I can't see why an 8-core server defaulting to 4 cores makes sense.

-- Rahul
Re: Network stability problems on R710 servers and BCM5709
On Sat, Oct 24, 2009 at 5:14 PM, Steve Thompson s...@vgersoft.com wrote:
> On Sun, 27 Sep 2009, Ryan Pugatch wrote:
>> Evangelos Souglakos wrote:
>>> We are experiencing network problems on i/o and network loaded R710 Poweredge servers. Network connectivity dies after some time. The systems needs to be powered down to bring the NICs back to life.
>>
>> Supposedly, loading the bnx2 module with disable_msi=1 resolves this problem. There is a version of the netxtreme driver available that is newer than the one provided by RHEL/CentOS. You can get this from Dell's site. In my experience, using this driver resolves the problem.
>
> I have just taken delivery of a T710 w/BCM5709's, and can confirm that the disable_msi=1 trick does *not* resolve this problem; I have had the system hang while idle, 5 minutes after a cold start. Installing the netxtreme driver *does* fix the issue. Not very clever, guys.

I have an R410 here. I wonder if it uses the same driver. I thought the eth NICs on this one were Broadcoms.

-- Rahul