[OpenIndiana-discuss] system hang
Just installed oi148 on my old AthlonXP 2600 with 1GB of RAM. I had to put -B cpuid_features_edx_exclude='0x4000' on the initial boot, as I used to do when it was running Solaris 10.

Three times in the last day, the system has hard hung while pulling a git tree from gitorious.org. I managed to trap a prstat of the last minutes of uptime:

   PID USERNAME  SIZE   RSS STATE PRI NICE    TIME  CPU PROCESS/NLWP
   988 bent      102M   94M run     0    0 0:08:20  96% git-index-pack/1
   638 root       25M   13M sleep  59    0 0:00:29 0.9% fmd/27
     5 root        0K    0K sleep  99  -20 0:00:10 0.3% zpool-rpool/136
   766 gdm        95M   28M sleep  59    0 0:00:08 0.2% gdm-simple-gree/1
   698 root      189M   67M sleep  59    0 0:00:07 0.1% Xorg/3
   987 root     3656K 3148K cpu0   59    0 0:00:00 0.1% prstat/1
   531 root       11M 4052K sleep  59    0 0:00:00 0.0% nscd/28
   547 root     8396K 1968K sleep  59    0 0:00:00 0.0% automountd/4
    75 root       14M 7916K sleep  59    0 0:00:02 0.0% nwamd/11
    42 netcfg   4716K 3560K sleep  59    0 0:00:01 0.0% netcfgd/5
   783 bent       13M 4736K sleep  59    0 0:00:00 0.0% sshd/1
    45 root     3032K 1992K sleep  59    0 0:00:00 0.0% dlmgmtd/4
   921 bent       13M 4720K sleep  59    0 0:00:00 0.0% sshd/1
   764 gdm        87M   18M sleep  59    0 0:00:00 0.0% gnome-power-man/1

Ideas? This system was pretty much rock solid for years.

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss
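For anyone who wants to pick the busiest process out of a saved prstat capture like the one above, here is a minimal sketch. The function name top_cpu is made up for illustration, and it assumes the default prstat column order (PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP); it is not something the original poster ran.

```shell
# Hypothetical helper: report the busiest process in saved prstat output.
# Assumes the default prstat column order:
#   PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
# Reads the named files, or stdin when no file is given.
top_cpu() {
    awk '
        BEGIN { max = -1 }
        # Only consider data rows: numeric PID and all ten columns present.
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            cpu = $9
            sub(/%$/, "", cpu)              # "96%" -> "96"
            if (cpu + 0 > max) {
                max = cpu + 0
                line = $1 " " $10 " " $9 " rss=" $4
            }
        }
        END { if (line != "") print line }
    ' "$@"
}
```

One possible use, assuming the capture was written with something like "prstat -c 5 > /var/tmp/prstat.log" (the path is only an example): run "top_cpu /var/tmp/prstat.log" after the reboot. On the listing above it would point at git-index-pack, with a resident set of 94M.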
Re: [OpenIndiana-discuss] system hang
Hi Ben,

On Wed, Mar 30, 2011 at 09:39, Ben Taylor bentaylor.sol...@gmail.com wrote:
> just installed oi148 on my old AthlonXP 2600 with 1GB of ram. Had to put -B cpuid_features_edx_exclude='0x4000' on the initial boot, as I used to do when it was running Solaris 10. Three times in the last day, the system has hard hung while pulling a git tree from gitorious.org.

You don't define "hang": was it pingable (the usual "dead" vs. "very, very busy" test)? From your prstat output I'd guess yes, but it's only a guess :-)

Tell us a little more about what you've tried/done so far to find out what's going on.

HTH
Michael

> I managed to trap a prstat of the last minutes of uptime:
>
>    PID USERNAME  SIZE   RSS STATE PRI NICE    TIME  CPU PROCESS/NLWP
>    988 bent      102M   94M run     0    0 0:08:20  96% git-index-pack/1
>    638 root       25M   13M sleep  59    0 0:00:29 0.9% fmd/27
>      5 root        0K    0K sleep  99  -20 0:00:10 0.3% zpool-rpool/136
>    766 gdm        95M   28M sleep  59    0 0:00:08 0.2% gdm-simple-gree/1
>    698 root      189M   67M sleep  59    0 0:00:07 0.1% Xorg/3
>    987 root     3656K 3148K cpu0   59    0 0:00:00 0.1% prstat/1
>    531 root       11M 4052K sleep  59    0 0:00:00 0.0% nscd/28
>    547 root     8396K 1968K sleep  59    0 0:00:00 0.0% automountd/4
>     75 root       14M 7916K sleep  59    0 0:00:02 0.0% nwamd/11
>     42 netcfg   4716K 3560K sleep  59    0 0:00:01 0.0% netcfgd/5
>    783 bent       13M 4736K sleep  59    0 0:00:00 0.0% sshd/1
>     45 root     3032K 1992K sleep  59    0 0:00:00 0.0% dlmgmtd/4
>    921 bent       13M 4720K sleep  59    0 0:00:00 0.0% sshd/1
>    764 gdm        87M   18M sleep  59    0 0:00:00 0.0% gnome-power-man/1
>
> Ideas? this system was pretty much rock solid for years.

--
regards
Michael Schuster
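The "dead vs. very, very busy" test mentioned above can be sketched as a small function run from a second machine. Everything here is an assumption for illustration: the name hang_state, the PING_CMD override, and the 5-second timeout. The default call uses the Solaris form "ping host timeout"; on other systems PING_CMD would have to point at a wrapper that accepts the same two arguments.

```shell
# Hypothetical sketch of the "dead vs. very, very busy" test.
# PING_CMD is an illustrative override; the default relies on the Solaris
# syntax "ping host timeout", which exits 0 when the host answers.
PING_CMD="${PING_CMD:-ping}"

hang_state() {
    host="$1"
    if $PING_CMD "$host" 5 >/dev/null 2>&1; then
        # The kernel still answers ICMP, so the box is probably starved, not dead.
        echo "pingable: $host is probably very busy rather than dead"
    else
        echo "not pingable: $host looks hard hung (or the network is down)"
    fi
}
```

A box that still answers ping but accepts no logins is usually worth waiting on or debugging live; one that answers nothing at all is a candidate for the kernel debugger or NMI approach.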
Re: [OpenIndiana-discuss] system hang
> I'm doing a fairly large git fetch (qt) and it seems to hang while the merge is happening. I'm trying now to do the git fetch nice'd down 15 since it's clear that the git-index-pack is a pretty intensive process.

Just a wild guess, but have you tried running a memory test on this one?

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [OpenIndiana-discuss] system hang
Hi Ben,

The first thing I usually try when a hang happens is loading the kernel debugger (before the hang happens, of course).

First, make sure you shut off the graphical console (svcadm disable gdm). This is a critical step; otherwise the mdb window pops open in hyperspace and you will not be able to access it, leaving you with the unpleasant option of pulling the plug to restart the machine.

Next, you have 2 choices: either edit the boot stanza, or just run mdb -K from one of your login sessions.

The boot stanza can be edited (temporarily; the changes are not saved between reboots) by pressing e in the boot menu while the cursor is on the kernel that you want to boot. Here, you would replace ,console=graphics with -k -d (and probably delete the splash image line).

If the system is able to come up, and you are just debugging some predictable / reproducible hang, the mdb -K method is much easier. Note that it is an uppercase K, and do verify that your console is in text mode. You need to be near the console (ILOM is OK).

When you type mdb -K, the console pops into the debugger. At this point the machine is at a breakpoint, so you need to type :c (i.e. colon c) on your console to continue and let the machine run.

Given that you managed to load the debugger, you should be able to break into mdb at will by pressing a magic key combo on the console. On SPARC, I recall it is ctrl ]. On Intel, try F1-A, or ctrl-alt-D (as in the letter D), or shift-break. Try all of the above to see which one triggers the debugger for you. shift-break usually works for me.

If you are desperate and cannot find a key combo that works, another possibility is to set up the system for NMI-triggered mdb. Most motherboards have an NMI pin (see the motherboard docs). If you short this to ground, the mobo generates an NMI (a non-maskable interrupt). It is common to have a GND (ground) pin right next to it, so effectively you just momentarily connect the 2 pins.
You will need the following line in /etc/system to hook up the NMI to trigger the debugger breakpoint:

    set pcplusmp:apic_kmdb_on_nmi=1

It would also be useful to verify that the machine is configured to save crash dumps (see man dumpadm).

Once your system is set up, get it to hang, then break into the debugger and poke around. You may want to intentionally crash the machine at this point, just to generate a crash dump. It can be done a number of ways; an easy one is writing 0 into the (r)ip register and typing :c, e.g.:

    rip/w 0
    :c

It is just easier to work on a crash dump than on a live system. E.g., generate a ::threadlist -v piped to a file, then pull that up in your favorite editor to see what all the threads are doing. The ::status command will, of course, indicate a null pointer dereference crash; do not be thrown by that, since you know you intentionally caused it.

best wishes
Steve
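As a rough pre-flight check before trying to reproduce the hang, the two configuration points above (the /etc/system NMI line and the crash dump setup) could be verified with something like the sketch below. The function names has_nmi_hook and preflight are invented for illustration; the only system facts relied on are that comment lines in /etc/system start with an asterisk and that dumpadm with no arguments prints the current dump configuration.

```shell
# Hypothetical pre-flight check; function names are illustrative only.

# has_nmi_hook FILE: succeeds if FILE (normally /etc/system) contains an
# uncommented "set pcplusmp:apic_kmdb_on_nmi=1" line. Comment lines in
# /etc/system begin with "*", so they do not match the anchored pattern.
has_nmi_hook() {
    grep -Eq '^[[:space:]]*set[[:space:]]+pcplusmp:apic_kmdb_on_nmi[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

preflight() {
    sysfile="${1:-/etc/system}"
    if has_nmi_hook "$sysfile"; then
        echo "NMI hook: present in $sysfile"
    else
        echo "NMI hook: missing from $sysfile"
    fi
    # With no arguments, dumpadm prints the current crash dump configuration.
    if command -v dumpadm >/dev/null 2>&1; then
        dumpadm
    else
        echo "dumpadm: not found on this system"
    fi
}
```

Running preflight as root before the test only reports the current state; it changes nothing. Note that an edit to /etc/system takes effect only after a reboot.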