Re: [osol-discuss] #top - Segmentation Fault
Thanks Jurgen. This will be fixed in build 127.

-Norm

Jürgen Keil wrote:
> I think I can reproduce something when I have the following process
> running, and run
>
>   env LD_PRELOAD=libumem.so.1 UMEM_OPTIONS=backend=mmap UMEM_DEBUG=firewall=1 top
>
> I've filed http://defect.opensolaris.org/bz/show_bug.cgi?id=12124
>
> I tried to fix this "top" bug, and have attached new top binaries
> to the above bug. Would that new top binary fix the bug for you?

___
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org
Re: [osol-discuss] #top - Segmentation Fault
> I think I can reproduce something when I have the
> following process running, and run
>
>   env LD_PRELOAD=libumem.so.1 UMEM_OPTIONS=backend=mmap UMEM_DEBUG=firewall=1 top
>
> I've filed
> http://defect.opensolaris.org/bz/show_bug.cgi?id=12124

I tried to fix this "top" bug, and have attached new top binaries to the above bug. Would that new top binary fix the bug for you?

--
This message posted from opensolaris.org
Re: [osol-discuss] #top - Segmentation Fault
> It happened again today. Here is pflags and pstack
> ...
> # pstack top-14186
> core 'top-14186' of 14186: top
>  0040dac0 hash_lookup_pidthr () + 40
>  00414eff getptable () + 35f
>  00411efd get_process_info () + 6d
>  0040f2cd main () + 33d
>  00408a0c ()

Someone else reported a similar top segmentation fault some months ago:

http://www.opensolaris.org/jive/message.jspa?messageID=372954

Are you running a process on this machine that creates (and probably terminates) *lots* of threads? That is, is there a multithreaded process running on this machine with lwp ids > 214748 ?

I think I can reproduce something when I have the following process running, and run

    env LD_PRELOAD=libumem.so.1 UMEM_OPTIONS=backend=mmap UMEM_DEBUG=firewall=1 top

I've filed http://defect.opensolaris.org/bz/show_bug.cgi?id=12124

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    /* Thread that returns immediately; the loop below creates and joins these. */
    void *
    func(void *arg)
    {
        return (NULL);
    }

    /* Thread that blocks forever, so the process keeps one extra live thread. */
    void *
    func2(void *arg)
    {
        pause();
        return (NULL);
    }

    int
    main(void)
    {
        int i;
        int err;
        pthread_t tid;
        void *status;

        for (i = 0; i < 25; i++) {
            err = pthread_create(&tid, NULL, func, NULL);
            if (err) {
                fprintf(stderr, "pthread create: %s\n", strerror(err));
                exit(1);
            }
            pthread_join(tid, &status);
            if (i % 1000 == 0)
                printf("%d\n", i);
        }

        err = pthread_create(&tid, NULL, func2, NULL);
        if (err) {
            fprintf(stderr, "pthread create: %s\n", strerror(err));
            exit(1);
        }
        pause();
        return (0);
    }
Re: [osol-discuss] #top - Segmentation Fault
> It's not bad habit on Solaris. top is somewhat bad
> habit when you have a lot of users on some system
> using it in one time ;-)
>
> http://www.brendangregg.com/DTrace/prstatvstop.html

ISTR reading that "top" does what it does intentionally - so that it can handle more processes than the per-process file descriptor limit. (What's not said is that when the number of processes gets high enough, "top" scales badly enough that being _potentially_ able to handle an arbitrary number of processes may not be worth as much as it sounds like.)

Something else I saw in the link above, about "sync; sync; sync; reboot" being an "admin myth": back in v7 Unix, there wasn't really a reboot command, you just used some means outside the OS to break back to firmware (or more likely, a smart console, so that you could use your DECwriter rather than toggle switches). In that situation, where there is no "reboot" to flush and quiesce everything, one needed to:

* stop all processes other than the single-user shell (and init)
* sync
* wait awhile, since sync only schedules flushing of buffers, but doesn't actually wait for it to complete
* halt or reboot by whatever external means

Three "sync"s were about as good as one sync and a wait of five seconds or so. So while it wasn't strictly necessary (one sync followed by a few seconds waiting and/or listening for the washing-machine-sized drives to stop thumping around, if the machine room wasn't too noisy to tell, would also have done the job), it was certainly better than sync immediately followed by whatever key sequence would escape from the OS.

Nowadays however, where "reboot" is by default clean, I very much doubt it serves any useful purpose. So I wouldn't so much call it a myth as a poorly understood practice that survives in tradition long past its useful time. Although some might argue that many other sorts of myths have a similarly disconnected basis in fact, making my distinction a bit pointless...
Re: [osol-discuss] #top - Segmentation Fault
It happened again today. Here is pflags and pstack

# pflags -r top-14186
core 'top-14186' of 14186: top
    data model = _LP64 flags = MSACCT|MSFORK
 /1: flags = 0
    sigmask = 0xbefc,0x cursig = SIGSEGV
    %r15 = 0x00450F20
    %r14 = 0x004495A0
    %r13 = 0x00450DF0
    %r12 = 0x004583D0
    %r11 = 0x000300C6
    %r10 = 0x
    %r9 = 0x000A186D
    %r8 = 0x0021
    %rdi = 0x0042AA00
    %rsi = 0x000A186D17B8
    %rbp = 0xFD7FFFDFF430
    %rbx = 0x0002
    %rdx = 0xFDBF
    %rcx = 0x080F
    %rax = 0x
    %trapno = 0x000E
    %err = 0x0004
    %rip = 0x0040DAC0
    %cs = 0x0053
    %rfl = 0x00010206
    %rsp = 0xFD7FFFDFF420
    %ss = 0x004B
    %fs = 0x
    %gs = 0x
    %es = 0x
    %ds = 0x
    %fsbase = 0xFD7FFF152A00
    %gsbase = 0x

# pstack top-14186
core 'top-14186' of 14186: top
 0040dac0 hash_lookup_pidthr () + 40
 00414eff getptable () + 35f
 00411efd get_process_info () + 6d
 0040f2cd main () + 33d
 00408a0c ()
Re: [osol-discuss] #top - Segmentation Fault
Thanks. The funny thing is that I ran top many times today but never got a seg fault again.
Re: [osol-discuss] #top - Segmentation Fault
It's not a bad habit on Solaris. top is somewhat of a bad habit when a lot of users on the same system are running it at the same time ;-)

http://www.brendangregg.com/DTrace/prstatvstop.html
Re: [osol-discuss] #top - Segmentation Fault
Some of these tools can answer your question : http://www.opensolaris.org/os/community/observability/
Re: [osol-discuss] #top - Segmentation Fault
> Thanks. It didn't coredump.

Hmm, interesting... I just tried to run top in a window and ran "pkill -SEGV top" in another window; the shell reports "Segmentation fault", but the kernel doesn't write a core dump. The reason is that top changes its current directory to /proc, and the kernel can't write core dumps to /proc.

> What other info do you need to help diagnose this issue?

Set up the system to write global core dumps, and reproduce the top segmentation fault. Something like this:

# mkdir /cores
# coreadm -g /cores/%f-%p
# coreadm -e global
# coreadm -e log

Now run "top"; when it crashes with a segmentation fault you should find a core dump file whose name starts with "top-" in the /cores directory.

To get an idea where and why it might be crashing, pflags and pstack output for that core dump is needed.
Re: [osol-discuss] #top - Segmentation Fault
Thanks. It didn't coredump.

I barely use top; I only tried it when I saw this post. My system was originally snv_117, then luupgraded on every release up to snv_124, but as I said, I never tried top before, so I don't know if it only happens with this version.

What other info do you need to help diagnose this issue?
Re: [osol-discuss] #top - Segmentation Fault
I use it all of the time on lots of different systems and it has been working just fine. Can you send me or point me to a core file? I'd like to help you, but you have to meet me somewhere closer to halfway here.

-Norm

Chris Du wrote:
> Got it on snv_124 too.
Re: [osol-discuss] #top - Segmentation Fault
Got it on snv_124 too.
Re: [osol-discuss] #top - Segmentation Fault
> I get the same on opensolaris snv_111b

Please file a bug at http://defect.opensolaris.org and make sure to attach pstack and pflags output for top's core dump.
Re: [osol-discuss] #top - Segmentation Fault
I get the same on opensolaris snv_111b

It has worked fine for me. The server has been running for a couple of months and I haven't been on it for a while, but it's a clean install (development server), and now top gets a segmentation fault.

I haven't tried rebooting it yet to see if it starts working again.
Re: [osol-discuss] #top - Segmentation Fault
> The only real problem with top is that it doesn't
> show data for 64 bit binaries properly.

Should be no problem; on a 64-bit system top isaexecs /bin/amd64/top, and that shows data for 64-bit processes just fine.
Re: [osol-discuss] #top - Segmentation Fault
Greets,

Same here on a pair of Thors (X4540) running OpenSolaris snv_117:

# uname -a
SunOS glenlock 5.11 snv_117 i86pc i386 i86pc

It goes away after a reboot, and then comes back some time later. "pkg verify SUNWtop" didn't find anything wrong.
Re: [osol-discuss] #top - Segmentation Fault
> >> HOWEVER ! I really think you need to NOT use top.
> >>
> >> Use prstat to get what you need. Just look at the man page and never use
> >> top on Solaris if you can avoid it.
> >>
> >>
> > Ok, I didn't know that the top is a bad habit on solaris :)
>
> It isn't a bad habit. That's merely Dennis's opinion. FWIW, top works
> just fine for me (and has since it was integrated). I use it all the
> time (it's what I'm used to, coming from other OS's that don't have
> prstat).

As usual, the best is to be familiar with both top and prstat.

top is good because it gives you a "quick and dirty overview" over cpu and memory load. And it runs on many platforms with an identical UI. The only real problem with top is that it doesn't show data for 64 bit binaries properly.

prstat is good because you can see LWPs and zone-specific statistics. But it's difficult to see total and assigned memory, and the user interface could use some improvements. :-)

> All that said, I can't really add anything useful on why it's core'ing
> for you. You could truss it's execution and see if anything strange
> happens before it cores. Alternatively, inspect the core file for clues
> and troll the bug database for any likely candidates. Is this on
> OpenSolaris? If so, pkg verify SUNWtop would be good to run to make
> sure the package is installed properly.

I agree with Glenn's helpful suggestions. But first of all, give us some more info about your system: OpenSolaris or SXCE? Where did you get the "top" binary? Does it actually write a core file?

Good luck -- Volker

--
Volker A. Brandt  Consulting and Support for Sun Solaris
Brandt & Brandt Computer GmbH  WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim  Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 45
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt
Re: [osol-discuss] #top - Segmentation Fault
* Roman Naumenko (ro...@frontline.ca) wrote:
> Dennis Clarke wrote, On 28.09.2009 16:57:
>>> The top command gives "Segmentation fault"
>>>
>>> # top
>>> Segmentation Fault
>>>
>>> #uname -a
>>> SunOS zsan02 5.11 snv_118 i86pc i386 i86pc Solaris
>>>
>>> Any idea why it might happen?
>>>
>>
>> You may or may not get a core dump from that.
>>
>> HOWEVER ! I really think you need to NOT use top.
>>
>> Use prstat to get what you need. Just look at the man page and never use
>> top on Solaris if you can avoid it.
>>
>>
> Ok, I didn't know that the top is a bad habit on solaris :)

It isn't a bad habit. That's merely Dennis's opinion. FWIW, top works just fine for me (and has since it was integrated). I use it all the time (it's what I'm used to, coming from other OS's that don't have prstat).

All that said, I can't really add anything useful on why it's core'ing for you. You could truss its execution and see if anything strange happens before it cores. Alternatively, inspect the core file for clues and troll the bug database for any likely candidates. Is this on OpenSolaris? If so, pkg verify SUNWtop would be good to run to make sure the package is installed properly.

Cheers,

--
Glenn
Re: [osol-discuss] #top - Segmentation Fault
Dennis Clarke wrote, On 28.09.2009 16:57:
>> The top command gives "Segmentation fault"
>>
>> # top
>> Segmentation Fault
>>
>> #uname -a
>> SunOS zsan02 5.11 snv_118 i86pc i386 i86pc Solaris
>>
>> Any idea why it might happen?
>
> You may or may not get a core dump from that.
>
> HOWEVER ! I really think you need to NOT use top.
>
> Use prstat to get what you need. Just look at the man page and never use
> top on Solaris if you can avoid it.

Ok, I didn't know that the top is a bad habit on solaris :)

--
Roman
Re: [osol-discuss] #top - Segmentation Fault
Personally, I can't say that I've had it dump core for me. Perhaps pstack(1) or mdb(1) might give some idea where/why it's dumping core.

-Norm

Roman Naumenko wrote:
> The top command gives "Segmentation fault"
>
> # top
> Segmentation Fault
>
> #uname -a
> SunOS zsan02 5.11 snv_118 i86pc i386 i86pc Solaris
>
> Any idea why it might happen?
>
> --
> Roman
Re: [osol-discuss] #top - Segmentation Fault
> The top command gives "Segmentation fault"
>
> # top
> Segmentation Fault
>
> #uname -a
> SunOS zsan02 5.11 snv_118 i86pc i386 i86pc Solaris
>
> Any idea why it might happen?

You may or may not get a core dump from that.

HOWEVER ! I really think you need to NOT use top.

Use prstat to get what you need. Just look at the man page and never use top on Solaris if you can avoid it.

--
Dennis Clarke
dcla...@opensolaris.ca <- Email related to the open source Solaris
dcla...@blastwave.org <- Email related to open source for Solaris
[osol-discuss] #top - Segmentation Fault
The top command gives "Segmentation fault"

# top
Segmentation Fault

#uname -a
SunOS zsan02 5.11 snv_118 i86pc i386 i86pc Solaris

Any idea why it might happen?

--
Roman