Hello John,

Thanks for your answer. I have opened an issue with my hardware manufacturer and will do the same with my OS vendor. In the meantime, here are the strace listings, in case someone can shed light on them:

server1:
BIOS: American Megatrends Inc. 1.2
SYS: Supermicro X8SIE
CPU: Intel(R) Core(TM) i3 CPU 550 @ 3.20GHz [4 cores]
MEM: SLOT0 2048 MB SLOT1 2048 MB

open("/usr/lib/ruby/1.8/facter/osfamily.rb", O_RDONLY|O_LARGEFILE) = 3
close(3) = 0
open("/usr/lib/ruby/1.8/facter/osfamily.rb", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=800, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7297000
read(3, "# Fact: osfamily\n#\n# Purpose: Re"..., 4096) = 800
......CRASH

server2:
BIOS: American Megatrends Inc. 1.2
SYS: Supermicro X8SIE
CPU: Intel(R) Core(TM) i3 CPU 560 @ 3.33GHz [4 cores]
MEM: SLOT0 2048 MB SLOT1 2048 MB

stat64("/usr/sbin/dmidecode", {st_mode=S_IFREG|0755, st_size=48408, ...}) = 0
pipe([3, 4]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb74e5ba8) = 8709
close(4) = 0
fcntl64(3, F_GETFL) = 0 (flags O_RDONLY)
fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb725e000
_llseek(3, 0, 0xbf900930, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
read(3, "# dmidecode 2.9\nSMBIOS 2.6 prese"..., 1024) = 1024
read(3, "oot is supported\n\t\tBIOS boot spe"..., 1024) = 1024
read(3, "tate: Safe\n\tThermal State: Safe\n"..., 1024) = 1024
read(3, "Maximum Size: 128 KB\n\tSupported "..., 1024) = 1024
read(3, "e 5, 28 bytes\nMemory Controller "..., 1024) = 1024
read(3, " Installed\n\tError Status: OK\n\nHa"..., 1024) = 1024
read(3, " type 8, 9 bytes\nPort Connector "..., 1024) = 1024
read(3, "ternal Reference Designator: LPT"..., 1024) = 1024
read(3, "nal Reference Designator: Not Sp"..., 1024) = 1024
read(3, "nator: Not Specified\n\tExternal C"..., 1024) = 1024
read(3, "or Type: None\n\tPort Type: Other\n"..., 1024) = 1024
read(3, "ector Information\n\tInternal Refe"..., 1024) = 1024
read(3, "\tLength: Short\n\tID: 1\n\tCharacter"..., 1024) = 1024
read(3, "escriptor 5: POST error\n\tData Fo"..., 1024) = 1024
read(3, "ype 19, 15 bytes\nMemory Array Ma"..., 1024) = 1024
read(3, " Width: Unknown\n\tSize: No Module"..., 1024) = 1024
read(3, "ry Device Mapped Address\n\tStarti"..., 1024) = 1024
read(3, "on Handle: Not Provided\n\tTotal W"..., 1024) = 1024
--- SIGCHLD (Child exited) @ 0 (0) ---
read(3, "\n\nHandle 0x0039, DMI type 20, 19"..., 1024) = 1024
read(3, "on-recoverable Threshold: 6\n\nHan"..., 1024) = 1024
read(3, "UT OF SPEC>\n\tCooling Unit Group:"..., 1024) = 1024
read(3, "ed: Yes\n\tHot Replaceable: No\n\tCo"..., 1024) = 669
read(3, "", 1024) = 0
close(3) = 0
munmap(0xb725e000, 4096) = 0
rt_sigaction(SIGHUP, {SIG_IGN}, {0xb77388f0, [HUP], SA_RESTART}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_IGN}, {0xb77388f0, [QUIT], SA_RESTART}, 8) = 0
rt_sigaction(SIGINT, {SIG_IGN}, {0xb77388f0, [INT], SA_RESTART}, 8) = 0
waitpid(8709, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 8709
rt_sigaction(SIGHUP, {0xb77388f0, [HUP], SA_RESTART}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGQUIT, {0xb77388f0, [QUIT], SA_RESTART}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGINT, {0xb77388f0, [INT], SA_RESTART}, {SIG_IGN}, 8) = 0
............
sigprocmask(SIG_SETMASK, [], NULL) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_SETMASK, [], NULL) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
.............
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_SETMASK, [], NULL) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
.........
sigprocmask(SIG_BLOCK, NULL, []) = 0
sigprocmask(SIG_BLOCK, NULL, []) = 0
.......CRASH

2012/11/26 jcbollinger <john.bollin...@stjude.org>

> On Thursday, November 22, 2012 6:23:06 AM UTC-6, Mon wrote:
>>
>> Hello all,
>>>
>>> We have a problem with puppet and certain kind of machines from our farm
>>> (+300), those with Supermicro X8SIE motherboard. Sometime when running
>>> puppet the machine crashes, we lose access to it and logging through IPMI
>>> doesn't show anything in the console, the only thing we can do is a cold
>>> reboot. Then if we run puppet again, nothing happens. If we run puppet
>>> several days after it could be another crash or not, it is random.
>>> I debugged the problem and got the conclusion that the cause was when
>>> running "facter", running it in a mpssh session caused 7 or 8 crashes in
>>> different machines.
>>>
>>> Soft Version:
>>> S.O: ubuntu 8.04
>>> facter ** 1.5.4-1ubuntu1
>>> puppet 0.25.1-2
>>>
>>> After upgrading to facter -1.6.11-1 crashes continued. (last .deb in
>>> puppetlabs to hardy)
>>>
>> Sorry, I sent before ending.......
>>
>> I managed to get some traces executing with "strace" that I could paste
>> if you consider so.
>>
>> Someone has experienced something like that?
>>
>
> For what it's worth, Facter itself is unlikely to be crashing your system,
> but it runs a variety of commands that probe system details, and it's
> possible that one or a combination of those sometimes crashes them. It
> should be possible to crash the systems by running the same commands from
> the shell.
>
> If you have straces of facter sessions that resulted in crashes then they
> might be illuminating. The key thing I would be looking for is what
> commands Facter is trying to run when the crashes occurred. Unfortunately,
> the nature of the problem precludes being certain that the last thing in
> the captured trace is actually the thing Facter was trying to do when the
> crash happened.
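
To find out which commands Facter was running, the external commands can be pulled out of a trace automatically. What follows is only a sketch, under some assumptions: it assumes the trace is captured again with child processes followed (strace -f, optionally with -o to write it to a file), because the listings above show the clone() call but not the child's execve(), so they may have been taken without -f. The function name executed_commands and the regular expression are illustrative and not part of Facter or strace.

```python
import re

# Matches the start of an execve call as strace prints it, for example:
#   [pid  8709] execve("/usr/sbin/dmidecode", ["dmidecode"], [/* 20 vars */]) = 0
# The optional prefix is "[pid N]" (strace -f) or a bare "N" (strace -f -o FILE).
EXECVE_RE = re.compile(
    r'^(?:\[pid\s+(?P<pid1>\d+)\]\s+|(?P<pid2>\d+)\s+)?'
    r'execve\("(?P<path>[^"]+)",\s*\[(?P<argv>.*?)\]'
)

# Pulls the individual quoted strings out of the printed argv list.
ARG_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def executed_commands(lines):
    """Return (pid, path, argv) for every execve call, in trace order.

    pid is None when the line carries no pid prefix. Failed execve
    attempts are included too, since the return value is not inspected.
    Assumption: no argument contains a literal "]" character.
    """
    found = []
    for line in lines:
        match = EXECVE_RE.match(line)
        if match is None:
            continue
        pid = match.group("pid1") or match.group("pid2")
        argv = ARG_RE.findall(match.group("argv"))
        found.append((int(pid) if pid else None, match.group("path"), argv))
    return found
```

Passing the lines of a trace file to executed_commands() and looking at the last few entries gives the candidates to re-run by hand from the shell. As noted in the quoted reply, the final entry is not guaranteed to be the command that was running when the machine went down, because the tail of the trace may never have reached the disk.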
>
> If there is a software bug then it is probably in a separate tool or in
> the OS kernel. It might also be that you have a firmware (i.e. BIOS) bug
> on the affected systems, or even that the particular motherboard model that
> is affected has a design or fabrication flaw.
>
>
> John
>
> --
> You received this message because you are subscribed to the Google Groups
> "Puppet Users" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/puppet-users/-/uRikgvYaJN8J.
>
> To post to this group, send email to puppet-users@googlegroups.com.
> To unsubscribe from this group, send email to
> puppet-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/puppet-users?hl=en.