All,

After band-aiding my server back together by installing a 4-port PCI SATA controller to work around the failed onboard disk controller, the system is up and running fine. In the BIOS, the onboard SATA controller is currently 'Enabled', but each of its SATA ports is 'Disabled'. When I check the status of a unit with systemctl, I get an odd error at the end of each command, e.g.:

[15:47 phoinix:~/.ssh] # sc status smbd
● smbd.service - Samba SMB/CIFS server
   Loaded: loaded (/usr/lib/systemd/system/smbd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2015-08-22 22:57:26 CDT; 1 day 16h ago
 Main PID: 542 (smbd)
   CGroup: /system.slice/smbd.service
           ├─542 /usr/bin/smbd -D
           └─559 /usr/bin/smbd -D
Bus error (core dumped)

Looking at the journal and the core dumps, the only other unit that is implicated is:

Cannot add dependency job for unit cups.socket, ignoring: Unit cups.socket failed to load: No such file or directory.

Nothing else is generating a core dump, but each time I check the status of a service, the output ends with:

Bus error (core dumped)

  The only other thing I see in the journal that may or may not be related is:

Aug 24 14:21:58 phoinix systemd[13187]: pam_unix(systemd-user:session): session opened for user root by (uid=0)
Aug 24 14:21:58 phoinix systemd[13187]: Unit type .busname is not supported on this system.

I don't know if that's relevant, but it was the only other thing even tangentially connected to 'bus'.

  Looking at the core dump list with 'coredumpctl list' shows a handful of files:

[17:46 phoinix:~/.ssh] # coredumpctl list
TIME                            PID   UID   GID SIG PRESENT EXE
Mon 2015-04-06 19:00:15 CDT     342     0     0  11   /usr/bin/cupsd
Tue 2015-05-26 13:15:01 CDT   23265     0     0  11   /usr/bin/crond
Tue 2015-05-26 14:01:01 CDT   23563     0     0  11   /usr/bin/crond
Tue 2015-05-26 14:05:01 CDT   23593     0     0  11   /usr/bin/crond
Sun 2015-08-23 05:51:43 CDT    3151     0     0   7 * /usr/bin/systemctl
Sun 2015-08-23 05:52:16 CDT    3179     0     0   7 * /usr/bin/systemctl
Sun 2015-08-23 07:11:33 CDT    3639     0     0   7 * /usr/bin/systemctl
Sun 2015-08-23 07:12:31 CDT    3652     0     0   7 * /usr/bin/systemctl
Mon 2015-08-24 15:30:11 CDT   13565     0     0   7 * /usr/bin/systemctl
Mon 2015-08-24 15:32:05 CDT   13580     0     0   7 * /usr/bin/systemctl
Mon 2015-08-24 15:53:37 CDT   13696     0     0   7 * /usr/bin/systemctl
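
To make the pattern in that list easier to see, here is a rough sketch that tallies the dumps by signal number and executable. The function name tally_dumps is just something I made up, and it assumes the column layout shown above (four timestamp fields, then PID, UID, GID, SIG, an optional '*', and the EXE path last); other coredumpctl versions may print different columns.

```shell
#!/bin/sh
# Tally core dumps by signal number and executable.
# tally_dumps is a hypothetical helper name, not part of systemd.
# Assumes the listing format pasted above: four timestamp fields, then
# PID, UID, GID, SIG, an optional '*' (PRESENT), and the EXE path last.
tally_dumps() {
    # Skip the header row; SIG is field 8 and EXE is the last field.
    awk '$1 != "TIME" && NF >= 9 { count[$8 " " $NF]++ }
         END { for (k in count) print count[k], k }' |
        sort -k2,2n -k3,3    # order by signal number, then by path
}

# Usage on a live system (reads the listing from stdin):
#   coredumpctl list | tally_dumps
```

Each output line is "count signal executable", so the signal 7 (SIGBUS) dumps from systemctl land on one line, separate from the older signal 11 (SIGSEGV) dumps from cupsd and crond.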

  Looking at the dumps in gdb shows:

[17:47 phoinix:~/.ssh] # coredumpctl gdb 13696
           PID: 13696 (systemctl)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 7 (BUS)
     Timestamp: Mon 2015-08-24 15:53:37 CDT (1h 54min ago)
  Command Line: systemctl status smbd
    Executable: /usr/bin/systemctl
 Control Group: /user.slice/user-1000.slice/session-c2.scope
          Unit: session-c2.scope
         Slice: user-1000.slice
       Session: c2
     Owner UID: 1000 (david)
       Boot ID: aeecdf7479ea4b43aae7f1b9b83b2502
    Machine ID: 8d32bcc3152b4a1f87c4d71f948f93fb
      Hostname: phoinix
      Coredump: /var/lib/systemd/coredump/core.systemctl.0.aeecdf7479ea4b43aae7f1b9b83b2502.13696.1440449617000000.lz4
       Message: Process 13696 (systemctl) of user 0 dumped core.
<snip>
(gdb) bt
#0  0x00007f353981becf in ?? ()
#1  0x00007f3539801c09 in ?? ()
#2  0x00007f3539801d38 in ?? ()
#3  0x00007f3539801b64 in ?? ()
#4  0x00007f3539801d38 in ?? ()
#5  0x00007f3539801b64 in ?? ()
#6  0x00007f353980310e in ?? ()
#7  0x00007f35397f4080 in ?? ()
#8  0x00007f353983340b in ?? ()
#9  0x00007f35397ed1d1 in ?? ()
#10 0x00007f35397e2414 in ?? ()
#11 0x00007f35386f5790 in __libc_start_main () from /usr/lib/libc.so.6
#12 0x00007f35397e3049 in ?? ()
(gdb) frame 0
#0  0x00007f353981becf in ?? ()
(gdb) info frame
Stack level 0, frame at 0x7ffed3907080:
 rip = 0x7f353981becf; saved rip = 0x7f3539801c09
 called by frame at 0x7ffed3907160
 Arglist at 0x7ffed3906fd8, args:
 Locals at 0x7ffed3906fd8, Previous frame's sp is 0x7ffed3907080
 Saved registers:
  rbx at 0x7ffed3907048, rbp at 0x7ffed3907050, r12 at 0x7ffed3907058, r13 at 0x7ffed3907060, r14 at 0x7ffed3907068,
  r15 at 0x7ffed3907070, rip at 0x7ffed3907078
(gdb) quit

I haven't noticed this happening before, though the first core dump, from cups, goes back to April. My questions are "What should I check?" and "Does any of this look related to the BIOS settings and the new disk controller?" (That looks more doubtful after going over all of this information.)

  Does anybody have experience with this type of thing?

--
David C. Rankin, J.D.,P.E.
