On 10/05/2010 19:14, Jim Kusznir wrote:
Hi all:

I've got a couple Ubuntu 9.10 machines that are suffering from a
recurring failure of winbind that essentially crashes the machine.  When
the system is in the "crashed state", one can ping the system, but all
forms of login fail.
That is expected: once winbind stops working, every service that authenticates through PAM is out of service.
It will not even respond to tftpd requests; ssh
connections "time out", but the initial port is opened (just no
connect).  Rebooting does NOT recover from this; in order to recover,
I need to:

1) reboot into single user mode
Do you have enough free space on your partitions at this point?
2) edit /etc/nsswitch.conf and remove winbind
3) remove winbind from all pam.d/*
4) boot normally
5) stop samba and winbind
6) delete /var/lib/samba/* and /var/cache/samba/*
7) start samba
8) rejoin domain
9) start winbind
10) undo #2 and 3 above
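
Steps 5 to 9 above could be scripted roughly as follows. This is only a sketch: the init script names (samba, winbind), the use of "net ads join -U", and the Administrator account are assumptions about this setup and are not confirmed anywhere in the thread. It defaults to a dry run that only prints what it would do.

```shell
# Sketch of recovery steps 5-9. DRY_RUN=1 (the default) only prints each command.
# Assumed, not confirmed by the thread: init scripts named "samba" and "winbind",
# and a domain account (JOIN_USER) that is allowed to join machines.
DRY_RUN="${DRY_RUN:-1}"
JOIN_USER="${JOIN_USER:-Administrator}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reset_winbind() {
    run /etc/init.d/samba stop                               # step 5
    run /etc/init.d/winbind stop
    run sh -c 'rm -rf /var/lib/samba/* /var/cache/samba/*'   # step 6
    run /etc/init.d/samba start                              # step 7
    run net ads join -U "$JOIN_USER"                         # step 8
    run /etc/init.d/winbind start                            # step 9
}
```

Calling reset_winbind as-is prints the six commands for review; running it as root with DRY_RUN=0 executes them, including the destructive step 6.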

After this, winbind will work for a week or two.  If I stop after step
4 above the system is usable, but without domain users able to log in.
  My diagnostics show that net ads users (and all other "samba"
commands) work just fine and find all users.  All winbind-specific
commands (wbinfo -u, etc) fail.  Oh, if I leave the system up in the
crashed state, it begins to fill up logs to the tune of 32gigs in a
few days.  The above procedure repeats approximately once every 5 days
on our main production system.  I have a second workstation that sees
very little use, and it has suffered the same crash, but far less
frequently.  I have also tried inserting step 6.5 where I delete the
machine account on the DC, but that doesn't change anything.  Also,
our Ubuntu 9.04 system running the same configuration files has no
issues.  We have not tried 10.04.
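
The split described above (net ads commands working while wbinfo fails) could be probed periodically, for example from cron, before logins start to hang. A minimal sketch, assuming the stock Samba 3.x options "wbinfo -p" (ping winbindd) and "wbinfo -t" (verify the machine trust secret); the WBINFO variable is a hypothetical hook for substituting another binary.

```shell
# Report whether winbindd answers and whether the domain trust is intact.
# WBINFO is a hypothetical override (useful for testing); it defaults to wbinfo.
WBINFO="${WBINFO:-wbinfo}"

check_winbind() {
    # -p pings the winbindd daemon itself
    if ! "$WBINFO" -p >/dev/null 2>&1; then
        echo "winbindd not responding"
        return 1
    fi
    # -t checks the machine account secret against the domain
    if ! "$WBINFO" -t >/dev/null 2>&1; then
        echo "trust secret check failed"
        return 2
    fi
    echo "winbind ok"
}
```

In the crashed state wbinfo itself may block rather than fail quickly, so a real check would probably need some form of timeout around it.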

This problem has been plaguing our operations for over two months now,
so any assistance would be greatly appreciated.

Some log file snippets:

(from some point "in the middle" of the crash):
May  7 15:32:45 casas-lin winbindd[20677]:   sys_select: pipe failed (Too many open files)
"Too many open files" means your system has reached its limit on open files.

Try the lsof command to see which process has too many files open.

lsof|wc -l

to see how many files are open

lsof|less

to see all open files

cat /proc/sys/fs/file-max

to see the system limit
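
Since lsof output gets long on a busy machine, the same information can be summarised per process straight from /proc. This is a rough sketch that assumes a Linux /proc layout; the function name and its two optional arguments are made up for illustration. Note that the "24:" in the log is errno EMFILE, which is the per-process limit (ulimit -n, or /proc/PID/limits) rather than the system-wide file-max, so the process at the top of this list is worth comparing against its own limit.

```shell
# List processes by number of open file descriptors, highest first.
# $1: proc root (default /proc; a parameter only so a test fixture can be used)
# $2: how many processes to show (default 10)
fd_counts() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        [ -d "$dir/fd" ] || continue
        pid="${dir##*/}"
        # Unreadable fd directories (other users' processes) simply count as 0
        count=$(( $(ls "$dir/fd" 2>/dev/null | wc -l) ))
        # The process name is the "Name:" line of /proc/PID/status
        name=$(sed -n 's/^Name:[[:space:]]*//p' "$dir/status" 2>/dev/null)
        printf '%s %s %s\n' "$count" "$pid" "${name:-?}"
    done | sort -rn | head -n "${2:-10}"
}
```

Run as root (fd_counts, or fd_counts /proc 5) so that every process's fd directory is readable.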

May  7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45,  0] lib/events.c:287(s3_event_debug)
May  7 15:32:45 casas-lin winbindd[20677]:   s3_event: sys_select() failed: 24:Too many open files
May  7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45,  0] lib/select.c:64(sys_select)
May  7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45,  0] lib/debug.c:663(reopen_logs)
May  7 15:32:45 casas-lin winbindd[20677]:   Unable to open new log file /var/log/samba/log.wb-CASAS: Too many open files
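
To see which winbindd process is producing these errors, and how many, the syslog can be tallied by PID. A small sketch, assuming syslog lines in the format shown above (unwrapped, as they appear in the real log file); the function name is hypothetical.

```shell
# Count "Too many open files" messages per winbindd PID in syslog text on stdin.
# Output: "<count> <pid>" lines, busiest PID first.
emfile_by_pid() {
    grep 'Too many open files' |
        sed -n 's/.*winbindd\[\([0-9][0-9]*\)\].*/\1/p' |   # keep only the PID
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'                              # normalise spacing
}
```

For example: emfile_by_pid < /var/log/syslog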
------
From startup (step 4 above):
May 10 08:36:50 casas-lin kernel: May 10 08:38:42 casas-lin winbindd[1571]: [2010/05/10 08:38:42,  0] libsmb/smb_signing.c:255(signing_good)
May 10 08:38:42 casas-lin winbindd[1571]:   signing_good: BAD SIG: seq 41
May 10 08:42:25 casas-lin winbindd[1562]: [2010/05/10 08:42:25,  0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:42:25 casas-lin winbindd[1562]:   async_request_timeout_handler: child pid 1571 is not responding. Closing connection to it.
May 10 08:42:25 casas-lin winbindd[1571]: [2010/05/10 08:42:25,  0] winbindd/winbindd.c:190(winbindd_sig_term_handler)
May 10 08:42:25 casas-lin winbindd[1571]:   Got sig[15] terminate (is_parent=0)
May 10 08:42:25 casas-lin winbindd[1825]: [2010/05/10 08:42:25,  0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
May 10 08:42:25 casas-lin winbindd[1825]:   cli_pipe_verify_schannel: auth_len 56.
May 10 08:43:37 casas-lin winbindd[1825]: [2010/05/10 08:43:37,  0] libsmb/smb_signing.c:255(signing_good)
May 10 08:43:37 casas-lin winbindd[1825]:   signing_good: BAD SIG: seq 23
May 10 08:47:25 casas-lin winbindd[1562]: [2010/05/10 08:47:25,  0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:47:25 casas-lin winbindd[1562]:   async_request_timeout_handler: child pid 1825 is not responding. Closing connection to it.
May 10 08:47:25 casas-lin winbindd[1825]: [2010/05/10 08:47:25,  0] winbindd/winbindd.c:190(winbindd_sig_term_handler)
May 10 08:47:25 casas-lin winbindd[1825]:   Got sig[15] terminate (is_parent=0)
May 10 08:47:25 casas-lin winbindd[1832]: [2010/05/10 08:47:25,  0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
May 10 08:47:25 casas-lin winbindd[1832]:   cli_pipe_verify_schannel: auth_len 56.
May 10 08:48:38 casas-lin winbindd[1832]: [2010/05/10 08:48:38,  0] libsmb/smb_signing.c:255(signing_good)
May 10 08:48:38 casas-lin winbindd[1832]:   signing_good: BAD SIG: seq 23
May 10 08:52:25 casas-lin winbindd[1562]: [2010/05/10 08:52:25,  0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:52:25 casas-lin winbindd[1562]:   async_request_timeout_handler: child pid 1832 is not responding. Closing connection to it.
May 10 08:52:25 casas-lin winbindd[1832]: [2010/05/10 08:52:25,  0] winbindd/winbindd.c:190(winbindd_sig_term_handler)

---------
log.wb-CASAS (my domain is CASAS.WSU.EDU)
[2010/05/10 09:12:26,  1] libsmb/clikrb5.c:697(ads_krb5_mk_req)
   ads_krb5_mk_req: krb5_get_credentials failed for a...@casas (KDC reply did not match expectations)
[2010/05/10 09:12:26,  1] libsmb/cliconnect.c:745(cli_session_setup_kerberos)
   cli_session_setup_kerberos: spnego_gen_negTokenTarg failed: KDC reply did not match expectations
[2010/05/10 09:12:26,  0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
   cli_pipe_verify_schannel: auth_len 56.
[2010/05/10 09:12:26,  1] rpc_client/cli_pipe.c:948(cli_pipe_validate_current_pdu)
   cli_pipe_validate_current_pdu: RPC fault code DCERPC fault 0x00000721 received from host ad1.casas.wsu.edu!
-------
log-wb-CASAS.old (during "crashed state"):
[2010/04/19 08:17:23,  1] libsmb/clikrb5.c:697(ads_krb5_mk_req)
   ads_krb5_mk_req: krb5_get_credentials failed for a...@casas (Cannot resolve network address for KDC in requested realm)
[2010/04/19 08:17:23,  1] libsmb/cliconnect.c:745(cli_session_setup_kerberos)
   cli_session_setup_kerberos: spnego_gen_negTokenTarg failed: Cannot resolve network address for KDC in requested realm
[2010/04/19 08:17:23,  0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
   cli_pipe_verify_schannel: auth_len 56.
[2010/04/19 08:17:23,  1] rpc_client/cli_pipe.c:948(cli_pipe_validate_current_pdu)
   cli_pipe_validate_current_pdu: RPC fault code DCERPC fault 0x00000721 received from host ad1.casas.wsu.edu!
------------
My configuration
------------
smb.conf
------------
[global]
         security = ads
         netbios name = casas-lin
         realm = CASAS.WSU.EDU
        workgroup = CASAS
         password server = ad1.casas.wsu.edu
         workgroup = CASAS
         idmap uid = 10000-20000
         idmap gid = 10000-20000
        idmap backend = rid:CASAS.WSU.EDU=10000-20000
         winbind enum users = yes
         winbind enum groups = yes
         winbind use default domain = yes
         #template homedir = /home/%U
         template homedir = /net/files/home/%U
         template shell = /bin/bash
;        client use spnego = yes
         domain master = no
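
The [global] section above sets workgroup twice. Both copies carry the same value here, so this looks cosmetic, but repeated parameters are easy to miss in a longer file. A hedged sketch of a check for them, assuming a plain smb.conf-style layout with one "key = value" per line; the function name is made up, and the matching is deliberately simple (it lowercases names but does not normalise internal spacing the way Samba does).

```shell
# Print parameters that are set more than once within the same [section]
# of an smb.conf-style file read from stdin. Lines starting with # or ; are skipped.
dup_params() {
    awk '
        /^[ \t]*[#;]/ { next }                       # comment lines
        /^[ \t]*\[/   {                              # section header
            section = $0
            gsub(/^[ \t]+|[ \t]+$/, "", section)
            next
        }
        /=/ {
            key = tolower($0)
            sub(/=.*/, "", key)                      # keep the text left of "="
            gsub(/^[ \t]+|[ \t]+$/, "", key)         # trim surrounding blanks
            if (seen[section, key]++ == 1)           # report on the second sighting only
                print section " " key
        }
    '
}
```

For example: dup_params < /etc/samba/smb.conf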
--------------
/etc/krb5.conf
-------------
[logging]
  default =FILE:/var/log/krb5libs.log
  kdc =FILE:/var/log/krb5kdc.log
  admin_server =FILE:/var/log/kadmind.log

[libdefaults]
  default_realm = CASAS.WSU.EDU
  dns_lookup_realm = false
  dns_lookup_kdc = true
  ticket_lifetime = 24h
  forwardable = yes

[realms]
  EXAMPLE.COM = {
   kdc = kerberos.example.com:88
   admin_server = kerberos.example.com:749
   default_domain = example.com
  }

  CASAS.WSU.EDU = {
   kdc = ad1.casas.wsu.edu
   admin_server = ad1.casas.wsu.edu
   kdc = ad1.casas.wsu.edu
  }

  CASAS = {
   kdc = ad1.casas.wsu.edu
   admin_server = ad1.casas.wsu.edu
   kdc = ad1.casas.wsu.edu
  }

[domain_realm]
  .example.com = EXAMPLE.COM
  example.com = EXAMPLE.COM

  casas.wsu.edu = CASAS.WSU.EDU
  .casas.wsu.edu = CASAS.WSU.EDU
[appdefaults]
  pam = {
    debug = false
    ticket_lifetime = 36000
    renew_lifetime = 36000
    forwardable = true
    krb4_convert = false
  }
---------------
/etc/pam.d/common-account
---------------
account [success=1 new_authtok_reqd=done default=ignore]        pam_unix.so
account requisite                       pam_deny.so
account required                        pam_permit.so
account sufficient                      pam_winbind.so
account required                        pam_krb5.so minimum_uid=1000
------------
/etc/pam.d/common-auth
------------
auth    [success=3 default=ignore]      pam_winbind.so krb5_auth krb5_ccache_type=FILE
auth    [success=2 default=ignore]      pam_krb5.so minimum_uid=1000 try_first_pass
auth    [success=1 default=ignore]      pam_unix.so nullok_secure try_first_pass
auth    requisite                       pam_deny.so
auth    required                        pam_permit.so
------------
/etc/pam.d/common-password
------------
password        requisite                       pam_winbind.so
password        requisite                       pam_krb5.so minimum_uid=1000 use_authtok
password        [success=1 default=ignore]      pam_unix.so obscure use_authtok try_first_pass sha512
password        requisite                       pam_deny.so
password        required                        pam_permit.so
password        optional        pam_gnome_keyring.so
-------------
/etc/nsswitch.conf
-------------
passwd:         compat winbind
group:          compat winbind
shadow:         compat

hosts:          files dns mdns4
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis
----------------

Thanks!
--Jim
