On 10/05/2010 19:14, Jim Kusznir wrote:
Hi all:
I've got a couple of Ubuntu 9.10 machines that are suffering from a
recurring failure of winbind that essentially crashes the machine. When
the system is in the "crashed state", one can ping the system, but all
forms of login fail.
That is expected: once winbind stops working, every service that
authenticates through PAM is out of service.
It will not even respond to tftpd requests; ssh
connections "time out", but the initial port is opened (just no
connect). Rebooting does NOT recover from this; in order to recover,
I need to:
1) reboot into single user mode
Do you have enough free space on your partitions at this point?
2) edit /etc/nsswitch.conf and remove winbind
3) remove winbind from all pam.d/*
4) boot normally
5) stop samba and winbind
6) delete /var/lib/samba/* and /var/cache/samba/*
7) start samba
8) rejoin domain
9) start winbind
10) undo #2 and 3 above
After this, winbind will work for a week or two. If I stop after step
4 above the system is usable, but without domain users able to log in.
My diagnostics show that net ads users (and all other "samba"
commands) work just fine and find all users. All winbind-specific
commands (wbinfo -u, etc) fail. Oh, if I leave the system up in the
crashed state, it begins to fill up logs to the tune of 32gigs in a
few days. The above procedure repeats approximately once every 5 days
on our main production system. I have a second workstation that sees
very little use, and it has suffered the same crash, but far less
frequently. I have also tried inserting step 6.5 where I delete the
machine account on the DC, but that doesn't change anything. Also,
our Ubuntu 9.04 system running the same configuration files has no
issues. We have not tried 10.04.
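Steps 5 to 9 of the procedure above could be scripted roughly as follows. This is only a sketch, not a tested procedure: it assumes the init scripts are /etc/init.d/samba and /etc/init.d/winbind and that the rejoin is done with "net ads join -U". The helper names and the account name are placeholders, not taken from the original post. It prints the commands unless APPLY=1 is set, and it leaves the nsswitch.conf and pam.d edits (steps 2, 3 and 10) to be done by hand.

```shell
#!/bin/sh
# Sketch of recovery steps 5-9. Dry run by default: commands are printed,
# not executed. Set APPLY=1 (as root) to really run them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# $1: AD account allowed to join machines (placeholder, not from the post)
reset_winbind() {
    join_user="$1"
    # Assumption: init script names as shipped by the samba/winbind packages.
    run /etc/init.d/winbind stop                             # step 5
    run /etc/init.d/samba stop                               # step 5
    run sh -c 'rm -rf /var/lib/samba/* /var/cache/samba/*'   # step 6
    run /etc/init.d/samba start                              # step 7
    run net ads join -U "$join_user"                         # step 8
    run /etc/init.d/winbind start                            # step 9
}
```

Calling "reset_winbind Administrator" without APPLY=1 just lists the six commands, so the sequence can be reviewed before anything is deleted.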
This problem has been plaguing our operations for over two months now,
so any assistance would be greatly appreciated.
Some log file snippets:
(from some point "in the middle" of the crash):
May 7 15:32:45 casas-lin winbindd[20677]: sys_select: pipe failed (Too many open files)
"Too many open files" means your system has reached its limit on open files.
Try the lsof command to see which process has too many files open.
lsof|wc -l
to see how many files are open
lsof|less
to see all open files
cat /proc/sys/fs/file-max
to see the system limit
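One caveat: error 24 in the log excerpt below is EMFILE, the per-process limit, whereas /proc/sys/fs/file-max is the system-wide limit, so it is probably worth looking at winbindd itself as well. A rough sketch, assuming a Linux /proc filesystem and that pgrep is installed; the function names are made up for illustration:

```shell
#!/bin/sh
# Number of file descriptors a process currently holds (Linux /proc only).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Soft "Max open files" limit of a process, read from /proc/<pid>/limits.
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits" 2>/dev/null
}

# One line per winbindd process. Run as root so that the /proc entries
# of processes owned by other users are readable.
report_winbindd() {
    for pid in $(pgrep -x winbindd 2>/dev/null); do
        echo "winbindd pid $pid: $(count_fds "$pid") open, soft limit $(soft_nofile_limit "$pid")"
    done
}
```

Running report_winbindd every few hours (from cron, for example) should show whether one of the winbindd processes climbs steadily toward its limit in the days before the failure.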
May 7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45, 0] lib/events.c:287(s3_event_debug)
May 7 15:32:45 casas-lin winbindd[20677]: s3_event: sys_select() failed: 24:Too many open files
May 7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45, 0] lib/select.c:64(sys_select)
May 7 15:32:45 casas-lin winbindd[20677]: [2010/05/07 15:32:45, 0] lib/debug.c:663(reopen_logs)
May 7 15:32:45 casas-lin winbindd[20677]: Unable to open new log file /var/log/samba/log.wb-CASAS: Too many open files
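To see when the descriptor exhaustion started and how often it is being logged, the log file can be scanned for the message. A small sketch; the function names are made up, and the path you pass in (for example /var/log/syslog) is an assumption about where these lines end up:

```shell
#!/bin/sh
# Print the first log line that mentions the error (shows when it started).
first_emfile() {
    grep -m 1 'Too many open files' "$1"
}

# Count the log lines that mention the error.
count_emfile() {
    grep -c 'Too many open files' "$1"
}
```

For example, "first_emfile /var/log/syslog" gives the timestamp of the first failure, which can then be compared with the time winbindd was last started.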
------
From startup (step 4 above):
May 10 08:36:50 casas-lin kernel: May 10 08:38:42 casas-lin winbindd[1571]: [2010/05/10 08:38:42, 0] libsmb/smb_signing.c:255(signing_good)
May 10 08:38:42 casas-lin winbindd[1571]: signing_good: BAD SIG: seq 41
May 10 08:42:25 casas-lin winbindd[1562]: [2010/05/10 08:42:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:42:25 casas-lin winbindd[1562]: async_request_timeout_handler: child pid 1571 is not responding. Closing connection to it.
May 10 08:42:25 casas-lin winbindd[1571]: [2010/05/10 08:42:25, 0] winbindd/winbindd.c:190(winbindd_sig_term_handler)
May 10 08:42:25 casas-lin winbindd[1571]: Got sig[15] terminate (is_parent=0)
May 10 08:42:25 casas-lin winbindd[1825]: [2010/05/10 08:42:25, 0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
May 10 08:42:25 casas-lin winbindd[1825]: cli_pipe_verify_schannel: auth_len 56.
May 10 08:43:37 casas-lin winbindd[1825]: [2010/05/10 08:43:37, 0] libsmb/smb_signing.c:255(signing_good)
May 10 08:43:37 casas-lin winbindd[1825]: signing_good: BAD SIG: seq 23
May 10 08:47:25 casas-lin winbindd[1562]: [2010/05/10 08:47:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:47:25 casas-lin winbindd[1562]: async_request_timeout_handler: child pid 1825 is not responding. Closing connection to it.
May 10 08:47:25 casas-lin winbindd[1825]: [2010/05/10 08:47:25, 0] winbindd/winbindd.c:190(winbindd_sig_term_handler)
May 10 08:47:25 casas-lin winbindd[1825]: Got sig[15] terminate (is_parent=0)
May 10 08:47:25 casas-lin winbindd[1832]: [2010/05/10 08:47:25, 0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
May 10 08:47:25 casas-lin winbindd[1832]: cli_pipe_verify_schannel: auth_len 56.
May 10 08:48:38 casas-lin winbindd[1832]: [2010/05/10 08:48:38, 0] libsmb/smb_signing.c:255(signing_good)
May 10 08:48:38 casas-lin winbindd[1832]: signing_good: BAD SIG: seq 23
May 10 08:52:25 casas-lin winbindd[1562]: [2010/05/10 08:52:25, 0] winbindd/winbindd_dual.c:186(async_request_timeout_handler)
May 10 08:52:25 casas-lin winbindd[1562]: async_request_timeout_handler: child pid 1832 is not responding. Closing connection to it.
May 10 08:52:25 casas-lin winbindd[1832]: [2010/05/10 08:52:25, 0] winbindd/winbindd.c:190(winbindd_sig_term_handler)
---------
log.wb-CASAS (my domain is CASAS.WSU.EDU)
[2010/05/10 09:12:26, 1] libsmb/clikrb5.c:697(ads_krb5_mk_req)
ads_krb5_mk_req: krb5_get_credentials failed for a...@casas (KDC reply did not match expectations)
[2010/05/10 09:12:26, 1] libsmb/cliconnect.c:745(cli_session_setup_kerberos)
cli_session_setup_kerberos: spnego_gen_negTokenTarg failed: KDC reply did not match expectations
[2010/05/10 09:12:26, 0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
cli_pipe_verify_schannel: auth_len 56.
[2010/05/10 09:12:26, 1] rpc_client/cli_pipe.c:948(cli_pipe_validate_current_pdu)
cli_pipe_validate_current_pdu: RPC fault code DCERPC fault 0x00000721 received from host ad1.casas.wsu.edu!
-------
log-wb-CASAS.old (during "crashed state"):
[2010/04/19 08:17:23, 1] libsmb/clikrb5.c:697(ads_krb5_mk_req)
ads_krb5_mk_req: krb5_get_credentials failed for a...@casas (Cannot resolve network address for KDC in requested realm)
[2010/04/19 08:17:23, 1] libsmb/cliconnect.c:745(cli_session_setup_kerberos)
cli_session_setup_kerberos: spnego_gen_negTokenTarg failed: Cannot resolve network address for KDC in requested realm
[2010/04/19 08:17:23, 0] rpc_client/cli_pipe.c:687(cli_pipe_verify_schannel)
cli_pipe_verify_schannel: auth_len 56.
[2010/04/19 08:17:23, 1] rpc_client/cli_pipe.c:948(cli_pipe_validate_current_pdu)
cli_pipe_validate_current_pdu: RPC fault code DCERPC fault 0x00000721 received from host ad1.casas.wsu.edu!
------------
My configuration
------------
smb.conf
------------
[global]
security = ads
netbios name = casas-lin
realm = CASAS.WSU.EDU
workgroup = CASAS
password server = ad1.casas.wsu.edu
workgroup = CASAS
idmap uid = 10000-20000
idmap gid = 10000-20000
idmap backend = rid:CASAS.WSU.EDU=10000-20000
winbind enum users = yes
winbind enum groups = yes
winbind use default domain = yes
#template homedir = /home/%U
template homedir = /net/files/home/%U
template shell = /bin/bash
; client use spnego = yes
domain master = no
--------------
/etc/krb5.conf
-------------
[logging]
default =FILE:/var/log/krb5libs.log
kdc =FILE:/var/log/krb5kdc.log
admin_server =FILE:/var/log/kadmind.log
[libdefaults]
default_realm = CASAS.WSU.EDU
dns_lookup_realm = false
dns_lookup_kdc = true
ticket_lifetime = 24h
forwardable = yes
[realms]
EXAMPLE.COM = {
kdc = kerberos.example.com:88
admin_server = kerberos.example.com:749
default_domain = example.com
}
CASAS.WSU.EDU = {
kdc = ad1.casas.wsu.edu
admin_server = ad1.casas.wsu.edu
kdc = ad1.casas.wsu.edu
}
CASAS = {
kdc = ad1.casas.wsu.edu
admin_server = ad1.casas.wsu.edu
kdc = ad1.casas.wsu.edu
}
[domain_realm]
.example.com = EXAMPLE.COM
example.com = EXAMPLE.COM
casas.wsu.edu = CASAS.WSU.EDU
.casas.wsu.edu = CASAS.WSU.EDU
[appdefaults]
pam = {
debug = false
ticket_lifetime = 36000
renew_lifetime = 36000
forwardable = true
krb4_convert = false
}
---------------
/etc/pam.d/common-account
---------------
account [success=1 new_authtok_reqd=done default=ignore] pam_unix.so
account requisite pam_deny.so
account required pam_permit.so
account sufficient pam_winbind.so
account required pam_krb5.so minimum_uid=1000
------------
/etc/pam.d/common-auth
------------
auth [success=3 default=ignore] pam_winbind.so krb5_auth krb5_ccache_type=FILE
auth [success=2 default=ignore] pam_krb5.so minimum_uid=1000 try_first_pass
auth [success=1 default=ignore] pam_unix.so nullok_secure try_first_pass
auth requisite pam_deny.so
auth required pam_permit.so
------------
/etc/pam.d/common-password
------------
password requisite pam_winbind.so
password requisite pam_krb5.so minimum_uid=1000 use_authtok
password [success=1 default=ignore] pam_unix.so obscure use_authtok try_first_pass sha512
password requisite pam_deny.so
password required pam_permit.so
password optional pam_gnome_keyring.so
-------------
/etc/nsswitch.conf
-------------
passwd: compat winbind
group: compat winbind
shadow: compat
hosts: files dns mdns4
networks: files
protocols: db files
services: db files
ethers: db files
rpc: db files
netgroup: nis
----------------
Thanks!
--Jim