Hi Friends,

I am running Samba on a 64-bit RHEL 4.4 server, an HP ProLiant with an AMD Opteron(tm) Processor 254 and 4 GB RAM. About 2000 users access Samba shares (for example, their home directories) through LDAP authentication. Most of the users are on Windows XP SP2. For the last few days we have been seeing errors in the log file, and the Samba shares have become difficult to access.
fs1-3 smbd[4540]: [2007/12/27 15:00:32, 0] auth/auth_domain.c:domain_client_validate(199)
fs1-3 smbd[4540]: domain_client_validate: unable to validate password for user ankush in domain testing to Domain controller \\DC. Error was NT_STATUS_WRONG_PASSWORD.
Dec 27 15:28:49 fs1-3 smbd[17243]: Error writing 4 bytes to client. -1. (Connection reset by peer)
Dec 27 15:28:49 fs1-3 smbd[17420]: write_socket: Error writing 322 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:49 fs1-3 smbd[17421]: write_socket: Error writing 322 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:49 fs1-3 smbd[17541]: [2007/12/27 15:28:49, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:49 fs1-3 smbd[17422]: write_socket: Error writing 322 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:49 fs1-3 smbd[17423]: write_socket: Error writing 322 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:49 fs1-3 smbd[17424]: write_socket: Error writing 322 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:49 fs1-3 smbd[17217]: [2007/12/27 15:28:49, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:49 fs1-3 smbd[17538]: [2007/12/27 15:28:49, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:49 fs1-3 smbd[17377]: tdb_chainlock_with_timeout_internal: alarm (10) timed out for key DC in tdb /etc/samba/secrets.tdb
Dec 27 15:29:01 fs1-3 smbd[17563]: [2007/12/27 15:29:01, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:49 fs1-3 smbd[17207]: [2007/12/27 15:28:49, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:50 fs1-3 smbd[17220]: [2007/12/27 15:28:50, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:50 fs1-3 smbd[17380]: tdb_chainlock_with_timeout_internal: alarm (10) timed out for key DC in tdb /etc/samba/secrets.tdb
Dec 27 15:28:50 fs1-3 smbd[17381]: tdb_chainlock_with_timeout_internal: alarm (10) timed out for key DC in tdb /etc/samba/secrets.tdb
Dec 27 15:28:50 fs1-3 smbd[17222]: [2007/12/27 15:28:50, 0]
lib/util_sock.c:send_smb(647)
Dec 27 15:28:50 fs1-3 smbd[17383]: tdb_chainlock_with_timeout_internal: alarm (10) timed out for key DC in tdb /etc/samba/secrets.tdb
Dec 27 15:28:50 fs1-3 smbd[17219]: [2007/12/27 15:28:50, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:50 fs1-3 smbd[17504]: getpeername failed. Error was Transport endpoint is not connected
Dec 27 15:28:50 fs1-3 smbd[17384]: tdb_chainlock_with_timeout_internal: alarm (10) timed out for key DC in tdb /etc/samba/secrets.tdb
Dec 27 15:28:50 fs1-3 smbd[17428]: [2007/12/27 15:28:50, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:50 fs1-3 smbd[17509]: getpeername failed. Error was Transport endpoint is not connected
Dec 27 15:28:50 fs1-3 smbd[17448]: [2007/12/27 15:28:50, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:50 fs1-3 smbd[17386]: tdb_chainlock_with_timeout_internal: alarm (10) timed out for key DC in tdb /etc/samba/secrets.tdb
Dec 27 15:28:50 fs1-3 smbd[17223]: [2007/12/27 15:28:50, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:29:01 fs1-3 smbd[17223]: write_socket: Error writing 122 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:29:01 fs1-3 smbd[17223]: [2007/12/27 15:29:01, 0] lib/util_sock.c:send_smb(647)
Dec 27 15:28:50 fs1-3 smbd[17481]: [2007/12/27 15:28:50, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:50 fs1-3 smbd[17429]: [2007/12/27 15:28:50, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:50 fs1-3 smbd[17483]: [2007/12/27 15:28:50, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:50 fs1-3 smbd[17435]: write_socket: Error writing 171 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:50 fs1-3 smbd[17486]: [2007/12/27 15:28:50, 0] lib/util_sock.c:write_socket_data(430)
Dec 27 15:28:50 fs1-3 smbd[17485]: [2007/12/27 15:28:50, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:50 fs1-3 smbd[17436]: write_socket: Error writing 171 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:50 fs1-3 smbd[17430]: [2007/12/27 15:28:50, 0]
lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:50 fs1-3 smbd[17547]: [2007/12/27 15:28:50, 0] lib/util_sock.c:write_socket_data(430)
Dec 27 15:28:50 fs1-3 smbd[17261]: write_socket_data: write failure. Error = Connection reset by peer
Dec 27 15:28:51 fs1-3 smbd[17431]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17432]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17224]: [2007/12/27 15:28:51, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:51 fs1-3 smbd[17439]: getpeername failed. Error was Transport endpoint is not connected
Dec 27 15:28:51 fs1-3 smbd[17434]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17437]: write_socket: Error writing 171 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:51 fs1-3 smbd[17438]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17238]: [2007/12/27 15:28:51, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:51 fs1-3 smbd[17558]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17148]: [2007/12/27 15:28:51, 0] lib/util_sock.c:send_smb(647)
Dec 27 15:28:51 fs1-3 smbd[17265]: [2007/12/27 15:28:51, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:51 fs1-3 smbd[17275]: [2007/12/27 15:28:51, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:51 fs1-3 smbd[17449]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17366]: Error writing 322 bytes to client. -1. (Connection reset by peer)
Dec 27 15:28:51 fs1-3 smbd[17450]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17264]: [2007/12/27 15:28:51, 0] lib/util_sock.c:write_socket_data(430)
Dec 27 15:28:51 fs1-3 smbd[17452]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17511]: write_socket_data: write failure.
Error = Connection reset by peer
Dec 27 15:28:51 fs1-3 smbd[17512]: write_socket_data: write failure. Error = Connection reset by peer
Dec 27 15:28:51 fs1-3 smbd[17513]: write_socket_data: write failure. Error = Connection reset by peer
Dec 27 15:28:51 fs1-3 smbd[17514]: write_socket_data: write failure. Error = Connection reset by peer
Dec 27 15:28:51 fs1-3 smbd[17516]: write_socket_data: write failure. Error = Connection reset by peer
Dec 27 15:28:51 fs1-3 smbd[17451]: [2007/12/27 15:28:51, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:51 fs1-3 smbd[17149]: [2007/12/27 15:28:51, 0] lib/util_sock.c:write_socket(455)
Dec 27 15:28:51 fs1-3 smbd[4292]: read_socket_data: recv failure for 4. Error = Connection timed out
Dec 27 15:28:51 fs1-3 smbd[17160]: Error writing 122 bytes to client. -1. (Connection reset by peer)
Dec 27 15:28:52 fs1-3 smbd[17453]: [2007/12/27 15:28:52, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:52 fs1-3 smbd[17517]: getpeername failed. Error was Transport endpoint is not connected
Dec 27 15:28:52 fs1-3 smbd[17454]: [2007/12/27 15:28:52, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:52 fs1-3 smbd[17365]: Error writing 171 bytes to client. -1. (Connection reset by peer)
Dec 27 15:28:52 fs1-3 smbd[17367]: Error writing 171 bytes to client. -1. (Connection reset by peer)
Dec 27 15:28:52 fs1-3 smbd[17368]: write_socket: Error writing 4 bytes to socket 5: ERRNO = Connection reset by peer
Dec 27 15:28:52 fs1-3 smbd[17458]: [2007/12/27 15:28:52, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:52 fs1-3 smbd[17470]: [2007/12/27 15:28:52, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:52 fs1-3 smbd[17271]: write_socket_data: write failure.
Error = Connection reset by peer
Dec 27 15:28:52 fs1-3 smbd[17463]: [2007/12/27 15:28:52, 0] lib/util_sock.c:get_peer_addr(1000)
Dec 27 15:28:52 fs1-3 smbd[17456]: [2007/12/27 15:28:52, 0] lib/util_sock.c:get_peer_addr(1000)

Searching on Google, I found some articles suggesting blocking port 445 with iptables and setting smb ports = 139 in smb.conf. I have done both, but I am still getting these errors. Most of the time there are more than 5000 smb connections/processes on the server, and the load is also high. Even if I kill these connections and restart Samba, within a few minutes I see more than 3000 Samba connections again, although no more than 2000 users access the Samba server at the same time.

ps -efm | grep smb | wc -l
6827

Samba configuration:

# testparm
Load smb config files from /etc/samba/smb.conf
Processing section "[homes]"
Loaded services file OK.
Server role: ROLE_DOMAIN_MEMBER
Press enter to see a dump of your service definitions

# Global parameters
[global]
        workgroup = testing
        realm = testing.com
        server string = Home Folders
        security = ADS
        password server = dc.testing.com
        smb ports = 139
        socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192 SO_KEEPALIVE
        load printers = No
        printcap name = /etc/printcap
        dns proxy = No
        idmap uid = 16777216-33554431
        idmap gid = 16777216-33554431
        cups options = raw

[homes]
        comment = Home Directories
        read only = No
        browseable = No

Samba RPMs installed on the server:

samba-3.0.10-1.4E.9
samba-common-3.0.10-1.4E.9
system-config-samba-1.2.21-1
samba-common-3.0.10-1.4E.9
samba-client-3.0.10-1.4E.9

Kindly suggest a way to get rid of these errors.

Thanks & Regards,
Ankush
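P.S. In case it helps anyone summarize a log like the one above, here is a minimal Python sketch that tallies smbd messages by type and counts the distinct smbd PIDs that logged something. It only assumes the "host smbd[pid]: message" layout seen in the excerpt. The function name tally_smbd_messages and the step that replaces digits with N (so that "322 bytes" and "171 bytes" group together) are illustrative choices of mine, not anything Samba provides, and the log path passed on the command line is whatever file syslog writes to on your system.

```python
import re
import sys
from collections import Counter

# Captures the PID and the message text from "... smbd[17420]: message".
ENTRY = re.compile(r"smbd\[(\d+)\]:\s*(.*)$")
# Samba debug headers such as "[2007/12/27 15:28:49, 0] lib/util_sock.c:...".
HEADER = re.compile(r"^\[\d{4}/\d\d/\d\d [\d:]+,\s*\d+\]")


def tally_smbd_messages(lines):
    """Return (Counter of normalized messages, set of PIDs seen)."""
    counts = Counter()
    pids = set()
    for line in lines:
        match = ENTRY.search(line.strip())
        if not match:
            continue  # wrapped continuation lines and non-smbd lines
        pid, message = match.groups()
        pids.add(pid)
        if HEADER.match(message):
            continue  # header lines only carry the source location
        # Replace digit runs so messages differing only in sizes group together.
        counts[re.sub(r"\d+", "N", message)] += 1
    return counts, pids


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log_file:
        message_counts, seen_pids = tally_smbd_messages(log_file)
    print(f"{len(seen_pids)} distinct smbd PIDs")
    for message, count in message_counts.most_common():
        print(f"{count:6d}  {message}")
```

Run it with the syslog file as the only argument; the output is one line per message type, most frequent first.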