Re: [Samba] mutex.tdb locking errors on Solaris 10

2012-05-09 Thread S.Kirk

Hi,

On Mon, 30 Apr 2012, Kirk S. wrote:


Morning list,

--snip --

We have not tried the version of Samba currently shipping with Solaris 
- we've found a lot of problems with it in the past integrating with 
AD. As it currently stands, single sign-on works but there is a CNAME 
that the clients are pointing to which doesn't seem optimal but does 
work. I take it you're using the version shipped with Solaris - update 
10 to 3.5.10 I guess?


--snip--

In a further update to this problem, I have migrated the CNAME to 
another server that we have standing in for the 'faulty' instance. This 
server was previously working fairly well (although SMF had restarted 
samba at least once due to the parent smbd process crashing but no core 
file was produced) and was not generating large numbers of error messages 
regarding locking mutex.tdb.


Now I have migrated the CNAME, we have huge numbers of errors on the 
temporary server.


A minor change, deleting the CNAME and replacing it with an A 
record that points at the same IP address the server uses on its primary 
interface (without removing the existing A record for the hostname or 
modifying the PTR record), seems to have resolved the huge numbers 
of errors locking mutex.tdb and the associated performance problems we 
were experiencing.
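
For anyone wanting to check how the name their clients use actually resolves, here is a minimal Python sketch. It is an illustration only: the function names are made up for this example, and the hostname in the usage comment is hypothetical. It relies on socket.gethostbyname_ex, which returns the canonical name, a list of aliases and the addresses. Whether a CNAME shows up in the alias list depends on the resolver, so treat the result as a hint and confirm with dig or nslookup.

```python
import socket


def describe_resolution(queried, canonical, aliases, addresses):
    """Summarise a lookup result in the shape returned by gethostbyname_ex."""

    def norm(name):
        # Compare names case-insensitively and ignore a trailing dot.
        return name.rstrip(".").lower()

    # Heuristic: if the name we asked for is reported as an alias of some
    # other canonical name, the lookup probably went through a CNAME.
    via_alias = any(norm(alias) == norm(queried) for alias in aliases)
    return {
        "queried": queried,
        "canonical": canonical,
        "via_alias": via_alias,
        "addresses": list(addresses),
    }


def check_name(name):
    """Resolve a name and report whether it looks like an alias."""
    canonical, aliases, addresses = socket.gethostbyname_ex(name)
    return describe_resolution(name, canonical, aliases, addresses)


# Usage (hypothetical name, substitute the one clients connect to):
#   print(check_name("fileshare.example.com"))
```

After a change like the one described above, the expectation would be that via_alias is False for the share name while the address list is unchanged.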


It doesn't sound like anyone else is experiencing this, so I assume the 
only thing to do is to file a bug report? I've not tried 3.6.5 yet, but I 
see OpenCSW packaged it a few days ago, so I may be able to try 
it on a test machine.


Cheers,
Steve

--
To unsubscribe from this list go to the following URL and read the
instructions:  https://lists.samba.org/mailman/options/samba


Re: [Samba] mutex.tdb locking errors on Solaris 10

2012-05-09 Thread Jeremy Allison
On Wed, May 09, 2012 at 03:18:02PM +0100, S.Kirk wrote:
 It doesn't sound like anyone else is experiencing this so I assume
 the only thing to do is to file a bug report? I've not tried 3.6.5
 yet but I see it has been packaged by OpenCSW a few days ago so I
 may be able to try it on a test machine.

Yes, that would be very helpful - thanks !

Jeremy.


Re: [Samba] mutex.tdb locking errors on Solaris 10

2012-04-30 Thread Kirk S.
Morning list,

--snip --

 We have not tried the version of Samba currently shipping with Solaris - 
 we've found a lot of problems with it in the past integrating with AD. As it 
 currently stands, single sign-on works but there is a CNAME that the clients 
 are pointing to which doesn't seem optimal but does work. I take it you're 
 using the version shipped with Solaris - update 10 to 3.5.10 I guess?

--snip--

In a further update to this problem, I have migrated the CNAME to another 
server that we have standing in for the 'faulty' instance. This server was 
previously working fairly well (although SMF had restarted samba at least once 
due to the parent smbd process crashing but no core file was produced) and was 
not generating large numbers of error messages regarding locking mutex.tdb.

Now I have migrated the CNAME, we have huge numbers of errors on the temporary 
server. I suspect some sort of Kerberos error related to the CNAME record - 
previously all clients using the 'good' server were using the real hostname of 
the box. Why this appears to cause some sort of locking error on mutex.tdb, I 
have no idea currently.

Does anyone have any further suggestions on this? Or does anyone know what 
mutex.tdb is actually used for? All of the documentation I've seen leaves its 
usage blank.
 
Cheers,
Steve


[Samba] mutex.tdb locking errors on Solaris 10

2012-04-26 Thread Kirk S.
Hi,

We are experiencing a problem with Samba 3.6.4 on Solaris 10 update 10. This 
problem only started after an upgrade to v3.6.3 and was still present after 
rebuilding to 3.6.4. We are using the version of samba packaged by OpenCSW.

From a client perspective, the issue is manifested as intermittent very poor 
performance or intermittent inability to save a file to the share at all. From 
the server side, it appears that when this occurs it ties in with the 
following log message:

[2012/04/26 10:55:07.283496,  1] ../lib/util/tdb_wrap.c:65(tdb_wrap_log)
  tdb(/var/opt/csw/samba/locks/mutex.tdb): tdb_lock failed on list 2 ltype=2 (Interrupted system call)
[2012/04/26 10:55:07.283893,  0] lib/util_tdb.c:72(tdb_chainlock_with_timeout_internal)
  tdb_chainlock_with_timeout_internal: alarm (10) timed out for key replay cache mutex in tdb /var/opt/csw/samba/locks/mutex.tdb
[2012/04/26 10:55:07.284235,  1] lib/server_mutex.c:74(grab_named_mutex)
  Could not get the lock for replay cache mutex
[2012/04/26 10:55:07.284611,  1] libads/kerberos_verify.c:560(ads_verify_ticket)
  libads/kerberos_verify.c:559: unable to protect replay cache with mutex.
[2012/04/26 10:55:07.284978,  1] smbd/sesssetup.c:342(reply_spnego_kerberos)
  Failed to verify incoming ticket with error NT_STATUS_LOGON_FAILURE!
[2012/04/26 10:55:07.285300,  3] smbd/error.c:81(error_packet_set)
  error packet at smbd/sesssetup.c(344) cmd=115 (SMBsesssetupX) NT_STATUS_LOGON_FAILURE
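
To put a number on how often this happens, a small Python sketch along the following lines could count the failures per minute. It is only a sketch: the function names are made up for this example, the log path in the usage comment is hypothetical, and it assumes the two-line entry layout shown above (a bracketed timestamp header followed by an indented message).

```python
import re
from collections import Counter

# Header of a log entry, e.g. "[2012/04/26 10:55:07.284235,  1] file.c:74(func)".
# The captured group is the timestamp truncated to the minute.
HEADER = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2}")

# Message text taken from the grab_named_mutex entry in the log excerpt.
MARKER = "Could not get the lock for replay cache mutex"


def count_mutex_failures(lines):
    """Return a Counter mapping 'YYYY/MM/DD HH:MM' to the number of failures."""
    counts = Counter()
    minute = None  # minute of the most recent header line seen
    for line in lines:
        header = HEADER.match(line.lstrip())
        if header:
            minute = header.group(1)
        elif MARKER in line and minute is not None:
            counts[minute] += 1
    return counts


def count_in_file(path):
    """Convenience wrapper that reads a log file from disk."""
    with open(path, errors="replace") as handle:
        return count_mutex_failures(handle)


# Usage (hypothetical path, substitute the real smbd log file):
#   for minute, n in sorted(count_in_file("/path/to/log.smbd").items()):
#       print(minute, n)
```

Comparing the counts before and after a configuration change gives a rough measure of whether the change made any difference.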

It's not clear what the mutex.tdb file actually does or contains; none of the 
documentation I've found says what it's used for. There is clearly a problem 
obtaining a lock on this file that was not present with Samba v3.4.7 on the 
same platform. We did, however, have to patch the server in order to support 
the Samba package we are using. The cause does not appear to be anything 
obvious, such as the number of open files on the operating system, and running 
tdbtool against this particular file produces a similar problem obtaining a 
lock on the file.
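
Reading the log messages above, the failing step looks like a blocking lock request that is abandoned when a 10 second alarm fires, which would explain the "Interrupted system call" text. The Python sketch below illustrates that general pattern for testing against a scratch file. It is not Samba's code: the helper name is made up, and it assumes a POSIX system with fcntl locking and must run in the main thread because it installs a signal handler.

```python
import fcntl
import signal


class LockTimeout(Exception):
    """Raised from the SIGALRM handler to abandon a blocking lock attempt."""


def _on_alarm(signum, frame):
    raise LockTimeout()


def lock_with_timeout(fileobj, seconds):
    """Try to take an exclusive fcntl lock, giving up after `seconds`.

    Returns True if the lock was obtained, False if the alarm fired first.
    The file must be open for writing.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        # Blocks for as long as another process holds a conflicting lock.
        fcntl.lockf(fileobj, fcntl.LOCK_EX)
        return True
    except LockTimeout:
        return False
    finally:
        signal.alarm(0)  # cancel any alarm that is still pending
        if previous is not None:
            signal.signal(signal.SIGALRM, previous)
```

While one process holds the lock on a scratch file, a second process calling lock_with_timeout on the same file would be expected to return False once the timeout expires, which is roughly the situation the log describes.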

The server experiencing this problem is sun4v architecture and has its 
storage mounted via NFS from another, central file server. We are running the 
same samba package on the central server, which is sun4u, on the same build of 
Solaris with the same patch cluster, and don't see this error or a performance 
problem. This central server also has more users connected.

I have the log level set to 10 on the problematic machine currently, so I can 
supply additional log details if required. If there are any suggestions on what 
may be causing this issue and how to resolve it, that would be great.

Thanks,
Steve



Re: [Samba] mutex.tdb locking errors on Solaris 10

2012-04-26 Thread Gaiseric Vandal
Is this machine a member server or DC?  This looks like an
authentication issue.  You could try enabling the Solaris nscd (name
service caching daemon) on member servers to help with flaky
authentication connections to a DC.

Do you have the same problem with non-NFS mounted directories?  Are you
using autofs mounts?  I found that resharing NFS mounts in samba
became more trouble than it was worth, although it definitely has uses
when you want to have directories from 2 or more servers appear in a single
logical windows directory.  The alternative is to define Windows level
symlinks.

Have you tried the version of samba bundled with Solaris 10 (Samba
version 3.5.10)?  I had a lot of aggravation either compiling samba
from source on solaris or using the Sunfreeware or Blastwave versions
(e.g. ZFS support, 64-bit support, proper integration with Solaris LDAP
and Kerberos, winbind support).



Re: [Samba] mutex.tdb locking errors on Solaris 10

2012-04-26 Thread Kirk S.
Hi,

This is a member server. We also already have nscd running on our Solaris 
systems.

We are using autofs to mount the filesystems. The main reason this was 
implemented originally is that we had a performance problem when all of these 
clients were served from a single machine. I'm not 100% sure that we 
identified the problem correctly; however, it appeared to be a problem scaling 
samba beyond a certain number of concurrent connections or smbd processes, 
hence an extra machine to serve samba connections.

We have not tried the version of Samba currently shipping with Solaris - we've 
found a lot of problems with it in the past integrating with AD. As it 
currently stands, single sign-on works but there is a CNAME that the clients 
are pointing to which doesn't seem optimal but does work. I take it you're 
using the version shipped with Solaris - update 10 to 3.5.10 I guess?

Thanks,
Steve


Re: [Samba] mutex.tdb locking errors on Solaris 10

2012-04-26 Thread Gaiseric Vandal
I am using Samba 3.5.10 bundled with Solaris 10 (with one of the more
recent patch clusters).  I have a samba PDC so I can't speak to AD
issues.  I have never tried joining samba to an AD domain.

autofs mounts mount on demand, but they also unmount when inactive.
This, I find, can make samba shares based on autofs shares slow to
respond, especially if there are a lot of autofs directories.  But
then again, autofs is less likely to cause the NFS client machine to
hang if the server machine is offline.



