Hi,
I am having intermittent problems that trigger the segfault below. I would love to know what the problem is, and will gladly take any suggestions on how to troubleshoot it. When this segfault happens, the Windows clients sometimes report "write delay failed".

This machine is an OpenVZ virtual machine, with a bridged interface.
Samba version: 3.0.22-1ubuntu3.6 (ubuntu dapper with custom OpenVZ kernel)
kernel: 2.6.18-028stab053
Filesystem is xfs with noatime on.
mount output:
simfs on / type simfs (rw,noatime)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw)
tmpfs on /dev/shm type tmpfs (rw)
varrun on /var/run type tmpfs (rw)
varlock on /var/lock type tmpfs (rw)

The XFS filesystem sits on top of a RAID1 md device on the host, with the members provided by AoE.

I don't think the segfaults correspond to the occasional AoE retransmits we get. All networking is gigabit, and this system was stress tested with dbench before going into production and worked pretty well. The host has plenty of RAM and nothing else segfaults, so I don't think it's a hardware problem, though I wouldn't rule out an interaction problem between Samba/XFS/OpenVZ/md-RAID1/AoE at this stage.

As you can see from the testparm output below, the machine points at a Windows server for authentication. I recently increased the winbind cache time to 900 in an effort to fix this, but to no avail. Across about 150 client machines we are getting about 10 segfaults a day, each with this error. There is no particular pattern to the segfaults, and nothing stands out about the users/uids affected (it seems to be anyone and everyone at random).
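To double-check the "no pattern" impression, the panics in the per-machine logs could be tallied by hour. This is only a rough Python sketch: it assumes each log entry starts with a header like `[YYYY/MM/DD HH:MM:SS, level] file:function(line)` (as I would expect with "debug timestamp = Yes") and that the panic text "failed to set uid" appears on a following line. The name panics_per_hour is made up for illustration.

```python
import re
from collections import Counter

# Assumed header format of an smbd log entry, e.g.
#   [YYYY/MM/DD HH:MM:SS, 0] lib/util.c:smb_panic2(1545)
HEADER = re.compile(r"^\[(\d{4}/\d{2}/\d{2}) (\d{2}):\d{2}:\d{2}, *\d+\]")

# Text of the panic message taken from the backtrace (why="failed to set uid\n").
PANIC_TEXT = "failed to set uid"


def panics_per_hour(lines):
    """Count message lines containing PANIC_TEXT, grouped by date and hour."""
    counts = Counter()
    current_hour = None
    for line in lines:
        match = HEADER.match(line)
        if match:
            # Remember the hour of the most recent entry header.
            current_hour = f"{match.group(1)} {match.group(2)}:00"
        elif PANIC_TEXT in line and current_hour is not None:
            counts[current_hour] += 1
    return counts
```

Feeding it the lines of each file under /var/log/samba/ (the "log file" setting is /var/log/samba/log.%m) and summing the counters would show whether the panics cluster at particular times. If the message is logged more than once per panic, the counts will be inflated accordingly.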

Setting loglevel to 10 for a specific machine doesn't seem to provide any extra information for this problem.

I am intermittently getting this segfault, which triggers the panic action:

Using host libthread_db library "/lib/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread 47979553320704 (LWP 7785)]
0x00002ba31bd930c4 in waitpid () from /lib/libc.so.6
#0  0x00002ba31bd930c4 in waitpid () from /lib/libc.so.6
#1  0x00002ba31bd3d5ff in strtold_l () from /lib/libc.so.6
#2  0x00000000005c1374 in smb_panic2 (why=0x6ac813 "failed to set uid\n", decrement_pid_count=<value optimized out>)
   at lib/util.c:1545
#3  0x00000000005c60d8 in assert_uid (ruid=4294967295, euid=10122)
   at lib/util_sec.c:96
#4  0x000000000049235e in become_id (uid=10122, gid=10000) at smbd/sec_ctx.c:60
#5  0x0000000000492c8c in pop_sec_ctx () at smbd/sec_ctx.c:375
#6  0x000000000048a579 in unbecome_root () at smbd/uid.c:435
#7  0x00000000005eeaf1 in reply_to_oplock_break_requests (fsp=0x8de810)
   at smbd/oplock.c:683
#8  0x0000000000490601 in close_file (fsp=0x8de810, normal_close=1)
   at smbd/close.c:228
#9  0x0000000000471ba7 in reply_close (conn=0x8d8390, inbuf=0x2ba31a5a3010 "", outbuf=0x2ba31a5c4010 "", size=<value optimized out>, dum_buffsize=<value optimized out>)
   at smbd/reply.c:3286
#10 0x00000000004a111a in switch_message (type=4, inbuf=0x2ba31a5a3010 "", outbuf=0x2ba31a5c4010 "", size=45, bufsize=131072)
   at smbd/process.c:1071
#11 0x00000000004a15f0 in process_smb (inbuf=0x2ba31a5a3010 "", outbuf=0x2ba31a5c4010 "")
   at smbd/process.c:1101
#12 0x00000000004a24a4 in smbd_process () at smbd/process.c:1753
#13 0x0000000000645a39 in main (argc=23, argv=0x7fff905b09a0)
   at smbd/server.c:976
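
A note on how I read frame #3: as far as I can tell (this is my assumption, not something I have verified against the Samba source), assert_uid panics when the process ids after a uid switch do not match what was requested, and ruid=4294967295 is just (uid_t)-1, meaning "don't check this one". A minimal Python sketch of that logic, assuming a 32-bit uid_t and using made-up names (UID_NO_CHANGE, uid_check_fails):

```python
# (uid_t)-1 on a platform with a 32-bit uid_t (assumption).
UID_NO_CHANGE = 0xFFFFFFFF


def uid_check_fails(want_ruid, want_euid, actual_ruid, actual_euid):
    """Return True when the actual ids do not match the wanted ones.

    A wanted value of UID_NO_CHANGE means that id is not checked.
    """
    ruid_bad = want_ruid != UID_NO_CHANGE and actual_ruid != want_ruid
    euid_bad = want_euid != UID_NO_CHANGE and actual_euid != want_euid
    return ruid_bad or euid_bad
```

With the values from the backtrace, uid_check_fails(4294967295, 10122, actual_ruid, actual_euid) is true whenever the effective uid is anything other than 10122. If that reading is right, the panic would mean the switch back to uid 10122 in become_id did not take effect.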


I have tried to reproduce the problem by stressing the system further, but to no avail. It does not appear related to system load etc, but happens at random.

The machine is a profile server for Firefox and Thunderbird, so lots of small files are opening and closing etc.

Here is my testparm -v output (domain replaced with XXX):

[EMAIL PROTECTED]:/var/log/samba # testparm -v
Load smb config files from /etc/samba/smb.conf
Can't find include file /etc/samba/smb.conf.
Processing section "[homes]"
Processing section "[public]"
Processing section "[win]"
Processing section "[vmware]"
Processing section "[gap_backup]"
Processing section "[iso]"
Processing section "[pictures]"
Loaded services file OK.
WARNING: passdb expand explicit = yes is deprecated
'winbind separator = +' might cause problems with group membership.
Server role: ROLE_DOMAIN_MEMBER
Press enter to see a dump of your service definitions

[global]
       dos charset = CP850
       unix charset = UTF-8
       display charset = LOCALE
       workgroup = XXX
       realm = XXX.XXX.COM.AU
       netbios name = MRFORGETFUL
       netbios aliases =
       netbios scope =
       server string =
       interfaces =
       bind interfaces only = No
       security = ADS
       auth methods =
       encrypt passwords = Yes
       update encrypted = No
       client schannel = Auto
       server schannel = Auto
       allow trusted domains = Yes
       hosts equiv =
       map to guest = Never
       null passwords = No
       obey pam restrictions = Yes
       password server = xxxserver
       smb passwd file = /etc/samba/smbpasswd
       private dir = /etc/samba
       passdb backend = smbpasswd
       algorithmic rid base = 10000
       root directory =
       guest account = nobody
       enable privileges = No
       pam password change = No
       passwd program = /usr/bin/passwd %u
       passwd chat = *Enter\snew\sUNIX\spassword:* %n\n *Retype\snew\sUNIX\spassword:* %n\n .
       passwd chat debug = No
       passwd chat timeout = 2
       check password script =
       username map =
       password level = 0
       username level = 0
       unix password sync = No
       restrict anonymous = 0
       lanman auth = Yes
       ntlm auth = Yes
       client NTLMv2 auth = No
       client lanman auth = Yes
       client plaintext auth = Yes
       preload modules =
       use kerberos keytab = No
       log level = 0
       syslog = 0
       syslog only = No
       log file = /var/log/samba/log.%m
       max log size = 1000
       debug timestamp = Yes
       debug hires timestamp = No
       debug pid = No
       debug uid = No
       smb ports = 445 139
       large readwrite = Yes
       max protocol = NT1
       min protocol = CORE
       read bmpx = No
       read raw = Yes
       write raw = Yes
       disable netbios = No
       reset on zero vc = No
       acl compatibility =
       defer sharing violations = Yes
       nt pipe support = Yes
       nt status support = Yes
       announce version = 4.9
       announce as = NT
       max mux = 50
       max xmit = 16644
       name resolve order = lmhosts wins host bcast
       max ttl = 259200
       max wins ttl = 518400
       min wins ttl = 21600
       time server = No
       unix extensions = Yes
       use spnego = Yes
       client signing = No
       server signing = No
       client use spnego = No
       enable asu support = Yes
       svcctl list =
       change notify timeout = 60
       deadtime = 0
       getwd cache = Yes
       keepalive = 300
       kernel change notify = Yes
       lpq cache time = 30
       max smbd processes = 0
       paranoid server security = Yes
       max disk size = 0
       max open files = 10000
       socket options = TCP_NODELAY
       use mmap = Yes
       hostname lookups = No
       name cache timeout = 660
       load printers = No
       printcap cache time = 750
       printcap name = /dev/null
       cups server =
       iprint server =
       disable spoolss = Yes
       enumports command =
       addprinter command =
       deleteprinter command =
       show add printer wizard = Yes
       os2 driver map =
       mangling method = hash2
       mangle prefix = 1
       max stat cache size = 0
       stat cache = Yes
       machine password timeout = 604800
       add user script =
       rename user script =
       delete user script =
       add group script =
       delete group script =
       add user to group script =
       delete user from group script =
       set primary group script =
       add machine script =
       shutdown script =
       abort shutdown script =
       username map script =
       logon script =
       logon path = \\%N\%U\profile
       logon drive =
       logon home = \\%N\%U
       domain logons = No
       os level = 20
       lm announce = Auto
       lm interval = 60
       preferred master = Auto
       local master = Yes
       domain master = Auto
       browse list = Yes
       enhanced browsing = Yes
       dns proxy = No
       wins proxy = No
       wins server =
       wins support = No
       wins hook =
       wins partners =
       kernel oplocks = Yes
       lock spin count = 3
       lock spin time = 10
       oplock break wait time = 0
       ldap admin dn =
       ldap delete dn = No
       ldap group suffix =
       ldap idmap suffix =
       ldap machine suffix =
       ldap passwd sync = no
       ldap replication sleep = 1000
       ldap suffix =
       ldap ssl =
       ldap timeout = 15
       ldap page size = 1024
       ldap user suffix =
       add share command =
       change share command =
       delete share command =
       eventlog list =
       config file =
       preload =
       lock directory =
       pid directory = /var/run/samba
       utmp directory =
       wtmp directory =
       utmp = No
       default service =
       message command =
       get quota command =
       set quota command =
       remote announce =
       remote browse sync =
       socket address = 0.0.0.0
       homedir map = auto.home
       afs username map =
       afs token lifetime = 604800
       log nt token command =
       time offset = 0
       NIS homedir = No
       panic action = /usr/share/samba/panic-action %d
       host msdfs = No
       enable rid algorithm = Yes
       passdb expand explicit = Yes
       idmap backend =
       idmap uid = 10000-20000
       idmap gid = 10000-20000
       template homedir = /home/%D/%U
       template shell = /bin/false
       winbind separator = +
       winbind cache time = 900
       winbind enum users = Yes
       winbind enum groups = Yes
       winbind use default domain = No
       winbind trusted domains only = No
       winbind nested groups = No
       winbind max idle children = 3
       winbind nss info = template
       comment =
       path =
       username =
       invalid users = root
       valid users =
       admin users =
       read list =
       write list =
       printer admin =
       force user =
       force group =
       read only = Yes
       acl check permissions = Yes
       acl group control = No
       acl map full control = Yes
       create mask = 0744
       force create mode = 00
       security mask = 0777
       force security mode = 00
       directory mask = 0755
       force directory mode = 00
       directory security mask = 0777
       force directory security mode = 00
       force unknown acl user = No
       inherit permissions = No
       inherit acls = No
       inherit owner = No
       guest only = No
       guest ok = No
       only user = No
       hosts allow =
       hosts deny =
       allocation roundup size = 1048576
       aio read size = 0
       aio write size = 0
       aio write behind =
       ea support = No
       nt acl support = Yes
       profile acls = No
       map acl inherit = No
       afs share = No
       block size = 1024
       max connections = 0
       min print space = 0
       strict allocate = No
       strict sync = No
       sync always = No
       use sendfile = No
       write cache size = 0
       max reported print jobs = 0
       max print jobs = 1000
       printable = No
       printing = bsd
       cups options =
       print command = lpr -r -P'%p' %s
       lpq command = lpq -P'%p'
       lprm command = lprm -P'%p' %j
       lppause command =
       lpresume command =
       queuepause command =
       queueresume command =
       printer name =
       use client driver = No
       default devmode = No
       force printername = No
       default case = lower
       case sensitive = Auto
       preserve case = Yes
       short preserve case = Yes
       mangling char = ~
       hide dot files = Yes
       hide special files = No
       hide unreadable = No
       hide unwriteable files = No
       delete veto files = No
       veto files =
       hide files =
       veto oplock files =
       map archive = Yes
       map hidden = No
       map system = No
       map readonly = yes
       mangled names = Yes
       mangled map =
       store dos attributes = No
       browseable = Yes
       blocking locks = Yes
       csc policy = manual
       fake oplocks = No
       locking = Yes
       oplocks = Yes
       level2 oplocks = Yes
       oplock contention limit = 2
       posix locking = Yes
       strict locking = Yes
       share modes = Yes
       dfree cache time = 0
       dfree command =
       copy =
       include = /etc/samba/smb.conf.
       preexec =
       preexec close = No
       postexec =
       root preexec =
       root preexec close = No
       root postexec =
       available = Yes
       volume =
       fstype = NTFS
       set directory = No
       wide links = Yes
       follow symlinks = Yes
       dont descend =
       magic script =
       magic output =
       delete readonly = No
       dos filemode = No
       dos filetimes = Yes
       dos filetime resolution = No
       fake directory create times = No
       vfs objects =
       msdfs root = No
       msdfs proxy =

[homes]
       comment = Home Directories
       valid users = XXX+%S
       read only = No
       create mask = 0700
       directory mask = 0700
       browseable = No

[public]
       path = /mnt/pkg/public
       read only = No

[win]
       path = /mnt/pkg/win

[vmware]
       path = /mnt/pkg/vmware

[gap_backup]
       path = /mnt/pkg/gap_backup
       read only = No
       create mask = 0777
       directory mask = 0777

[iso]
       path = /mnt/pkg/iso

[pictures]
       path = /mnt/pkg/pictures
