Re: [Samba] Overloaded samba server. Is it a bug? (but not a samba bug)

2005-11-10 Thread Jeremy Allison
On Fri, Nov 11, 2005 at 12:07:38AM -0300, Martin Scandroli wrote:
> Well, we finally resolved it.
> The problem was with the QLA driver. We applied a kernel patch
> (kernel-bigsmp-2.6.5-7.234.i586.rpm) provided by SuSE support, and it is
> working fine.
> The patch will be provided soon in the upcoming SLES9 Support Pack 3.
> 
> Anyway, thanks to all of you for your help!

No problem, I'm really glad you tracked it down and fixed it
(and it wasn't in Samba :-).

Jeremy.
-- 
To unsubscribe from this list go to the following URL and read the
instructions:  https://lists.samba.org/mailman/listinfo/samba


Re: [Samba] Overloaded samba server. Is it a bug? (but not a samba bug)

2005-11-10 Thread Martin Scandroli
Well, we finally resolved it.
The problem was with the QLA driver. We applied a kernel patch
(kernel-bigsmp-2.6.5-7.234.i586.rpm) provided by SuSE support, and it is
working fine.
The patch will be provided soon in the upcoming SLES9 Support Pack 3.

Anyway, thanks to all of you for your help!
Martín

On Nov 04, 2005 01:36 PM, Jeremy Allison <[EMAIL PROTECTED]> wrote:

> On Fri, Nov 04, 2005 at 10:51:52AM -0300, Martin wrote:
> > 
> > How could we find it out? How could we get enough debugging level to
> > reach
> > this information?
> > 
> > When the smbd proccess stopped in D state the strace does not show
> > any line...
> 
> Attach to it with gdb and type "bt".
> 
> Jeremy.
> 



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-04 Thread Martin
On Friday 04 November 2005 02:26, Roger Eisenecher wrote:
> Hi all
>
> Martin Scandroli wrote:
> >>>>Martin: Which kernel are you using? Do you use quota on your
> >>>>filesystem?
> >>>
> >>>This is a SLES9 running
> >>>kernel-bigsmp-2.6.5-7.201.i586
> >>>
> >>>We had also had problems with later version
> >>>kernel-bigsmp-2.6.5-7.193.i586
> >>>
> >>>Note: We decided to run 32bits kernel on the EM64T Intel platform.
> >>
> >>Can you reproduce this problem on a different filesystem than
> >>Reiser ? I'm trying to narrow down the problem here.
> >
> > Nop. It's quite difficult with 1200 users using it.
>
> Hmm... I will try it... we have "only" 500 Users... I think some sort of
> rsync -aPx --numeric-ids /mountpoint of reiserfs /mountpoint of a
> laaaggeee sratch disk, creating new filesystem and finally a rsync
> back to the new fs will do it? Or are there any better solutions?
>
> But it will take some days for all necessary steps like backup and so on...
>
> kindly regards
> rOger

Roger,

You don't share storage with any other server, do you?

Could you describe your environment and provide a server description, so we
can look for something in common?

regards,
Martín



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-04 Thread Jeremy Allison
On Fri, Nov 04, 2005 at 10:51:52AM -0300, Martin wrote:
> 
> How could we find it out? How could we get enough debugging level to reach 
> this information?
> 
> When the smbd proccess stopped in D state the strace does not show any line...

Attach to it with gdb and type "bt".

Jeremy.


Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-04 Thread Martin
On Thursday 03 November 2005 15:18, Jeremy Allison wrote:
> On Thu, Nov 03, 2005 at 05:16:49AM -0300, Martin wrote:
> > Roger,
> >
> > On Thursday 03 November 2005 03:22, Roger Eisenecher wrote:
> > > Hi all
> > >
> > > Martin wrote:
> > > > 1TB with reiserfs in LVM
> > >
> > > We have a similar installation: Kernel 2.6.5-7.201-smp (the official
> > > kernel of SuSE 9.1 Professional) and we are using openldap and reiserfs
> > > too. Additonally we are using quota on the filesystem. Our server hangs
> > > often in this situation with a load of 350!!! The interesting part is
> > > that the cpu's are 92% idle. If we deactivate the quota subsystem the
> > > server will work for a longer time, but it could also happen that the
> > > load reaches 350... Only a reboot will solve this problem...
> >
> > This is exacltly our same sympthom.
> > We have already disable the quota without success. Still got the problem.
> >
> > > Martin: Which kernel are you using? Do you use quota on your
> > > filesystem?
> >
> > This is a SLES9 running
> > kernel-bigsmp-2.6.5-7.201.i586
> >
> > We had also had problems with later version
> > kernel-bigsmp-2.6.5-7.193.i586
> >
> > Note: We decided to run 32bits kernel on the EM64T Intel platform.
>
> Can you reproduce this problem on a different filesystem than
> Reiser ? I'm trying to narrow down the problem here.
>

Jeremy, 

Since the focus has shifted to the file system and kernel I/O, and
just to provide additional info, here are our mount options:

lvmp     -> (reiserfs)   usrquota,grpquota,acl,user_xattr
lvgroups -> (reiserfs)   grpquota,acl,user_xattr
lvhomes  -> (reiserfs)   usrquota,acl,user_xattr

We were using the "noatime" option too, but it is now disabled, and nothing has
changed.

Note that the server is connected to external storage (EVA 5000) via qla2312
Fibre Channel. The qla2xxx module is loaded with the following options:

options qla2xxx qlport_down_retry=30 ql2xfailover=1 ql2xloginretrycount=30 ql2xlbType=0

Hope this is useful and gives us a clue!

Regards,
Martín


--
Mrtn


Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-04 Thread Martin
On Wednesday 02 November 2005 19:34, Andrew Bartlett wrote:
> On Wed, 2005-11-02 at 18:53 -0300, Martin wrote:
> > On Monday 31 October 2005 18:27, Andrew Bartlett wrote:
> > > > >Now we are testing this configuration and waiting for the results.
> > > >
> > > > Bad luck... the load exploited again :(
> > >
> > > My gut feeling is that this is an e-Directory bug, but have you posted
> > > logfiles and traffic somewhere?  (Warning: unencrypted LDAP traffic
> > > will include passwords).
> >
> >  Dear  Andrew:
> >
> > We've been trying to test our environment as recommended in the last
> > posts. We switched the backend to openLDAP trying to discard the idea of
> > a bug in eDirectory. And here... the conclusion:
> >
> >
> > Backend switch:
> >
> >
> > 1) Sadly, Samba server still getting overloaded. The server doesn't
> > hang as in the previous scenario, but it gets extremely slow and there's
> > no way to provide service with it (load grows up to 60 or 70). It stops
> > responding a couple of minutes after the load gets to the limit.
> >
> > 2) There's an incredible amount of smbd childs in "D" state
> > (uninterruptable sleep), when the load starts to raise. It happens with
> > both backends (it's softer with openLDAP, but still unusable).
>
> This is a *very* important clue.  If this were an LDAP issue, then Samba
> should be in S state, and the ldap processes should be going nuts.
>
> > 3) The number of sleeping processes is considerably lower with
> > openLDAP.
> >
> > It seems that, something is beating the samba server because of a
> > bug perhaps, or a misconfiguration. The system is a little (but not
> > much) tolerant when openLDAP is used as backend (instead of eDir), but
> > the problems still no matter the directory service being used.
> > What do you think about a client triggering this behaviour some way?
>
> I now suspect the LDAP angle is a red herring, and I'm instead thinking
> 'kernel issue'.
>
> > Weird things found:
> >
> > I'll comment some lines about a couple of strange things i saw. They
> > may be completely unrelated to the main problem, but here they go just
> > in case.
> >
> > 1) Some times (according to what an strace attached to the parent
> > smbd process shows us), a user working on an XLS file starts a curious
> > behaviour in which the server tries to find a file that no longer exist
> > in a periodically basis (i.e: loop). We think the user deleted the file,
> > still an smbd process kept trying to access it. (it was complaining with
> > "file does not exist" messages permanently) (A few minutes later when the
> > loop was happening we went to the user desktop and found out he has
> > already turned off his machine!)
> >
> > We've captured service logs, straces, ps aux snapshots during the
> > load issue and a couple of lsofs.  (The whole samba's logs are more than
> > 1G and is impossible to determinate a fail, because the server still
> > responding until the load is too high to continue serving files)
> >
> > Is there any way to get some more verbosity? any different way of
> > debugging (gdb maybe?)?
>
> What would be interesting is to find out where each of those smbd
> processes is waiting.  Ie, what call is causing the kernel to put the
> process in D state.  Is it the same call, or a lot of different calls?

How could we find that out? How can we get a debugging level high enough to
reach this information?

When an smbd process is stuck in D state, strace does not show any output...

--
Mrtn



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Andrew Bartlett
On Fri, 2005-11-04 at 06:26 +0100, Roger Eisenecher wrote:
> Hi all
> 
> Martin Scandroli wrote:
> >>>>Martin: Which kernel are you using? Do you use quota on your
> >>>>filesystem?
> >>>
> >>>This is a SLES9 running
> >>>kernel-bigsmp-2.6.5-7.201.i586
> >>>
> >>>We had also had problems with later version
> >>>kernel-bigsmp-2.6.5-7.193.i586
> >>>
> >>>Note: We decided to run 32bits kernel on the EM64T Intel platform.
> >>
> >>Can you reproduce this problem on a different filesystem than
> >>Reiser ? I'm trying to narrow down the problem here.
> > 
> > Nop. It's quite difficult with 1200 users using it.
> 
> Hmm... I will try it... we have "only" 500 Users... I think some sort of
> rsync -aPx --numeric-ids /mountpoint of reiserfs /mountpoint of a
> laaaggeee sratch disk, creating new filesystem and finally a rsync
> back to the new fs will do it? Or are there any better solutions?
> 
> But it will take some days for all necessary steps like backup and so on...

That's pretty much the only way to do it.  Bonus points if you can keep
both online for kernel debugging if it is shown to be the fs.  

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
Student Network Administrator, Hawker College  http://hawkerc.net



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Roger Eisenecher

Hi all

Martin Scandroli wrote:
>>>>Martin: Which kernel are you using? Do you use quota on your
>>>>filesystem?
>>>
>>>This is a SLES9 running
>>>kernel-bigsmp-2.6.5-7.201.i586
>>>
>>>We had also had problems with later version
>>>kernel-bigsmp-2.6.5-7.193.i586
>>>
>>>Note: We decided to run 32bits kernel on the EM64T Intel platform.
>>
>>Can you reproduce this problem on a different filesystem than
>>Reiser ? I'm trying to narrow down the problem here.
> 
> Nop. It's quite difficult with 1200 users using it.

Hmm... I will try it... we have "only" 500 users... I think something like
rsync -aPx --numeric-ids /mountpoint of reiserfs /mountpoint of a
laaarge scratch disk, then creating a new filesystem and finally an rsync
back to the new fs, will do it? Or are there any better solutions?

But it will take some days for all the necessary steps, like backup and so on...

Kind regards
rOger


Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Martin Scandroli
On Nov 03, 2005 03:18 PM, Jeremy Allison <[EMAIL PROTECTED]> wrote:

> On Thu, Nov 03, 2005 at 05:16:49AM -0300, Martin wrote:
> > Roger,
> > 
> > On Thursday 03 November 2005 03:22, Roger Eisenecher wrote:
> > > Hi all
> > >
> > > Martin wrote:
> > > > 1TB with reiserfs in LVM
> > >
> > > We have a similar installation: Kernel 2.6.5-7.201-smp (the
> > > official
> > > kernel of SuSE 9.1 Professional) and we are using openldap and
> > > reiserfs
> > > too. Additonally we are using quota on the filesystem. Our server
> > > hangs
> > > often in this situation with a load of 350!!! The interesting part
> > > is
> > > that the cpu's are 92% idle. If we deactivate the quota subsystem
> > > the
> > > server will work for a longer time, but it could also happen that
> > > the
> > > load reaches 350... Only a reboot will solve this problem...
> > This is exacltly our same sympthom.
> > We have already disable the quota without success. Still got the
> > problem.
> > 
> > 
> > > Martin: Which kernel are you using? Do you use quota on your
> > > filesystem?
> > >
> > 
> > This is a SLES9 running
> > kernel-bigsmp-2.6.5-7.201.i586
> > 
> > We had also had problems with later version
> > kernel-bigsmp-2.6.5-7.193.i586
> > 
> > Note: We decided to run 32bits kernel on the EM64T Intel platform.
> 
> Can you reproduce this problem on a different filesystem than
> Reiser ? I'm trying to narrow down the problem here.
No. It's quite difficult with 1200 users using it.

> 
> Jeremy.
> 



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Jeremy Allison
On Thu, Nov 03, 2005 at 07:39:05PM +0100, Roger Eisenecher wrote:
> Jeremy Allison wrote:
> >>>Martin: Which kernel are you using? Do you use quota on your filesystem?
> >>>
> >>
> >>This is a SLES9 running
> >>kernel-bigsmp-2.6.5-7.201.i586
> >>
> >>We had also had problems with later version
> >>kernel-bigsmp-2.6.5-7.193.i586
> >>
> >>Note: We decided to run 32bits kernel on the EM64T Intel platform.
> > 
> > 
> > Can you reproduce this problem on a different filesystem than
> > Reiser ? I'm trying to narrow down the problem here.
> 
> Is there a simple way to convert from reiserfs to another fs? ;-)

Nope, sorry.

Jeremy.


Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Roger Eisenecher

Jeremy Allison wrote:
>>>Martin: Which kernel are you using? Do you use quota on your filesystem?
>>>
>>
>>This is a SLES9 running
>>kernel-bigsmp-2.6.5-7.201.i586
>>
>>We had also had problems with later version
>>kernel-bigsmp-2.6.5-7.193.i586
>>
>>Note: We decided to run 32bits kernel on the EM64T Intel platform.
> 
> 
> Can you reproduce this problem on a different filesystem than
> Reiser ? I'm trying to narrow down the problem here.

Is there a simple way to convert from reiserfs to another fs? ;-)

Kind regards
rOger


Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Jeremy Allison
On Thu, Nov 03, 2005 at 05:16:49AM -0300, Martin wrote:
> Roger,
> 
> On Thursday 03 November 2005 03:22, Roger Eisenecher wrote:
> > Hi all
> >
> > Martin wrote:
> > > 1TB with reiserfs in LVM
> >
> > We have a similar installation: Kernel 2.6.5-7.201-smp (the official
> > kernel of SuSE 9.1 Professional) and we are using openldap and reiserfs
> > too. Additonally we are using quota on the filesystem. Our server hangs
> > often in this situation with a load of 350!!! The interesting part is
> > that the cpu's are 92% idle. If we deactivate the quota subsystem the
> > server will work for a longer time, but it could also happen that the
> > load reaches 350... Only a reboot will solve this problem...
> This is exacltly our same sympthom.
> We have already disable the quota without success. Still got the problem.
> 
> 
> > Martin: Which kernel are you using? Do you use quota on your filesystem?
> >
> 
> This is a SLES9 running
> kernel-bigsmp-2.6.5-7.201.i586
> 
> We had also had problems with later version
> kernel-bigsmp-2.6.5-7.193.i586
> 
> Note: We decided to run 32bits kernel on the EM64T Intel platform.

Can you reproduce this problem on a different filesystem than
Reiser ? I'm trying to narrow down the problem here.

Jeremy.


Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Martin
On Thursday 03 November 2005 10:04, Roger Eisenecher wrote:
> Hi Martin, hi list
>
> Martin wrote:
> > Roger,
> >
> > On Thursday 03 November 2005 03:22, Roger Eisenecher wrote:
> >>Martin wrote:
> >>>1TB with reiserfs in LVM
> >>
> >>We have a similar installation: Kernel 2.6.5-7.201-smp (the official
> >>kernel of SuSE 9.1 Professional) and we are using openldap and reiserfs
> >>too. Additonally we are using quota on the filesystem. Our server hangs
> >>often in this situation with a load of 350!!! The interesting part is
> >>that the cpu's are 92% idle. If we deactivate the quota subsystem the
> >>server will work for a longer time, but it could also happen that the
> >>load reaches 350... Only a reboot will solve this problem...
> >
> > This is exacltly our same sympthom.
> > We have already disable the quota without success. Still got the problem.
> >
> >>Martin: Which kernel are you using? Do you use quota on your filesystem?
> >
> > This is a SLES9 running
> > kernel-bigsmp-2.6.5-7.201.i586
> >
> > We had also had problems with later version
> > kernel-bigsmp-2.6.5-7.193.i586
> >
> > Note: We decided to run 32bits kernel on the EM64T Intel platform.
>
> Interesting: We use also a 32bit kernel with our dual opteron server.

We switched to 32 bits because of an issue we were having with the Samba
version shipped with 64-bit SLES9.
Somehow, when trying to join a Windows workstation to the domain, Samba
attempted to set the sambaPwdMustChange attribute to
9223372036854775807 (if I recall correctly, that is the largest number you
can store in a signed 64-bit integer), and Samba complained because
it wasn't a valid number to store in that kind of attribute (LDAP schema
definition limit?).
This problem never showed up with 32 bits. The value Samba actually
writes there is 2147483647. I didn't check, but that looks like the maximum
of a signed 32-bit integer.
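
For reference, both numbers are limits of signed two's-complement integers:
9223372036854775807 is 2^63 - 1 (signed 64-bit) and 2147483647 is 2^31 - 1
(signed 32-bit); a signed 16-bit integer tops out at 32767. A minimal Python
check, purely illustrative and not taken from Samba (the helper name
max_signed is made up for this sketch):

```python
# Largest value representable in a signed two's-complement integer
# of the given width. Illustrative helper, not part of Samba.
def max_signed(bits: int) -> int:
    return 2 ** (bits - 1) - 1


print(max_signed(16))  # 32767
print(max_signed(32))  # 2147483647
print(max_signed(64))  # 9223372036854775807
```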

(By the way, there was an old idea to host an eDirectory replica on the same
server, but in those days it was not certified for 64 bits.)

> Did you experience other symptoms like that the file system does not
> respond to shell commands like ls?
I think it's all a product of the same problem: extremely high load.
You are probably experiencing some delay querying the LDAP server. Try "ls -n";
that way, ls won't resolve the files' owner names.
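
The difference between numeric and resolved owners can be sketched in Python.
This is only an illustration of the idea behind "ls -n", assuming a Unix system
where user lookups go through the name service (which may be backed by LDAP);
the function names are made up for the example:

```python
import os
import pwd
from typing import Tuple


def numeric_owner(path: str) -> Tuple[int, int]:
    """Return (uid, gid) straight from the inode; no name lookup happens."""
    st = os.stat(path)
    return st.st_uid, st.st_gid


def owner_name(path: str) -> str:
    """Resolve the owner's login name. This goes through the system's
    name service and can stall if the backend is slow to answer."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this uid: fall back to the number.
        return str(uid)


print(numeric_owner("."))
print(owner_name("."))
```

If the numeric variant returns quickly while the name lookup stalls, the delay
is more likely in the directory service than in the file system.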


--
Mrtn



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Roger Eisenecher

Hi Martin, hi list

Martin wrote:
> Roger,
> 
> On Thursday 03 November 2005 03:22, Roger Eisenecher wrote:
>>Martin wrote:
>>
>>>1TB with reiserfs in LVM
>>
>>We have a similar installation: Kernel 2.6.5-7.201-smp (the official
>>kernel of SuSE 9.1 Professional) and we are using openldap and reiserfs
>>too. Additonally we are using quota on the filesystem. Our server hangs
>>often in this situation with a load of 350!!! The interesting part is
>>that the cpu's are 92% idle. If we deactivate the quota subsystem the
>>server will work for a longer time, but it could also happen that the
>>load reaches 350... Only a reboot will solve this problem...
> 
> This is exacltly our same sympthom.
> We have already disable the quota without success. Still got the problem.
> 
>>Martin: Which kernel are you using? Do you use quota on your filesystem?
>>
> This is a SLES9 running
> kernel-bigsmp-2.6.5-7.201.i586
> 
> We had also had problems with later version
> kernel-bigsmp-2.6.5-7.193.i586
> 
> Note: We decided to run 32bits kernel on the EM64T Intel platform.

Interesting: we also use a 32-bit kernel on our dual Opteron server.
Did you experience other symptoms, such as the file system not
responding to shell commands like ls?

Kind regards
rOger


Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-03 Thread Martin
Roger,

On Thursday 03 November 2005 03:22, Roger Eisenecher wrote:
> Hi all
>
> Martin wrote:
> > 1TB with reiserfs in LVM
>
> We have a similar installation: Kernel 2.6.5-7.201-smp (the official
> kernel of SuSE 9.1 Professional) and we are using openldap and reiserfs
> too. Additonally we are using quota on the filesystem. Our server hangs
> often in this situation with a load of 350!!! The interesting part is
> that the cpu's are 92% idle. If we deactivate the quota subsystem the
> server will work for a longer time, but it could also happen that the
> load reaches 350... Only a reboot will solve this problem...
This is exactly the same symptom we have.
We have already disabled quota, without success. We still have the problem.


> Martin: Which kernel are you using? Do you use quota on your filesystem?
>

This is SLES9 running
kernel-bigsmp-2.6.5-7.201.i586

We also had problems with the earlier version
kernel-bigsmp-2.6.5-7.193.i586

Note: We decided to run a 32-bit kernel on the EM64T Intel platform.

--
Mrtn



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-02 Thread Roger Eisenecher

Hi all

Martin wrote:
> 1TB with reiserfs in LVM

We have a similar installation: Kernel 2.6.5-7.201-smp (the official
kernel of SuSE 9.1 Professional), and we are using openldap and reiserfs
too. Additionally, we are using quota on the filesystem. Our server often
hangs in this situation with a load of 350!!! The interesting part is
that the CPUs are 92% idle. If we deactivate the quota subsystem, the
server works for a longer time, but the load can still reach
350... Only a reboot solves this problem...
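
A very high load with mostly idle CPUs is possible because the Linux load
average counts tasks in uninterruptible sleep (state D) as well as runnable
tasks (state R). A minimal Python sketch that tallies process states from the
STAT column printed by ps; the sample values are hypothetical and not data from
this server:

```python
from collections import Counter
from typing import Iterable


def load_contributors(stat_codes: Iterable[str]) -> Counter:
    """Count tasks by the first letter of the ps STAT column.

    On Linux, tasks in state R (runnable) and D (uninterruptible sleep)
    count toward the load average; S (interruptible sleep) does not.
    """
    return Counter(code[0] for code in stat_codes if code)


# Hypothetical STAT values, as ps would print them (header line removed).
sample = ["Ss", "D", "D", "S", "R+", "D", "S<"]
counts = load_contributors(sample)
print("runnable:", counts["R"], "uninterruptible:", counts["D"])
print("tasks counted toward load:", counts["R"] + counts["D"])
```

Many D entries and few R entries would suggest the load comes from tasks
blocked on I/O rather than from CPU work.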

Martin: Which kernel are you using? Do you use quota on your filesystem?

Kind regards
rOger



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-02 Thread Martin
On Wednesday 02 November 2005 19:50, Jeremy Allison wrote:
> On Wed, Nov 02, 2005 at 06:53:36PM -0300, Martin wrote:
> > #> strace -f -p 
> >
> > RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 18
> > fstat64(18, {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
> > fcntl64(18, F_SETFD, FD_CLOEXEC)= 0
> > getdents64(18, /* 4 entries */, 4096)   = 136
> > getdents64(18, /* 0 entries */, 4096)   = 0

[ ... ]

> > write(45, "  reply_unlink : Estructura_Cent"..., 98) = 98
> > stat64("Estructura_Central/marketing/Medios/Victor/insitucional
> > 2005/INVERSION", {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
> > stat64("Estructura_Central/marketing/Medios/Victor/insitucional
> > 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or
> > directory)
> > stat64("Estructura_Central/marketing/Medios/Victor/insitucional
> > 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or
> > directory)
>
> What filesystem is this ?

1TB with reiserfs in LVM

--
Mrtn



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-02 Thread Martin Scandroli
On Wednesday 02 November 2005 19:50, Jeremy Allison wrote:
> On Wed, Nov 02, 2005 at 06:53:36PM -0300, Martin wrote:
> > #> strace -f -p 
> >
> > RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 18

[ ... ]

> > 2005/INVERSION", {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
> > stat64("Estructura_Central/marketing/Medios/Victor/insitucional
> > 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or
> > directory)
> > stat64("Estructura_Central/marketing/Medios/Victor/insitucional
> > 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or
> > directory)
>
> What filesystem is this ?
1TB with reiserfs in LVM

--
Mrtn



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-02 Thread Jeremy Allison
On Wed, Nov 02, 2005 at 06:53:36PM -0300, Martin wrote:
> 
> #> strace -f -p 
> 
> RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 18
> fstat64(18, {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
> fcntl64(18, F_SETFD, FD_CLOEXEC)= 0
> getdents64(18, /* 4 entries */, 4096)   = 136
> getdents64(18, /* 0 entries */, 4096)   = 0
> close(18)   = 0
> lstat64("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or 
> directory)
> time(NULL)  = 1130795339
> geteuid32() = 12510
> write(45, "[2005/10/31 18:48:59, 3] smbd/pr"..., 58) = 58
> geteuid32() = 12510
> write(45, "  Transaction 6293801 of length "..., 36) = 36
> gettimeofday({1130795339, 113315}, NULL) = 0
> time(NULL)  = 1130795339
> geteuid32() = 12510
> write(45, "[2005/10/31 18:48:59, 3] smbd/pr"..., 60) = 60
> geteuid32() = 12510
> write(45, "  switch message SMBunlink (pid "..., 54) = 54
> time(NULL)  = 1130795339
> geteuid32() = 12510
> write(45, "[2005/10/31 18:48:59, 3] smbd/re"..., 57) = 57
> geteuid32() = 12510
> write(45, "  reply_unlink : Estructura_Cent"..., 98) = 98
> stat64("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION", {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
> stat64("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or 
> directory)
> stat64("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or 
> directory)
> open("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION", 
> O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 18
> fstat64(18, {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
> fcntl64(18, F_SETFD, FD_CLOEXEC)= 0
> getdents64(18, /* 4 entries */, 4096)   = 136
> getdents64(18, /* 0 entries */, 4096)   = 0
> close(18)   = 0
> lstat64("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or 
> directory)
> time(NULL)  = 1130795339
> geteuid32() = 12510
> write(45, "[2005/10/31 18:48:59, 3] smbd/pr"..., 58) = 58
> geteuid32() = 12510
> write(45, "  Transaction 6293802 of length "..., 36) = 36
> gettimeofday({1130795339, 115220}, NULL) = 0
> time(NULL)  = 1130795339
> geteuid32() = 12510
> write(45, "[2005/10/31 18:48:59, 3] smbd/pr"..., 60) = 60
> geteuid32() = 12510
> write(45, "  switch message SMBunlink (pid "..., 54) = 54
> time(NULL)  = 1130795339
> geteuid32() = 12510
> write(45, "[2005/10/31 18:48:59, 3] smbd/re"..., 57) = 57
> geteuid32() = 12510
> write(45, "  reply_unlink : Estructura_Cent"..., 98) = 98
> stat64("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION", {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
> stat64("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or 
> directory)
> stat64("Estructura_Central/marketing/Medios/Victor/insitucional 
> 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or 
> directory)

What filesystem is this ?
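
The repeating pattern in the trace quoted above (the same path failing with
ENOENT again and again) is easier to see when failed lookups are counted per
path. A minimal Python sketch, assuming the strace output has first been
unwrapped to one call per line; the sample lines use a shortened, hypothetical
path rather than the real one from the trace:

```python
import re
from collections import Counter

# Matches e.g.:
#   stat64("some/path", 0xbfffcec0) = -1 ENOENT (No such file or directory)
# An optional "[pid N]" prefix (as printed by strace -f) is tolerated.
ENOENT_CALL = re.compile(
    r'^(?:\[pid\s+\d+\]\s+)?(\w+)\("((?:[^"\\]|\\.)*)".*=\s+-1 ENOENT'
)


def count_enoent_paths(strace_lines):
    """Count failed (ENOENT) lookups per path in unwrapped strace lines."""
    hits = Counter()
    for line in strace_lines:
        match = ENOENT_CALL.match(line.strip())
        if match:
            hits[match.group(2)] += 1
    return hits


# Shortened, hypothetical sample in the same shape as the trace.
sample = [
    'lstat64("dir/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or directory)',
    'stat64("dir", {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0',
    'stat64("dir/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or directory)',
    'stat64("dir/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or directory)',
]

for path, n in count_enoent_paths(sample).most_common():
    print(n, path)
```

A single path with a very large count would point to one client retrying the
same request in a loop.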

Jeremy.


Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-02 Thread Andrew Bartlett
On Wed, 2005-11-02 at 18:53 -0300, Martin wrote:
> On Monday 31 October 2005 18:27, Andrew Bartlett wrote:
> > > >Now we are testing this configuration and waiting for the results.
> > >
> > > Bad luck... the load exploited again :(
> >
> > My gut feeling is that this is an e-Directory bug, but have you posted
> > logfiles and traffic somewhere?  (Warning: unencrypted LDAP traffic will
> > include passwords).
> >
> 
>  Dear  Andrew:
> 
> We've been trying to test our environment as recommended in the last 
> posts. We switched the backend to openLDAP trying to discard the idea of 
> a bug in eDirectory. And here... the conclusion:
> 
> 
> Backend switch:
> 
> 
> 1) Sadly, Samba server still getting overloaded. The server doesn't 
> hang as in the previous scenario, but it gets extremely slow and there's 
> no way to provide service with it (load grows up to 60 or 70). It stops 
> responding a couple of minutes after the load gets to the limit.
> 
> 2) There's an incredible amount of smbd childs in "D" state 
> (uninterruptable sleep), when the load starts to raise. It happens with 
> both backends (it's softer with openLDAP, but still unusable).

This is a *very* important clue.  If this were an LDAP issue, then Samba
should be in S state, and the ldap processes should be going nuts.  

> 3) The number of sleeping processes is considerably lower with openLDAP.
> 
> It seems that, something is beating the samba server because of a 
> bug perhaps, or a misconfiguration. The system is a little (but not 
> much) tolerant when openLDAP is used as backend (instead of eDir), but 
> the problems still no matter the directory service being used.
> What do you think about a client triggering this behaviour some way?

I now suspect the LDAP angle is a red herring, and I'm instead thinking
'kernel issue'.  

> 
> Weird things found:
> 
> I'll describe a couple of strange things I saw. They 
> may be completely unrelated to the main problem, but here they are just 
> in case.
> 
> 1) Sometimes (according to an strace attached to the parent 
> smbd process), a user working on an XLS file triggers a curious 
> behaviour in which the server periodically tries to find a file that no 
> longer exists (i.e. a loop). We think the user deleted the file, 
> yet an smbd process kept trying to access it (it kept logging 
> "file does not exist" messages). A few minutes later, while the loop 
> was still happening, we went to the user's desktop and found that he had 
> already turned off his machine!
> 
> We've captured service logs, straces, ps aux snapshots during the 
> load issue and a couple of lsofs. (The complete Samba logs are more than 1G 
> and it is impossible to pinpoint a failure in them, because the server keeps 
> responding until the load is too high to continue serving files.)
> 
> Is there any way to get more verbosity? Any different way of 
> debugging (gdb maybe)?

What would be interesting is to find out where each of those smbd
processes is waiting, i.e. what call is causing the kernel to put the
process in D state.  Is it the same call, or a lot of different calls? 

Given the load problems, I presume the processes are chewing CPU time?
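
One way to approach that question is to group the blocked smbd processes by kernel wait channel. The following is only a minimal sketch, assuming a procps-style `ps` that supports the `stat` and `wchan` columns; the function names are illustrative and not part of Samba, and the wait channel may show as `-` if the kernel does not expose symbol names.

```python
import collections
import subprocess


def ps_snapshot():
    """Run ps once and return its text output (assumes a procps-style ps)."""
    return subprocess.run(
        ["ps", "-eo", "pid,stat,wchan:32,comm"],
        capture_output=True, text=True, check=True,
    ).stdout


def parse_ps(text):
    """Parse `ps -eo pid,stat,wchan:32,comm` output into tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        # Skip the header line and anything that is not a process row.
        if len(parts) == 4 and parts[0].isdigit():
            pid, stat, wchan, comm = parts
            rows.append((int(pid), stat, wchan, comm.strip()))
    return rows


def blocked_by_wchan(rows, command="smbd"):
    """Count processes in uninterruptible sleep (STAT starts with D)
    per wait channel, restricted to one command name."""
    counts = collections.Counter()
    for _pid, stat, wchan, comm in rows:
        if comm == command and stat.startswith("D"):
            counts[wchan] += 1
    return counts
```

Calling `blocked_by_wchan(parse_ps(ps_snapshot()))` a few times during a load spike would show whether the D-state processes pile up on one wait channel or on many different ones.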

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Samba Developer, SuSE Labs, Novell Inc.        http://suse.de
Authentication Developer, Samba Team           http://samba.org
Student Network Administrator, Hawker College  http://hawkerc.net



Re: [Samba] Overloaded samba server. Is it a bug?

2005-11-02 Thread Martin
On Monday 31 October 2005 18:27, Andrew Bartlett wrote:
> > >Now we are testing this configuration and waiting for the results.
> >
> > Bad luck... the load exploded again :(
>
> My gut feeling is that this is an e-Directory bug, but have you posted
> logfiles and traffic somewhere?  (Warning: unencrypted LDAP traffic will
> include passwords).
>

Dear Andrew,

We've been testing our environment as recommended in the last 
posts. We switched the backend to OpenLDAP to rule out a bug in 
eDirectory. Here are our conclusions:


Backend switch:


1) Sadly, the Samba server is still getting overloaded. The server doesn't 
hang as in the previous scenario, but it gets extremely slow and there's 
no way to provide service with it (load grows up to 60 or 70). It stops 
responding a couple of minutes after the load reaches that level.

2) There's an incredible number of smbd children in "D" state 
(uninterruptible sleep) when the load starts to rise. It happens with 
both backends (it's milder with OpenLDAP, but still unusable).

3) The number of sleeping processes is considerably lower with OpenLDAP.

It seems that something is beating the Samba server, perhaps because of 
a bug or a misconfiguration. The system is a little (but not 
much) more tolerant when OpenLDAP is used as the backend (instead of eDir), 
but the problem persists no matter which directory service is used.
What do you think about a client triggering this behaviour in some way?


Weird things found:

I'll describe a couple of strange things I saw. They 
may be completely unrelated to the main problem, but here they are just 
in case.

1) Sometimes (according to an strace attached to the parent 
smbd process), a user working on an XLS file triggers a curious 
behaviour in which the server periodically tries to find a file that no 
longer exists (i.e. a loop). We think the user deleted the file, 
yet an smbd process kept trying to access it (it kept logging 
"file does not exist" messages). A few minutes later, while the loop 
was still happening, we went to the user's desktop and found that he had 
already turned off his machine!

We've captured service logs, straces, ps aux snapshots during the 
load issue and a couple of lsofs. (The complete Samba logs are more than 1G 
and it is impossible to pinpoint a failure in them, because the server keeps 
responding until the load is too high to continue serving files.)

Is there any way to get more verbosity? Any different way of 
debugging (gdb maybe)?

Thanks in advance


#> strace -f -p 

RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 18
fstat64(18, {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
fcntl64(18, F_SETFD, FD_CLOEXEC)= 0
getdents64(18, /* 4 entries */, 4096)   = 136
getdents64(18, /* 0 entries */, 4096)   = 0
close(18)   = 0
lstat64("Estructura_Central/marketing/Medios/Victor/insitucional 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or directory)
time(NULL)  = 1130795339
geteuid32() = 12510
write(45, "[2005/10/31 18:48:59, 3] smbd/pr"..., 58) = 58
geteuid32() = 12510
write(45, "  Transaction 6293801 of length "..., 36) = 36
gettimeofday({1130795339, 113315}, NULL) = 0
time(NULL)  = 1130795339
geteuid32() = 12510
write(45, "[2005/10/31 18:48:59, 3] smbd/pr"..., 60) = 60
geteuid32() = 12510
write(45, "  switch message SMBunlink (pid "..., 54) = 54
time(NULL)  = 1130795339
geteuid32() = 12510
write(45, "[2005/10/31 18:48:59, 3] smbd/re"..., 57) = 57
geteuid32() = 12510
write(45, "  reply_unlink : Estructura_Cent"..., 98) = 98
stat64("Estructura_Central/marketing/Medios/Victor/insitucional 2005/INVERSION", {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
stat64("Estructura_Central/marketing/Medios/Victor/insitucional 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or directory)
stat64("Estructura_Central/marketing/Medios/Victor/insitucional 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or directory)
open("Estructura_Central/marketing/Medios/Victor/insitucional 2005/INVERSION", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 18
fstat64(18, {st_mode=S_IFDIR|0770, st_size=128, ...}) = 0
fcntl64(18, F_SETFD, FD_CLOEXEC)= 0
getdents64(18, /* 4 entries */, 4096)   = 136
getdents64(18, /* 0 entries */, 4096)   = 0
close(18)   = 0
lstat64("Estructura_Central/marketing/Medios/Victor/insitucional 2005/INVERSION/cao 2.xls", 0xbfffcec0) = -1 ENOENT (No such file or directory)
time(NULL)  = 1130795339
geteuid32() 
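
The repeated failed lookups of the same file in the trace above can be tallied mechanically rather than by eye. This is a rough sketch, assuming the strace output has been saved as text and that long lines may have been wrapped by the mail client; `enoent_counts` is a hypothetical helper name, and paths containing escaped quotes are not handled.

```python
import collections
import re

# Matches calls such as: lstat64("some/path", 0xbfffcec0) = -1 ENOENT
ENOENT_CALL = re.compile(r'(\w+)\("([^"]+)",[^)]*\)\s*=\s*-1 ENOENT')


def enoent_counts(strace_text):
    """Count (syscall, path) pairs that failed with ENOENT."""
    # Undo mail-client line wrapping by folding every line break
    # (and the whitespace around it) into a single space.
    joined = re.sub(r"\s*\n\s*", " ", strace_text)
    return collections.Counter(
        (match.group(1), match.group(2))
        for match in ENOENT_CALL.finditer(joined)
    )
```

A path whose count keeps growing between two captures taken a minute apart would support the suspicion that a client keeps retrying a request for a deleted file.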

Re: [Samba] Overloaded samba server. Is it a bug?

2005-10-31 Thread Andrew Bartlett
On Mon, 2005-10-31 at 12:30 -0300, Martin wrote:
> On Friday 28 October 2005 23:14, Andrew Bartlett wrote:
> > > On Thu, 2005-10-27 at 03:12 -0300, Martin Scandroli wrote:
> ...
> > > > of
> > > > seconds!, and it keeps growing till the server dies. We couldn't find
> ...
> >
> >Now we are testing this configuration and waiting for the results.
> 
> Bad luck... the load exploded again :( 
> Curiously, the Samba server never contacted the second eDirectory LDAP 
> server (we were monitoring it with tcpdump).

My gut feeling is that this is an e-Directory bug, but have you posted
logfiles and traffic somewhere?  (Warning: unencrypted LDAP traffic will
include passwords).

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Samba Developer, SuSE Labs, Novell Inc.        http://suse.de
Authentication Developer, Samba Team           http://samba.org
Student Network Administrator, Hawker College  http://hawkerc.net



Re: [Samba] Overloaded samba server. Is it a bug?

2005-10-31 Thread Martin
On Friday 28 October 2005 23:14, Andrew Bartlett wrote:
> > On Thu, 2005-10-27 at 03:12 -0300, Martin Scandroli wrote:
...
> > > of
> > > seconds!, and it keeps growing till the server dies. We couldn't find
...
>
>Now we are testing this configuration and waiting for the results.

Bad luck... the load exploded again :( 
Curiously, the Samba server never contacted the second eDirectory LDAP 
server (we were monitoring it with tcpdump).

Martin.



Re: [Samba] Overloaded samba server. Is it a bug?

2005-10-31 Thread Martin
On Friday 28 October 2005 23:14, Andrew Bartlett wrote:
> > 2) Root user was no longer recognized, (we still trying to figure out
> > why, the user's been added to the tree, but nothing changed) so we used
> > the
> > new role based administration provided by samba 3 as a workarround
> > (SeMachinAccount...), and no more troubles about it.
>
> Yep.
Why?

> > Something happens in a determined moment of the day (rush hour).
> > Everything is running smoothly (0.3 - 0.4 of load average) when the load
> > start to grow indefinitely!!. It raises from 0.3 to 50 in a matter
> > of
> > seconds!, and it keeps growing till the server dies. We couldn't find
> > the
> > reason of this, but it happens in a two hors interval. Before and after
> > this
> > interval, there are no errors of any kind.
>
> My guess is this:  Your LDAP server is getting backed up because of a
> bug, perhaps involving a lock in the database.  Then Samba processes
> start backing up, trying to access LDAP, which is wedged.  They keep
> hammering at the LDAP server in the backoff pattern, then fail (causing
> the client to try again).
>
> Because the questions are not being answered, the load goes through the
> roof, and this causes the LDAP server more pain.
>
> One option is to separate your LDAP server from your Samba server, and
> have more than one LDAP server available per Samba server.  This allows
> Samba to use the other server while the local one recovers (assuming
> some short-term lock).
The LDAP server IS running on a dedicated machine (actually a Linux eDirectory 
cluster with DirXML).

As recommended at 
http://www.samba.org/samba/docs/man/Samba-Guide/2000users.html#ch7dualLDAP, 
we have configured a dual LDAP backend to provide LDAP failover, with a Windows 
eDir replica, but we still don't fully understand what you mean by 
"...backing up..."

We are now testing this configuration and waiting for the results.

--
Martin



RE: [Samba] Overloaded samba server. Is it a bug?

2005-10-31 Thread Laurenz, Dirk
Hi,

-|  > Is there any place in Samba where I should be looking?
-|  > Any info/pointers would be much appreciated.

We don't have any problems with memberships in more than two hundred groups.
OS: SuSE SLES 9, Samba 3.0.14a


Kind regards,



Dirk Laurenz
Systems Engineer

Fujitsu Siemens Computers
S CE DE SE PS N/O
Sales Central Europe Deutschland 
Professional Service Nord / Ost

Hildesheimer Strasse 25
30880 Laatzen
Germany

Telephone:  +49 (511) 84 89 - 18 08
Telefax:+49 (511) 84 89 - 25 18 08
Mobile: +49 (170) 22 10 781
Email:  mailto:[EMAIL PROTECTED]
Internet:   http://www.fujitsu-siemens.com
http://www.fujitsu-siemens.de/services/index.html
***

Re: [Samba] Overloaded samba server. Is it a bug?

2005-10-28 Thread Martin Scandroli
On Fri Oct 28 14:44:02 GMT 2005 Bruno Guerreiro wrote:

> I've asked this question over and over, but still no answer so far.
> So here goes again; maybe I'll have better luck this time.
> Is there any limit on the number of groups a Samba user may belong to?
> I've found that if a user belongs to more than 60 to 70 groups,
> group-based share access stops working.
> From another post on this list, I found out that kernel 2.4.xx had a
> 32-group membership limit, but I'm using 2.6.xx, which has a
> 65536-group limit.
> Is there any place in Samba where I should be looking?
> Any info/pointers would be much appreciated.

Have you checked with the getent command whether your platform responds
correctly? Try "getent group "
It should return a member list like a line from /etc/group.

If it does not work, check your entries in nsswitch.conf and replace
"passwd compat" with "passwd ldap" (do the same for group and maybe
for shadow).
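
The getent check above can also be scripted to see how many groups the name service actually reports for a user. This is just a sketch under the assumption of a Unix system with Python 3 and a working NSS setup; the helper names are made up for illustration, and the result reflects what NSS returns, not what Samba itself computes.

```python
import grp
import os
import pwd


def parse_group_line(line):
    """Split one /etc/group-style line (as printed by getent group)
    into (name, gid, members)."""
    name, _passwd, gid, members = line.rstrip("\n").split(":", 3)
    return name, int(gid), [m for m in members.split(",") if m]


def members_via_nss(group):
    """Return the member list NSS reports for a group name."""
    return list(grp.getgrnam(group).gr_mem)


def group_count(user):
    """Return how many groups NSS reports for a user,
    including the primary group."""
    primary_gid = pwd.getpwnam(user).pw_gid
    return len(os.getgrouplist(user, primary_gid))
```

If `group_count` for an affected user comes back much lower than the number of groups the user has in the directory, the problem probably lies in the NSS/LDAP layer rather than in Samba's share access checks.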

Another thing you could try is the recently added "ldapsam:trusted = yes"
option... but mind the requirements it has in order to work!


Saludos, 
Martín



RE: [Samba] Overloaded samba server. Is it a bug?

2005-10-28 Thread Martin Scandroli

On Oct 28, 2005 02:11 PM, MJBarber wrote:

> I am running Suse 9.2 Pro in a corporate environment with 3.0.14a and
> it works great.Just my 0.02...

Well, before the load begins to rise, the load average is around
0.50 (with approximately 1000 users logged in and 500 highly active).

> If you truly think this is a samba problem try a different version to
> either replicate the issue or to have it point to a different piece of
> the
> puzzle. What is your complete config?

We are using Samba 3.0.20b because we need a new feature included in
this version (SeTakeOwnershipPrivilege). We haven't been able to use
the root user as administrator of extended file system ACLs because
ldapsam:trusted is preventing us from using it
(NT_STATUS_UNSUCCESSFUL).

> You said the load went sky high in a matter of seconds...do you see
> which
> process is running wild (smbd, nmbd, winbindd...).

We ran strace on the parent process of all the smbds (following all
the forks) and didn't see anything relevant.

Here is our smb.conf, and winbindd is not being used.

srvsmb02:~ # cat /etc/samba/smb.conf
[global]
workgroup = DOMAIN
passdb backend = ldapsam:ldap://10.10.6.130
netbios name = SRVSMBFS
netbios aliases = SRVSMBPS
ldap admin dn = cn=admin,o=domain
ldap suffix = ou=ar,o=domain
ldap group suffix = ou=grupos_openldap
ldap machine suffix = ou=maquinas
ldap timeout = 2
idmap backend = ldap:ldap://10.10.6.130
idmap uid = 1-4
idmap gid = 1-4
unix charset = ISO8859-15
add machine script = /usr/local/sbin/smbldap-useradd -w %u
domain logons = yes
domain master = yes
local master = yes
show add printer wizard = no
bind interfaces only = yes
interfaces = 10.10.6.75/24
username level = 15
username map = /etc/samba/smbusers
ldapsam:trusted = yes
preferred master = yes
ldap ssl = no
wins support = yes
printing = cups
printcap name = cups
printcap cache time = 750
cups options = raw
map to guest = Bad User
logon path =
logon home = \\%L\%U\.9xprofile
logon drive = H:
os level = 255
log level = 3
socket options = IPTOS_LOWDELAY TCP_NODELAY
cups server = 10.10.6.78
veto files = /*.eml/*.nws/riched20.dll/*.{*}/aquota.user/aquota.group/.msprofile/lost+found/
hide files = /aquota.user/aquota.group/.msprofile/
enable privileges = yes
acl group control = yes
logon script = ARRANQUE.BAT
inherit owner = yes
inherit acls = yes
disable spoolss = yes
log file = /var/log/samba/machines/log.%m
[homes]
comment = Home Directories
valid users = %S
browseable = No
read only = No
[profiles]
comment = Network Profiles Service
path = %H
read only = No
store dos attributes = Yes
create mask = 0600
directory mask = 0700
browseable = no
[printers]
comment = All Printers
path = /var/tmp
printable = Yes
create mask = 0600
browseable = No
[netlogon]
comment = netlogon service
path = /var/lib/samba/netlogon
browseable = no
guest ok = . Continue
---8<---8<


Thanks for your interest,
Martín





Re: [Samba] Overloaded samba server. Is it a bug?

2005-10-28 Thread Andrew Bartlett
On Thu, 2005-10-27 at 03:12 -0300, Martin Scandroli wrote:
> Experts,

> The implementation of this feature produced some other problems (we've
> found workarrounds but i'll comment them just to provide some feedback).
> 
> 1) The samba server used to die seconds after it was started. 
> Something about the nobody user and it's primary group prevented it from
> working in a proper manner. We solved this inconvinient by adding de
> user
> nobody and it's corresponding primary group to the backend.

Yep, this is a known requirement for that feature.  I'm not sure it
should die, but it can't work without all the accounts it will deal with
in LDAP.  (Otherwise we have to use the slower method, which is why you
turned this on in the first place).

> 2) Root user was no longer recognized, (we still trying to figure out
> why, the user's been added to the tree, but nothing changed) so we used
> the
> new role based administration provided by samba 3 as a workarround 
> (SeMachinAccount...), and no more troubles about it.

Yep.

> 
> 
> 3)THIS ISSUE IS KILLING US!!!
> 
> Something happens in a determined moment of the day (rush hour).
> Everything is running smoothly (0.3 - 0.4 of load average) when the load
> start to grow indefinitely!!. It raises from 0.3 to 50 in a matter
> of
> seconds!, and it keeps growing till the server dies. We couldn't find
> the
> reason of this, but it happens in a two hors interval. Before and after
> this
> interval, there are no errors of any kind.
> 
> I'll paste some log errors (just the ones i saw). I don't think 
> they're the cause of our problems, buy you're the experts.
> 
> Any clue? do you need me to gather some kind of information? any DoS
> bug reported for this samba version?

My guess is this:  Your LDAP server is getting backed up because of a
bug, perhaps involving a lock in the database.  Then Samba processes
start backing up, trying to access LDAP, which is wedged.  They keep
hammering at the LDAP server in the backoff pattern, then fail (causing
the client to try again).

Because the questions are not being answered, the load goes through the
roof, and this causes the LDAP server more pain.

One option is to separate your LDAP server from your samba server, and
have more than one LDAP server available per Samba server.  This allows
Samba to use the other server while the local one recovers (assuming
some short-term lock).
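
Before relying on a second LDAP server, it can help to confirm that each one is at least reachable from the Samba host. The sketch below assumes the servers are given as a space-separated list of LDAP URIs; the function names are illustrative, and a successful TCP connection only shows that the port is open, not that the directory is answering queries in time.

```python
import socket
from urllib.parse import urlsplit

# Well-known default ports for the two URI schemes.
DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def ldap_endpoints(uris):
    """Turn a space-separated list of LDAP URIs into (host, port) pairs."""
    endpoints = []
    for uri in uris.split():
        parts = urlsplit(uri)
        port = parts.port or DEFAULT_PORTS[parts.scheme]
        endpoints.append((parts.hostname, port))
    return endpoints


def first_reachable(endpoints, timeout=2.0):
    """Return the first endpoint that accepts a TCP connection, or None."""
    for host, port in endpoints:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            continue
    return None
```

Running `first_reachable(ldap_endpoints(...))` from the Samba server during a load spike would show whether failover to the second server is even possible at that moment.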

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Samba Developer, SuSE Labs, Novell Inc.        http://suse.de
Authentication Developer, Samba Team           http://samba.org
Student Network Administrator, Hawker College  http://hawkerc.net



RE: [Samba] Overloaded samba server. Is it a bug?

2005-10-28 Thread Martin Scandroli


> I am running Suse 9.2 Pro in a corporate environment with 3.0.14a and
> it
> works great.

> Just my 0.02...  
Well, before the load begins to rise, the load average is around
0.50 (with approximately 1000 users logged in and 500 highly active).

> If you truly think this is a samba problem try a different version to
> either replicate the issue or to have it point to a different piece of
> the
> puzzle.  What is your complete config? 
We are using Samba 3.0.20b because we need a new feature included in
this version (SeTakeOwnershipPrivilege). We haven't been able to use
the root user as administrator of extended file system ACLs because
ldapsam:trusted is preventing us from using it
(NT_STATUS_UNSUCCESSFUL).

> You said the load went sky high in a matter of seconds...do you see
> which
> process is running wild (smbd, nmbd, winbindd...).
We ran strace on the parent process of all the smbds (following all
the forks) and didn't see anything relevant.

Here is our smb.conf, and winbindd is not being used.

srvsmb02:~ # cat /etc/samba/smb.conf
[global]
workgroup = DOMAIN
passdb backend = ldapsam:ldap://10.10.6.130
netbios name = SRVSMBFS
netbios aliases = SRVSMBPS
ldap admin dn = cn=admin,o=domain
ldap suffix = ou=ar,o=domain
ldap group suffix = ou=grupos_openldap
ldap machine suffix = ou=maquinas
ldap timeout = 2
idmap backend = ldap:ldap://10.10.6.130
idmap uid = 1-4
idmap gid = 1-4
unix charset = ISO8859-15
add machine script = /usr/local/sbin/smbldap-useradd -w %u
domain logons = yes
domain master = yes
local master = yes
show add printer wizard = no
bind interfaces only = yes
interfaces = 10.10.6.75/24
username level = 15
username map = /etc/samba/smbusers
ldapsam:trusted = yes
preferred master = yes
ldap ssl = no
wins support = yes
printing = cups
printcap name = cups
printcap cache time = 750
cups options = raw
map to guest = Bad User
logon path =
logon home = \\%L\%U\.9xprofile
logon drive = H:
os level = 255
log level = 3
socket options = IPTOS_LOWDELAY TCP_NODELAY
cups server = 10.10.6.78
veto files = /*.eml/*.nws/riched20.dll/*.{*}/aquota.user/aquota.group/.msprofile/lost+found/
hide files = /aquota.user/aquota.group/.msprofile/
enable privileges = yes
acl group control = yes
logon script = ARRANQUE.BAT
inherit owner = yes
inherit acls = yes
disable spoolss = yes
log file = /var/log/samba/machines/log.%m
[homes]
comment = Home Directories
valid users = %S
browseable = No
read only = No
[profiles]
comment = Network Profiles Service
path = %H
read only = No
store dos attributes = Yes
create mask = 0600
directory mask = 0700
browseable = no
[printers]
comment = All Printers
path = /var/tmp
printable = Yes
create mask = 0600
browseable = No
[netlogon]
comment = netlogon service
path = /var/lib/samba/netlogon
browseable = no
guest ok = . Continue
---8<---8<


Thanks for your interest,
Martín





> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf
> Of
> [EMAIL PROTECTED]
> Sent: Friday, October 28, 2005 12:48 PM
> To: [EMAIL PROTECTED]
> Cc: samba@lists.samba.org
> Subject: Re: [Samba] Overloaded samba server. Is it a bug?
> 
> First of all, why run SuSe when CentOS is free, runs faster and is
> more up
> to date? I have basically the same setup you have except our system is
> a
> quad xeon system and CentOS runs flawlessly 24/7. We used to
> experiment
> with SuSe but it is not good for a corporate environment.  
> Just a heads up as I have been doing this for 17 years and CentOS is
> the
> cream of the crop for the money.
> 
> Martin Scandroli wrote:
> 
> >Experts,
> >
> >We've just migrated from samba 2.2.8a to samba 3.0.20b in a very
> >large
> >corporate environment. Everything was really fine in our lab, but we 
> >began experiment serious load problems on the productive servers the 
> >morning after the procedure took place. I'll try (briefly) to
> >describe
> >the characteristics of the scenario:
> >
> >Resources:
> >
> >Old Environment:
> >
> >Hardware:
> >Dell PowerEdge 2650
> >Intel Xeon Processor
> >2 GB Ram
> 

RE: [Samba] Overloaded samba server. Is it a bug?

2005-10-28 Thread MJBarber
I am running Suse 9.2 Pro in a corporate environment with 3.0.14a and it
works great.
CentOS is nice as well but I see no problem with Suse.

Just my 0.02...  

If you truly think this is a samba problem try a different version to
either replicate the issue or to have it point to a different piece of the
puzzle.  What is your complete config? 

You said the load went sky high in a matter of seconds...do you see which
process is running wild (smbd, nmbd, winbindd...).


Good luck,
Michael Barber
WPTZ/WNNE 
Computer Services Administrator.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Friday, October 28, 2005 12:48 PM
To: [EMAIL PROTECTED]
Cc: samba@lists.samba.org
Subject: Re: [Samba] Overloaded samba server. Is it a bug?

First of all, why run SuSe when CentOS is free, runs faster and is more up
to date?  I have basically the same setup you have except our system is a
quad xeon system and CentOS runs flawlessly 24/7.  We used to experiment
with SuSe but it is not good for a corporate environment.  
Just a heads up as I have been doing this for 17 years and CentOS is the
cream of the crop for the money.

Martin Scandroli wrote:

>Experts,
>
>We've just migrated from samba 2.2.8a to samba 3.0.20b in a very large 
>corporate environment. Everything was really fine in our lab, but we 
>began experiment serious load problems on the productive servers the 
>morning after the procedure took place. I'll try (briefly) to describe 
>the characteristics of the scenario:
>
>Resources:
>
>Old Environment:
>
>Hardware:
>Dell PowerEdge 2650
>Intel Xeon Processor
>2 GB Ram
>Raid 5 (via perc raid controller) on 10k scsi disks
>Software:
>SuSE Linux Enterprise Server 8
>Samba 2.2.8a Servers
>cups printing service
>openldap2 as backend (with replicas all over the country, about 3000 
>objects in the tree)
>HeartBeat as high availability Service
>
>Everything was charming here!!
>
>
>New Environment
>
>Hardware:
>Dell PowerEdge 2850 Servers
>2 Intel Xeon 3.2 GHz (HT i think... i see 4 of them) Processors
>4 GB Ram
>Raid 5 (via Perc raid controller) on 15k scsi disks
>
>Software
>SuSE Linux Enterprise Server 9
>Samba 3.0.20b Servers
>cups printing service
>Novell eDirectory 8.7.3.4 as backend (Very distributed too, about 4000 
>objects in the tree)
>HeartBeat as high availability Service drbd to keep 
>samba configuracion replicated among the cluster nodes.
>
>Problems we're having (or had, just as a usefull comment):
>
>eDirectory turned out to be much slower than openldap2 when responding 
>to nss_ldap queries (i mean about 7 or 8 times slower) so 
>queries asking for members of large groups (i.e: groups with about 1500 
>users and
>above) were usually terminated with an RPC timeout
>
>Everything started to work when we added the ldapsam:trusted=yes 
>parameter. It dramatically reduced the response times and affected 
>queries began to work.
>The implementation of this feature produced some other problems (we've 
>found workarrounds but i'll comment them just to provide some feedback).
>
>1) The samba server used to die seconds after it was started. 
>Something about the nobody user and it's primary group prevented it 
>from working in a proper manner. We solved this inconvinient by adding 
>de user nobody and it's corresponding primary group to the backend.
>2) Root user was no longer recognized, (we still trying to figure out 
>why, the user's been added to the tree, but nothing changed) so we used 
>the new role based administration provided by samba 3 as a workarround 
>(SeMachinAccount...), and no more troubles about it.
>
>
>
>3)THIS ISSUE IS KILLING US!!!
>
>Something happens in a determined moment of the day (rush hour).
>Everything is running smoothly (0.3 - 0.4 of load average) when the 
>load start to grow indefinitely!!. It raises from 0.3 to 50 in a 
>matter of seconds!, and it keeps growing till the server dies. We 
>couldn't find the reason of this, but it happens in a two hors 
>interval. Before and after this interval, there are no errors of any 
>kind.
>
>I'll paste some log errors (just the ones i saw). I don't think 
>they're the cause of our problems, buy you're the experts.
>
>Any clue? do you need me to gather some kind of information? any DoS 
>bug reported for this samba version?
>
>Any help wil

RE: [Samba] Overloaded samba server. Is it a bug?

2005-10-28 Thread Paul Gienger
> First of all, why run SuSe when CentOS is free, runs faster 
> and is more 



This is the samba list and he was asking for samba help, not for a
suggestion that he should change his, possibly corporately mandated,
platform choice.  Regardless of your personal or tested *opinions*, it was
not asked for here.  People have reasons for running what they do, some of
which are out of their control.  

By the way, your Mozilla install is horribly out of date.




Re: [Samba] Overloaded samba server. Is it a bug?

2005-10-28 Thread Merle Reine
First of all, why run SuSe when CentOS is free, runs faster and is more 
up to date?  I have basically the same setup you have except our system 
is a quad xeon system and CentOS runs flawlessly 24/7.  We used to 
experiment with SuSe but it is not good for a corporate environment.  
Just a heads up as I have been doing this for 17 years and CentOS is the 
cream of the crop for the money.


Martin Scandroli wrote:


Experts,

We've just migrated from Samba 2.2.8a to Samba 3.0.20b in a very large
corporate environment. Everything was fine in our lab, but we began
experiencing serious load problems on the production servers the morning
after the migration took place. I'll try to (briefly) describe the
characteristics of the scenario:

Resources:

Old Environment:

   Hardware:
   Dell PowerEdge 2650
   Intel Xeon Processor
   2 GB Ram
Raid 5 (via perc raid controller) on 10k scsi disks
   Software:
   SuSE Linux Enterprise Server 8
   Samba 2.2.8a Servers
   cups printing service
openldap2 as backend (with replicas all over the country,
about 3000 objects in the tree)
   HeartBeat as high availability Service

Everything was charming here!!


New Environment

   Hardware:
   Dell PowerEdge 2850 Servers
2 Intel Xeon 3.2 GHz (HT i think... i see 4 of them)
Processors
   4 GB Ram
Raid 5 (via Perc raid controller) on 15k scsi disks

   Software
   SuSE Linux Enterprise Server 9
   Samba 3.0.20b Servers
   cups printing service
Novell eDirectory 8.7.3.4 as backend (Very distributed too,
about 4000 objects in the tree)
   HeartBeat as high availability Service
drbd to keep the Samba configuration replicated among the cluster
nodes.

Problems we're having (or had, just as a useful comment):

eDirectory turned out to be much slower than openldap2 when responding
to nss_ldap queries (about 7 or 8 times slower), so queries
asking for the members of large groups (i.e. groups with about 1500 users
and above) usually ended with an RPC timeout.

Everything started to work when we added the ldapsam:trusted=yes
parameter. It dramatically reduced the response times, and the affected
queries began to work.
Enabling this feature produced some other problems (we've
found workarounds, but I'll mention them just to provide some feedback).

   1) The Samba server used to die seconds after it was started.
Something about the nobody user and its primary group prevented it from
working properly. We solved this by adding the user
nobody and its corresponding primary group to the backend.
2) The root user was no longer recognized (we are still trying to figure out
why; the user has been added to the tree, but nothing changed), so we used
the new role-based administration provided by Samba 3 as a workaround
(SeMachinAccount...), and have had no more trouble with it.




   3)THIS ISSUE IS KILLING US!!!

Something happens at a particular time of day (rush hour).
Everything is running smoothly (load average of 0.3 - 0.4) when the load
starts to grow without bound. It rises from 0.3 to 50 in a matter
of seconds, and it keeps growing until the server dies. We couldn't find
the reason for this, but it happens within a two-hour interval. Before and
after this interval, there are no errors of any kind.

   I'll paste some log errors (just the ones I saw). I don't think 
they're the cause of our problems, but you're the experts.


Any clue? Do you need me to gather any information? Is there any DoS
bug reported for this Samba version?

   Any help will be highly appreciated

Regards, 
Martin


--

   from /var/log/messages

   Oct 25 04:34:15 srvsmb01 smbd[2961]: [2005/10/25 04:34:15, 0] lib/util_sock.c:send_smb(762)
   Oct 25 04:34:15 srvsmb01 smbd[2961]:   Error writing 4 bytes to client. -1. (Connection reset by peer)
   Oct 25 04:40:36 srvsmb01 smbd[2983]: [2005/10/25 04:40:36, 0] lib/util_sock.c:get_peer_addr(1222)
   Oct 25 04:40:36 srvsmb01 smbd[2983]: getpeername failed. Error was Transport endpoint is not connected
   Oct 25 04:40:36 srvsmb01 smbd[2983]: [2005/10/25 04:40:36, 0] lib/util_sock.c:write_data(554)
   Oct 25 04:40:36 srvsmb01 smbd[2983]: write_data: write failure in writing to client 167.252.104.98. Error Connection reset by peer

   (this happens very often)

   From /var/log/samba/log.nmbd

   tdb(unnamed): tdb_open_ex: /var/lib/samba/unexpected.tdb (2059,2959) is already open in this process
   [2005/10/26 04:17:01, 2] tdb/tdbutil.c:tdb_log(767)
   tdb(unnamed): tdb_open_ex: /var/lib/samba/unexpected.tdb (2059,2959) is already open in this process
   [2005/10/26 04:17:01, 2] tdb/tdbutil.c:tdb_log(767)
   tdb(unnamed): tdb_open_ex: /var/lib/samba/unexpected.tdb (2059,2959) is already open in this process
   [2005/10/26 04:1