Re: tcp reset errors

2015-04-13 Thread Steve
Willy Tarreau w at 1wt.eu writes:

 
 Hi Franky,
 
 On Thu, Sep 11, 2014 at 01:08:09PM +0200, Franky Van Liedekerke wrote:
  On Thu, Sep 11, 2014 at 11:40 AM, Franky Van Liedekerke
  liedekef@... wrote:
   After running tcpdump on both servers (no ldap errors anywhere in the
   ldap logs), I see that both the ldap server and the clients connecting
   to haproxy send out resets. The two might be related. Each client
   seems to send 2 RST packets at the end of an LDAP TLS session (over
   port 389). Does that sound familiar?
  
   Franky
  
  Ok, after much trial and error, I pinned it down to the following: we
  have lots of servers doing ldap lookup for authentication, also when
  connecting via ssh. Now on EL5 servers this auth is done via a call to
  /usr/libexec/openssh/ssh-ldap-wrapper.
  Apparently this binary causes the resets to be shown in the haproxy
  error logs. I switched to the sssd version for EL5 servers, but that
  version did not include ssh-keys support, so the resets persisted.
  Again to the internet for the rescue: the version 1.9.6 for el5 can be
  found at
http://copr-be.cloud.fedoraproject.org/results/sgallagh/sssd-1.9-rhel5/epel-5-x86_64
  , and that version does support ssh correctly. Installing it, changing
  the ssh config et voila: no more resets.
  So the bug is in the ssh-ldap-wrapper, but I understand that doing a
  RST at the end is not bad, just not good either ... the side-effect
  of the new sssd is that far fewer ldap queries are made (as sudo and
  ssh use sssd too then), but I'll leave it up to the management to
  decide whether or not to go for that solution.
 
 Thanks for sending the details of your diagnostic. As you say, RST are
 not necessarily bad. When a client closes first, it has two options :
   - either send RST
   - or have the source port unusable for 2 minutes.
 
 Most of the time you choose the first option. In your case, since you were
 seeing SD flags, it means the reset came from the server; maybe the client
 was speaking inappropriately on the connection, causing the server to
 abort it. If so, it proves that the behaviour was properly chosen, because
 it allowed you to detect the anomaly in the logs and to fix it, which is
 quite good.
 
 Regards,
 Willy
 
 

I have been having a similar issue with RST on LDAP connections but have
not been able to pin it down any further than haproxy. LDAP seems to be the
only issue at the moment, although I had identical symptoms with SMTP that
were resolved after I lowered the MTU on the interface.

I have a mail appliance on one side and AD LDAP on the other of this haproxy:
spam_filter --- haproxy(service) --- AD LDAP
When I run an LDAP test, whether plain text or SSL, I randomly get a RST
sent by the server during transfer. I have run the LDAP test dozens of
times and there is no pattern: about 50% of runs fail mid-stream, and a
packet capture confirms it: RST from server.

To test further, I routed the same test through the haproxy machine using
ip_forwarding, to the same destination as the haproxy backend, and all
tests succeeded 100%.
spam_filter --- haproxy(ip_forward) --- AD LDAP

Does anyone have any advice on what to check next?

My haproxy.cfg only defines mode tcp. Do I require other options for LDAP?

I need LDAPS and have found that the option ldap-check does not work, but
LDAP vs LDAPS does not affect my problem.

Thanks
Steve





Re: tcp reset errors

2014-09-11 Thread Franky Van Liedekerke
After running tcpdump on both servers (no ldap errors anywhere in the
ldap logs), I see that both the ldap server and the clients connecting
to haproxy send out resets. The two might be related. Each client
seems to send 2 RST packets at the end of an LDAP TLS session (over
port 389). Does that sound familiar?

Franky

On Wed, Sep 10, 2014 at 8:49 PM, Pavlos Parissis
pavlos.paris...@gmail.com wrote:
 On 10/09/2014 03:31 μμ, Franky Van Liedekerke wrote:
 Hi,


 [..snip..]

 Any hints are very much appreciated. If more info is needed, let me know.



 Is it possible to run tcpdump on both servers and see who is sending
 RSTs? What about ldap logs? Do you know if you get this problem for all
 LDAP queries or for a subset? It could be that LDAP queries take too
 much time to be processed on LDAP due to a missing index, heavy IO,
 etc. I know ldap can provide quite a lot of information.

 Cheers,
 Pavlos






Re: tcp reset errors

2014-09-11 Thread Franky Van Liedekerke
On Thu, Sep 11, 2014 at 11:40 AM, Franky Van Liedekerke
liede...@telenet.be wrote:
 After running tcpdump on both servers (no ldap errors anywhere in the
 ldap logs), I see that both the ldap server and the clients connecting
 to haproxy send out resets. The two might be related. Each client
 seems to send 2 RST packets at the end of an LDAP TLS session (over
 port 389). Does that sound familiar?

 Franky


(btw, sorry for top-posting before)

Franky



Re: tcp reset errors

2014-09-11 Thread Franky Van Liedekerke
On Thu, Sep 11, 2014 at 11:40 AM, Franky Van Liedekerke
liede...@telenet.be wrote:
 After running tcpdump on both servers (no ldap errors anywhere in the
 ldap logs), I see that both the ldap server and the clients connecting
 to haproxy send out resets. The two might be related. Each client
 seems to send 2 RST packets at the end of an LDAP TLS session (over
 port 389). Does that sound familiar?

 Franky

Ok, after much trial and error, I pinned it down to the following: we
have lots of servers doing ldap lookup for authentication, also when
connecting via ssh. Now on EL5 servers this auth is done via a call to
/usr/libexec/openssh/ssh-ldap-wrapper.
Apparently this binary causes the resets to be shown in the haproxy
error logs. I switched to the sssd version for EL5 servers, but that
version did not include ssh-keys support, so the resets persisted.
Again to the internet for the rescue: the version 1.9.6 for el5 can be
found at 
http://copr-be.cloud.fedoraproject.org/results/sgallagh/sssd-1.9-rhel5/epel-5-x86_64
, and that version does support ssh correctly. Installing it, changing
the ssh config et voila: no more resets.
So the bug is in the ssh-ldap-wrapper, but I understand that doing a
RST at the end is not bad, just not good either ... the side-effect
of the new sssd is that far fewer ldap queries are made (as sudo and
ssh use sssd too then), but I'll leave it up to the management to
decide whether or not to go for that solution.

Franky



Re: tcp reset errors

2014-09-11 Thread Willy Tarreau
Hi Franky,

On Thu, Sep 11, 2014 at 01:08:09PM +0200, Franky Van Liedekerke wrote:
 On Thu, Sep 11, 2014 at 11:40 AM, Franky Van Liedekerke
 liede...@telenet.be wrote:
  After running tcpdump on both servers (no ldap errors anywhere in the
  ldap logs), I see that both the ldap server and the clients connecting
  to haproxy send out resets. The two might be related. Each client
  seems to send 2 RST packets at the end of an LDAP TLS session (over
  port 389). Does that sound familiar?
 
  Franky
 
 Ok, after much trial and error, I pinned it down to the following: we
 have lots of servers doing ldap lookup for authentication, also when
 connecting via ssh. Now on EL5 servers this auth is done via a call to
 /usr/libexec/openssh/ssh-ldap-wrapper.
 Apparently this binary causes the resets to be shown in the haproxy
 error logs. I switched to the sssd version for EL5 servers, but that
 version did not include ssh-keys support, so the resets persisted.
 Again to the internet for the rescue: the version 1.9.6 for el5 can be
 found at 
 http://copr-be.cloud.fedoraproject.org/results/sgallagh/sssd-1.9-rhel5/epel-5-x86_64
 , and that version does support ssh correctly. Installing it, changing
 the ssh config et voila: no more resets.
 So the bug is in the ssh-ldap-wrapper, but I understand that doing a
 RST at the end is not bad, just not good either ... the side-effect
 of the new sssd is that far fewer ldap queries are made (as sudo and
 ssh use sssd too then), but I'll leave it up to the management to
 decide whether or not to go for that solution.

Thanks for sending the details of your diagnostic. As you say, RST are
not necessarily bad. When a client closes first, it has two options :
  - either send RST
  - or have the source port unusable for 2 minutes.

Most of the time you choose the first option. In your case, since you were
seeing SD flags, it means the reset came from the server; maybe the client
was speaking inappropriately on the connection, causing the server to
abort it. If so, it proves that the behaviour was properly chosen, because
it allowed you to detect the anomaly in the logs and to fix it, which is
quite good.
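
For readers who want to see the "send RST" option in isolation, here is a
minimal Python sketch. It assumes a Linux-like TCP stack and uses the
standard SO_LINGER socket option with a zero timeout, which is one common
way for an application to close abortively; nothing in this thread confirms
that ssh-ldap-wrapper does it this way. The helper names close_with_rst and
demo are made up for illustration.

```python
import socket
import struct


def close_with_rst(sock: socket.socket) -> None:
    """Close a TCP socket abortively so the kernel sends RST instead of FIN."""
    # struct linger { l_onoff = 1, l_linger = 0 }: drop unsent data, reset on close.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def demo() -> str:
    """Report how the accepting side sees an abortive close by the client."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()
    peer.settimeout(5)  # safety net so the sketch never hangs

    close_with_rst(client)
    try:
        data = peer.recv(1024)
        result = "orderly close (FIN)" if data == b"" else "data"
    except ConnectionResetError:
        result = "reset (RST)"
    finally:
        peer.close()
        listener.close()
    return result
```

Calling demo() shows what the peer observes. Because the closing side skips
the normal FIN exchange, it also skips the waiting period that would
otherwise keep its source port busy.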

Regards,
Willy




Re: tcp reset errors

2014-09-10 Thread Pavlos Parissis
On 10/09/2014 03:31 μμ, Franky Van Liedekerke wrote:
 Hi,
 
 
[..snip..]

 Any hints are very much appreciated. If more info is needed, let me know.
 


Is it possible to run tcpdump on both servers and see who is sending
RSTs? What about ldap logs? Do you know if you get this problem for all
LDAP queries or for a subset? It could be that LDAP queries take too
much time to be processed on LDAP due to a missing index, heavy IO,
etc. I know ldap can provide quite a lot of information.
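
One way to answer the "who is sending RSTs" question from a capture is to
tally reset segments per sender. The Python sketch below assumes the default
one-line text output of tcpdump run with -n (numeric addresses and ports)
for TCP over IPv4. The function name count_rst_senders and the sample lines
are hypothetical and made up for illustration; they are not taken from any
capture in this thread.

```python
import re
from collections import Counter

# Matches lines such as:
#   12:00:01.000450 IP 10.0.0.5.389 > 10.0.0.9.40112: Flags [R.], seq 1, ...
LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]*)\]"
)


def count_rst_senders(lines):
    """Count RST segments per (source address, source port)."""
    senders = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        # tcpdump prints "R" inside the flags field for a reset segment.
        if match and "R" in match.group("flags"):
            senders[(match.group("src"), int(match.group("sport")))] += 1
    return senders


# Hypothetical capture lines, invented for this example only.
SAMPLE = [
    "12:00:01.000001 IP 10.0.0.9.40112 > 10.0.0.5.389: Flags [P.], seq 1:30, ack 1, win 229, length 29",
    "12:00:01.000450 IP 10.0.0.5.389 > 10.0.0.9.40112: Flags [R.], seq 1, ack 30, win 0, length 0",
    "12:00:02.100000 IP 10.0.0.9.40113 > 10.0.0.5.389: Flags [R], seq 55, win 0, length 0",
    "12:00:02.100010 IP 10.0.0.9.40113 > 10.0.0.5.389: Flags [R], seq 55, win 0, length 0",
]

if __name__ == "__main__":
    for (addr, port), count in count_rst_senders(SAMPLE).most_common():
        print(f"{addr}:{port} sent {count} RST segment(s)")
```

Running the same tally on captures from both machines makes it easier to
tell whether the resets originate from the LDAP server, the clients, or
both.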

Cheers,
Pavlos




