Re: Debugging dropped shell connections over a VPN

2011-07-26 Thread Paul Keusemann

On 07/26/11 08:05, Gary Palmer wrote:

On Tue, Jul 26, 2011 at 06:53:59AM -0500, Paul Keusemann wrote:

Again, sorry for the sluggish response.

On 07/20/11 15:15, Gary Palmer wrote:

On Tue, Jul 12, 2011 at 02:26:34PM -0500, Paul Keusemann wrote:

On 07/07/11 14:39, Chuck Swiger wrote:

On Jul 7, 2011, at 4:45 AM, Paul Keusemann wrote:

My setup is something like this:
- My local network is a mix of AIX, HP-UX, Linux, FreeBSD and Solaris
machines running various OS versions.
- My gateway / firewall  machine is running FreeBSD-8.1-RELEASE-p1 with
ipfw, nat and racoon for the firewall and VPN.

The problem is that rlogin, ssh and telnet connections over the VPN get
dropped after some period of inactivity.

You're probably getting NAT timeouts against the VPN connection if it is
left idle.  racoon ought to have a config setting called natt_keepalive
which sends periodic keepalives-- see whether that's disabled.

Regards,

Thanks for the suggestions Chuck, sorry it's taken so long to respond
but I had to reconfigure and rebuild my kernel to enable IPSEC_NAT_T in
order to try this out.

One thing that I did not explicitly mention before is that I am routing
a network over the VPN.

Hi Paul,

Even if you are not being NAT'd on the VPN there may be a firewall (or
other active network component like a load balancer) with an
overflowing state table somewhere at the remote end.  We see this
frequently where I work with customer networks and the firewall/VPN/network
admin denies that it's a timeout issue, so there is likely some device in
the network that has a state table and if the connection is idle for a
few minutes it gets dropped.

Hmmm,  this seems likely.  Have you had any luck in finding the culprit
and resolving the problem?

Unfortunately no.  We know the problem exists but as a vendor we have
very little success in getting the customer to identify the problematic
device inside their network as it only seems to affect our connections
to them when we are helping them with problems, so there is almost
always something more important going on and the timeout issue gets put
on the back burner and forgotten.  We've worked around it in some
places by using the ssh 'ServerAliveInterval' directive to make ssh
send packets and keep the session open even if we're idle, but that
doesn't always work.


OK, I found the ClientAliveInterval and ClientAliveCountMax settings in
the sshd_config man page.  I assume these are what you are referring to.
I tried setting ClientAliveInterval to 15 seconds with
ClientAliveCountMax set to 3 and this seems to help.  I've only tried
this a couple of times but I have seen an ssh session stay alive for
over an hour.  The bad news is that the sessions are still getting
dropped, but at least now I know when it happens.  Now I'm getting the
following message:


Received disconnect from 10.64.20.69: 2: Timeout, your session not responding.


From a quick perusal of the openssh source, it is not obvious whether 
this message is coming from the client or the server side.   Initially, 
because the keep alive timer is a server side setting, I assumed the 
message was coming from the server side but if the session is not 
responding how is the message getting to the client?  If it is a client 
side problem, then I have much more flexibility to fix it.  All I can do is
whine about server side problems.
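
The arithmetic behind the settings quoted above can be sketched as follows.  This assumes the behaviour documented for sshd, where one probe is sent per ClientAliveInterval and the peer is disconnected once ClientAliveCountMax probes in a row go unanswered.  The function name alive_timeout_seconds is made up for illustration and is not part of OpenSSH.

```c
#include <assert.h>
#include <limits.h>

/*
 * Approximate number of seconds a silent peer is tolerated before the
 * disconnect: one probe per interval, count_max unanswered probes.
 * An interval of 0 disables the probes, so 0 here means "no timeout".
 */
static unsigned int
alive_timeout_seconds(unsigned int interval, unsigned int count_max)
{
	/* Guard the multiplication against unsigned wrap-around. */
	assert(count_max == 0 || interval <= UINT_MAX / count_max);
	return interval * count_max;
}
```

With an interval of 15 and a count of 3 this gives roughly 45 seconds, so a path that loses probes for about that long would be enough to trigger the disconnect even when the session itself is otherwise healthy.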


Paul



Gary
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"




--
Paul Keusemann  pkeu...@visi.com
4266 Joppa Court  (952) 894-7805
Savage, MN  55378



Re: kern/156978: [lagg][patch] Take lagg rlock before checking flags

2011-07-26 Thread maxim
Synopsis: [lagg][patch] Take lagg rlock before checking flags

State-Changed-From-To: open->patched
State-Changed-By: maxim
State-Changed-When: Tue Jul 26 14:52:18 UTC 2011
State-Changed-Why: 
thompsa@ has committed the patch to HEAD in r223846.

http://www.freebsd.org/cgi/query-pr.cgi?pr=156978


Re: m_pkthdr.rcvif dangling pointer problem

2011-07-26 Thread Luigi Rizzo
On Tue, Jul 26, 2011 at 10:09:09AM +0100, Robert N. M. Watson wrote:
> 
> On 25 Jul 2011, at 12:00, Daan Vreeken wrote:
> 
> > Couldn't the dangling pointer problem be solved by adding a 'generation' 
> > field 
> > to the mbuf structure?
> > The 'generation' could be a system-wide number that gets incremented 
> > whenever 
> > an interface is removed. The mbuf* functions could keep a (per CPU?) 
> > reference count on the number of mbufs allocated/freed during 
> > that 'generation'. After interface removal, the ifnet structure could be 
> > freed when all the reference counters of generations before the current 
> > generation reach zero (whenever that happens).
> 
> I think a hybrid approach makes sense, combining a number of the ideas we've 
> been kicking about:
> 
> (1) Add per-CPU ifnet refcounts that don't imply cache-line misses on each 
> mbuf alloc/free
> (2) Add optional subsystem drain functions so that subsystems that may have 
> unbounded queueing times for mbufs deterministically ensure reference 
> release, perhaps by substituting a common deadif for outstanding dying 
> references.
> 
> The former gives us actual correctness in terms of avoiding races, the latter 
> gives us deterministic freeing by subsystems that potentially queue mbufs 
> forever (e.g., TCP) but no longer require the ifnet reference.

I'd like to suggest that before doing all this work we could try
and see which subsystems have a real need to de-reference the
reference, which fields they use, and how often.

Because maybe just copying into the mbuf a blob of 8-16 bytes with
useful info (a cookie, fib index, some flags, etc) could perhaps cover the
majority of cases (in terms of usage frequency, not locations in the code)
and let us deal with other cases by looking up the cookie in some
data structure.

As an example:
- some functions just use rcvif to tell whether this is an incoming
  packet. No actual dereference;
- others might only care that rcvif equals some other (already refcounted)
  value, so we don't have a race there.
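
A minimal sketch of the blob idea, under stated assumptions: the field layout, the convention that cookie 0 means "not received on any interface", and the names rcvif_info, if_entry, pkt_is_incoming and if_lookup are all hypothetical and do not exist in the FreeBSD tree.  The linear table only stands in for "some data structure".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary copied into each mbuf instead of an ifnet pointer. */
struct rcvif_info {
	uint32_t cookie;	/* opaque interface id; 0 = not received */
	uint16_t fib;		/* routing table (FIB) index */
	uint16_t flags;		/* snapshot of a few interface flags */
};

/* Hypothetical table entry standing in for a live struct ifnet. */
struct if_entry {
	uint32_t cookie;
	const char *name;
};

/* Common case: "is this an incoming packet?" needs no dereference. */
static int
pkt_is_incoming(const struct rcvif_info *ri)
{
	return ri->cookie != 0;
}

/* Rare case: map the cookie back to the interface, NULL if it is gone. */
static const struct if_entry *
if_lookup(const struct if_entry *tbl, size_t n, uint32_t cookie)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].cookie == cookie)
			return &tbl[i];
	return NULL;
}
```

A stale cookie simply fails the lookup, so a packet that outlives its interface never touches freed memory.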

cheers
luigi


Re: Debugging dropped shell connections over a VPN

2011-07-26 Thread Gary Palmer
On Tue, Jul 26, 2011 at 06:53:59AM -0500, Paul Keusemann wrote:
> Again, sorry for the sluggish response.
> 
> On 07/20/11 15:15, Gary Palmer wrote:
> >On Tue, Jul 12, 2011 at 02:26:34PM -0500, Paul Keusemann wrote:
> >>On 07/07/11 14:39, Chuck Swiger wrote:
> >>>On Jul 7, 2011, at 4:45 AM, Paul Keusemann wrote:
> My setup is something like this:
> - My local network is a mix of AIX, HP-UX, Linux, FreeBSD and Solaris
> machines running various OS versions.
> - My gateway / firewall  machine is running FreeBSD-8.1-RELEASE-p1 with
> ipfw, nat and racoon for the firewall and VPN.
> 
> The problem is that rlogin, ssh and telnet connections over the VPN get
> dropped after some period of inactivity.
> >>>You're probably getting NAT timeouts against the VPN connection if it is
> >>>left idle.  racoon ought to have a config setting called natt_keepalive
> >>>which sends periodic keepalives-- see whether that's disabled.
> >>>
> >>>Regards,
> >>Thanks for the suggestions Chuck, sorry it's taken so long to respond
> >>but I had to reconfigure and rebuild my kernel to enable IPSEC_NAT_T in
> >>order to try this out.
> >>
> >>One thing that I did not explicitly mention before is that I am routing
> >>a network over the VPN.
> >Hi Paul,
> >
> >Even if you are not being NAT'd on the VPN there may be a firewall (or
> >other active network component like a load balancer) with an
> >overflowing state table somewhere at the remote end.  We see this
> >frequently where I work with customer networks and the firewall/VPN/network
> >admin denies that it's a timeout issue, so there is likely some device in
> >the network that has a state table and if the connection is idle for a
> >few minutes it gets dropped.
> 
> Hmmm,  this seems likely.  Have you had any luck in finding the culprit 
> and resolving the problem?

Unfortunately no.  We know the problem exists but as a vendor we have
very little success in getting the customer to identify the problematic
device inside their network as it only seems to affect our connections
to them when we are helping them with problems, so there is almost
always something more important going on and the timeout issue gets put
on the back burner and forgotten.  We've worked around it in some
places by using the ssh 'ServerAliveInterval' directive to make ssh
send packets and keep the session open even if we're idle, but that
doesn't always work.

Gary


Re: Debugging dropped shell connections over a VPN

2011-07-26 Thread Paul Keusemann

Again, sorry for the sluggish response.

On 07/20/11 15:15, Gary Palmer wrote:

On Tue, Jul 12, 2011 at 02:26:34PM -0500, Paul Keusemann wrote:

On 07/07/11 14:39, Chuck Swiger wrote:

On Jul 7, 2011, at 4:45 AM, Paul Keusemann wrote:

My setup is something like this:
- My local network is a mix of AIX, HP-UX, Linux, FreeBSD and Solaris
machines running various OS versions.
- My gateway / firewall  machine is running FreeBSD-8.1-RELEASE-p1 with
ipfw, nat and racoon for the firewall and VPN.

The problem is that rlogin, ssh and telnet connections over the VPN get
dropped after some period of inactivity.

You're probably getting NAT timeouts against the VPN connection if it is
left idle.  racoon ought to have a config setting called natt_keepalive
which sends periodic keepalives-- see whether that's disabled.

Regards,

Thanks for the suggestions Chuck, sorry it's taken so long to respond
but I had to reconfigure and rebuild my kernel to enable IPSEC_NAT_T in
order to try this out.

One thing that I did not explicitly mention before is that I am routing
a network over the VPN.

Hi Paul,

Even if you are not being NAT'd on the VPN there may be a firewall (or
other active network component like a load balancer) with an
overflowing state table somewhere at the remote end.  We see this
frequently where I work with customer networks and the firewall/VPN/network
admin denies that it's a timeout issue, so there is likely some device in
the network that has a state table and if the connection is idle for a
few minutes it gets dropped.


Hmmm,  this seems likely.  Have you had any luck in finding the culprit 
and resolving the problem?




Regards,

Gary




--
Paul Keusemann  pkeu...@visi.com
4266 Joppa Court  (952) 894-7805
Savage, MN  55378



Re: Debugging dropped shell connections over a VPN

2011-07-26 Thread Paul Keusemann
Once again, apologies for my sluggish response.  The VPN problem is a 
background job worked on when I can or when I'm too annoyed by it to do 
anything else.


On 07/12/11 17:42, Chuck Swiger wrote:

On Jul 12, 2011, at 12:26 PM, Paul Keusemann wrote:

So, any other ideas on how to debug this?

Gather data with tcpdump.  If you do it on one of the VPN endpoints, you ought 
to see the VPN contents rather than just packets going by in the encrypted 
tunnel.



I assume by endpoint, you are talking about the target of the remote 
shell.  Unfortunately, running tcpdump on the endpoint shows only the 
initial negotiation (and any interactive keyboard traffic) but nothing 
to indicate the connection has been dropped or timed out.


If I can get some time when I don't actually need to use the VPN for 
work, I'm going to try to run tcpdump on the tunnel to see if there's 
anything going across it that might shed some light on the cause of the 
dropped connections.



Anybody know how to get racoon to log everything to one file?  Right now, 
depending on the log level, I am getting messages in racoon.log (specified with 
-l at startup), messages and debug.log.  It would really be nice to have just 
one log to look at.

This is likely governed by /etc/syslog.conf, but if you specify -l then racoon 
shouldn't use syslog logging.


My syslog.conf foo is not good but it seems that some stuff  from racoon 
always ends up in the messages file, even when the -l option to racoon 
is specified.


Thanks again for the tips.

--
Paul Keusemann  pkeu...@visi.com
4266 Joppa Court  (952) 894-7805
Savage, MN  55378



Re: m_pkthdr.rcvif dangling pointer problem

2011-07-26 Thread Robert N. M. Watson

On 25 Jul 2011, at 12:00, Daan Vreeken wrote:

> Couldn't the dangling pointer problem be solved by adding a 'generation' 
> field 
> to the mbuf structure?
> The 'generation' could be a system-wide number that gets incremented whenever 
> an interface is removed. The mbuf* functions could keep a (per CPU?) 
> reference count on the number of mbufs allocated/freed during 
> that 'generation'. After interface removal, the ifnet structure could be 
> freed when all the reference counters of generations before the current 
> generation reach zero (whenever that happens).
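
The generation proposal quoted above can be modelled roughly as below.  This is a single-threaded sketch, not kernel code: it ignores the per-CPU counters and all locking, and the names gen_state, gen_mbuf_alloc, gen_mbuf_free, gen_if_remove and gen_can_free_ifnet, as well as the MAX_GEN bound, are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_GEN 8	/* hypothetical bound; keeps the sketch allocation-free */

struct gen_state {
	uint32_t current;	/* bumped on every interface removal */
	uint32_t live[MAX_GEN];	/* mbufs allocated per generation, not yet freed */
};

/* Allocation: count the mbuf against the current generation and tag it. */
static uint32_t
gen_mbuf_alloc(struct gen_state *gs)
{
	gs->live[gs->current]++;
	return gs->current;
}

/* Free: drop the count of the generation the mbuf was allocated in. */
static void
gen_mbuf_free(struct gen_state *gs, uint32_t gen)
{
	assert(gen < MAX_GEN && gs->live[gen] > 0);
	gs->live[gen]--;
}

/* Interface removal: open a new generation, return the one just closed. */
static uint32_t
gen_if_remove(struct gen_state *gs)
{
	assert(gs->current + 1 < MAX_GEN);
	return gs->current++;
}

/* The dying ifnet may go once generations 0..gen hold no live mbufs. */
static bool
gen_can_free_ifnet(const struct gen_state *gs, uint32_t gen)
{
	for (uint32_t g = 0; g <= gen; g++)
		if (gs->live[g] != 0)
			return false;
	return true;
}
```

Note that mbufs allocated after the removal never delay the free, but a single mbuf queued indefinitely in an older generation blocks it forever, which is the unbounded-queueing case the drain functions are meant to address.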

I think a hybrid approach makes sense, combining a number of the ideas we've 
been kicking about:

(1) Add per-CPU ifnet refcounts that don't imply cache-line misses on each mbuf 
alloc/free
(2) Add optional subsystem drain functions so that subsystems that may have 
unbounded queueing times for mbufs deterministically ensure reference release, 
perhaps by substituting a common deadif for outstanding dying references.

The former gives us actual correctness in terms of avoiding races, the latter 
gives us deterministic freeing by subsystems that potentially queue mbufs 
forever (e.g., TCP) but no longer require the ifnet reference.

Robert


Re: What does define COMMENT_ONLY mean?

2011-07-26 Thread Vladimir Budnev

On 07/23/11 04:21, Bruce Evans wrote:


C didn't support variable-sized structs before C99, and
doesn't really support them now.  Various hacks are used to make
pseudo-structs larger or smaller than ones that can actually be
declared work.  The above is one.  The pseudo-struct is malloc()ed
and has size larger than the declared one.  The above says what
would be in it if it could be declared.

If this were written in C99, it might declare u_char ar_foo[] in the
code instead of in a comment.  But C can't really support variable-
sized structs.  It only allows one ar_foo[], which must be at the end
of the struct.  ar_foo then has a known offset but an unknown size.
The other ar_bar[]'s still can't be declared, since they want to be
further beyond the end of the struct, which places them at an unknown
offset.
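
For illustration, a minimal sketch of the single trailing array that C99 does allow (a flexible array member).  The struct record and the helper record_new are hypothetical names chosen for this example and have nothing to do with the header under discussion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: fixed header followed by a variable-length payload. */
struct record {
	size_t len;		/* number of valid bytes in data[] */
	unsigned char data[];	/* C99 flexible array member; must be last */
};

/* Allocate header and payload with one malloc(); the caller free()s it. */
static struct record *
record_new(const void *src, size_t len)
{
	struct record *r;

	assert(src != NULL || len == 0);
	r = malloc(sizeof(*r) + len);	/* sizeof excludes data[] */
	if (r == NULL)
		return NULL;
	r->len = len;
	if (len > 0)
		memcpy(r->data, src, len);
	return r;
}
```

The offset of data[] is known at compile time but its size is not, which is why a second variable-length field after it cannot be declared and has to be carved out of the same trailing storage by hand.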

A probably-less-unportable way was to declare everything in the struct
but malloc() less.  This only works if all the magic fields are at
known offsets.  This doesn't work in the above, since the fields want
to have variable lengths each and thus end up at variable offsets.
Such fields can be allocated from a single larger field (usually an
array), but you lose the possibility of declaring them all.

Bruce


I got the idea with "dynamic size", thanks :) But comment_only ... ah,
never mind. Thanks for the explanation, Bruce.
