Re: kern/127050: [carp] ipv6 does not work on carp interfaces [regression]

2011-08-22 Thread Paul Herman

On 8/21/2011 1:47 AM, Ask Bjørn Hansen wrote:

> On Aug 19, 2011, at 1:30, Paul Herman wrote:
>
>> --010305010708060807000808
>> Content-Type: application/gzip;
>>   name=carp_ip6_alias.patch.gz
>> Content-Transfer-Encoding: base64
>> Content-Disposition: attachment;
>>   filename=carp_ip6_alias.patch.gz
>
> I wanted to try it, but gzip doesn't seem to like that file …
>
> (downloaded from http://www.freebsd.org/cgi/query-pr.cgi?pr=127050&cat= )


It's base64 encoded of course -- works for me when I pipe the text into
  openssl base64 -d | zcat

(zipped to preserve white spacing) For those craving instant 
satisfaction, here it is in plain text.


-Paul.

--- sys/netinet/ip_carp.c.orig  2011-08-19 07:52:56.0 +
+++ sys/netinet/ip_carp.c 2011-08-19 07:15:03.0 +
@@ -1670,9 +1670,11 @@
 	struct carp_if *cif;
 	struct in6_ifaddr *ia, *ia_if;
 	struct ip6_moptions *im6o = &sc->sc_im6o;
+	struct in6_multi *in6m;
 	struct in6_addr in6;
 	int own, error;

+
 	error = 0;

 	if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) {
@@ -1729,8 +1731,6 @@
 	}

 	if (!sc->sc_naddrs6) {
-		struct in6_multi *in6m;
-
 		im6o->im6o_multicast_ifp = ifp;

 		/* join CARP multicast address */
@@ -1745,24 +1745,24 @@
 			goto cleanup;
 		im6o->im6o_membership[0] = in6m;
 		im6o->im6o_num_memberships++;
-
-		/* join solicited multicast address */
-		bzero(&in6, sizeof(in6));
-		in6.s6_addr16[0] = htons(0xff02);
-		in6.s6_addr32[1] = 0;
-		in6.s6_addr32[2] = htonl(1);
-		in6.s6_addr32[3] = sin6->sin6_addr.s6_addr32[3];
-		in6.s6_addr8[12] = 0xff;
-		if (in6_setscope(&in6, ifp, NULL) != 0)
-			goto cleanup;
-		in6m = NULL;
-		error = in6_mc_join(ifp, &in6, NULL, &in6m, 0);
-		if (error)
-			goto cleanup;
-		im6o->im6o_membership[1] = in6m;
-		im6o->im6o_num_memberships++;
 	}

+	/* join solicited multicast address */
+	bzero(&in6, sizeof(in6));
+	in6.s6_addr16[0] = htons(0xff02);
+	in6.s6_addr32[1] = 0;
+	in6.s6_addr32[2] = htonl(1);
+	in6.s6_addr32[3] = sin6->sin6_addr.s6_addr32[3];
+	in6.s6_addr8[12] = 0xff;
+	if (in6_setscope(&in6, ifp, NULL) != 0)
+		goto cleanup;
+	in6m = NULL;
+	error = in6_mc_join(ifp, &in6, NULL, &in6m, 0);
+	if (error)
+		goto cleanup;
+	im6o->im6o_membership[1] = in6m;
+	im6o->im6o_num_memberships++;
+
 	if (!ifp->if_carp) {
 		cif = malloc(sizeof(*cif), M_CARP,
 		    M_WAITOK|M_ZERO);
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: kern/127050: [carp] ipv6 does not work on carp interfaces [regression]

2011-08-19 Thread Paul Herman
The following reply was made to PR kern/127050; it has been noted by GNATS.

From: Paul Herman pher...@frenchfries.net
To: bug-follo...@freebsd.org
Cc: Wouter de Jong maddo...@maddog2k.net, Jacek Zapala ja...@it.pl
Subject: Re: kern/127050: [carp] ipv6 does not work on carp interfaces 
[regression]
Date: Fri, 19 Aug 2011 10:13:46 +0200

 This is a multi-part message in MIME format.
 --010305010708060807000808
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 7bit
 
 This one's been bothering me too, so I set up a test machine yesterday to 
 figure out what's going on.  I see two problems here.
 
 First of all, in6_ifinit() (called via in6_control() SIOCAIFADDR_IN6 -> 
 in6_update_ifa()) only calls carp_ioctl() on the first IPv6 address, 
 whereas in the v4 case, carp_ioctl() does get called by in_ifinit() 
 every time.
 
 Second of all, carp_set_addr6() (called via carp_ioctl()) only joins the 
 CARP multicast group AND the solicitation group with the first address.
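
 To illustrate why every alias needs its own solicitation-group join, here
 is a small user-space sketch of the same address arithmetic the patch
 performs.  It is not kernel code: solicited_node() and
 solicited_node_str() are hypothetical helper names, the portable s6_addr
 byte array stands in for the kernel-only s6_addr16/s6_addr32 members, and
 the in6_setscope() step is left out.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: derive the solicited-node multicast group
 * ff02::1:ffXX:XXXX from the low 24 bits of a unicast address.
 * Only the portable s6_addr byte array is used.
 */
static void
solicited_node(const struct in6_addr *ucast, struct in6_addr *group)
{
	memset(group, 0, sizeof(*group));
	group->s6_addr[0] = 0xff;	/* ff02::/16, link-local scope */
	group->s6_addr[1] = 0x02;
	group->s6_addr[11] = 0x01;	/* ...:1:... */
	group->s6_addr[12] = 0xff;	/* ...:ffXX:XXXX */
	/* The last three bytes come from the unicast address. */
	memcpy(&group->s6_addr[13], &ucast->s6_addr[13], 3);
}

/* Text wrapper around solicited_node(); returns 0 on success, -1 on error. */
static int
solicited_node_str(const char *ucast, char *buf, socklen_t len)
{
	struct in6_addr in, out;

	if (inet_pton(AF_INET6, ucast, &in) != 1)
		return (-1);
	solicited_node(&in, &out);
	return (inet_ntop(AF_INET6, &out, buf, len) != NULL ? 0 : -1);
}
```

 Two aliases that differ in their low 24 bits (say 2001:db8::1:2:3 and
 2001:db8::1:2:4) map to different groups, so a join made only for the
 first address leaves neighbor solicitations for the second one unheard.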
 
 The attached patch against 8-STABLE fixes these issues; alias IPs are 
 now pingable.  I have not tested actual carp functionality with a 2nd 
 BACKUP carp.  What this patch doesn't address is group membership 
 removal when alias IPs are deleted (apparently also broken).
 
 Try it out.  If it works for you guys, maybe someone more familiar with 
 in[6]_control() can chime in here and comment on a *real* way to fix 
 this issue.  :-)
 
 Cheers,
 
 -Paul.
 
 --010305010708060807000808
 Content-Type: application/gzip;
  name=carp_ip6_alias.patch.gz
 Content-Transfer-Encoding: base64
 Content-Disposition: attachment;
  filename=carp_ip6_alias.patch.gz
 
 H4sICDwXTk4CA2NhcnBfaXA2X2FsaWFzLnBhdGNoAJVVa0+jQBT9TH/F9UsDHbDQWvramhof
 CVlfsZpN1hiCCDobYAhD42Pd/753mNIHtlZJyuvee+7pOXcGwzCAv/JmEuQUf3aTJvauv8sy
 +qi0TMsyzJ5h9cHsDszeoG3umuUBRJxrhJA19dXS/qDT/lA6HoNhdffaehfI7Doe16AGCvWM
 feq53sNDBiNocMQcYsBQaAgqDT2fTZMcfozAgnodaJhieuhS5ueRBn9rpMirJL6/zxPz1zSA
 0Qick2v38ODqUluLAooSZBkTDNTGahShUx0mzsXhxDk5ODq60kH1BVs316inDUWtoFDUz7AU
 nkYvKsegsSp5k6au72XpetU7rUHH3qb6AqJSbXUG5gbh7a6p91F4cbWsQnmF59nUz0FAuci/
 4dNwuHiNLuDbwpQG9XRxclcTUtuNWZpTlnCMxjZD7ercN/a574rHIVqzBBZPo5xiYmLHlTZF
 E+m6QtFB9pzoUKgp5oDg29IacygnBtV2zm3XmbjCD/fmfHJ5fOicOMdHal3MD3IogTXhiBy+
 Vl/vieFrW7otJfg3h9uZEU9EDbdFkaFspG8UnuN/xDHBswz6Hs9RohR54lkyVZoN+MNoAmLy
 YJ4GokvAUbdmuTA6emuvWBnFTcFOUR5ZzsCPAi+ZpsNqzyC+DzL+RNNb8070nAm7nJNM46U8
 ToigbixYcRZRn+bBw3pqmHn/FmRMrSO4Dpy+BSxU8V4Tg42dcPVzKbNlSxZPOY6Dar6Eodn6
 kNRu3Vp30sYPgVZZHanWusK2iFe8XQlXa3q3VoEpuMiY2E6wkgc591kazP5VsbrPb05PNdjB
 bE2kVoSXwDGCibziuZzIYjR8V6gp9wkJKvLkfawj5ry/3CM2tFjrrbXw1tjmrRxo8lV7ySfu
 ki+YSzZ4Sz63lnzfWbLZWPJlX0lVc7LqKvm2qWTZ03Xw2xwlWwwl5e5UfpHEZj37xOB2jTix
 F0XMV2fmiT1c0+Gs+NDpIgvwOHN/HTjXFz/fz9zfx1cXSPw/VIIrJwgIAAA=
 --010305010708060807000808--


Question on TCP segment sizes

2003-09-22 Thread Paul Herman
Hi -net,

While pondering sending large files across a jumbo frame network
using FreeBSD, I decided to first see how well the loopback
interface does.  Using ttcp on the same machine:

  ttcp -s -r 
  ttcp -s -t 127.0.0.1

I noticed that although the MSS is 16k, I never see a full 16k
segment.  Instead, each 16k of data is sent as one 2k segment
followed by one 14k segment, as the following tcpdump output shows:

.706018 127.0.0.1.1026 > 127.0.0.1.5001: S 3254883299:3254883299(0) win 57344 <mss 16344,nop,wscale 0,nop,nop,timestamp 112512 0> (DF)
.706108 127.0.0.1.5001 > 127.0.0.1.1026: S 2952251081:2952251081(0) ack 3254883300 win 57344 <mss 16344,nop,wscale 0,nop,nop,timestamp 112512 112512> (DF)
.706140 127.0.0.1.1026 > 127.0.0.1.5001: . ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
.707454 127.0.0.1.1026 > 127.0.0.1.5001: P 1:8193(8192) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
.708308 127.0.0.1.1026 > 127.0.0.1.5001: P 8193:22529(14336) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
.708346 127.0.0.1.5001 > 127.0.0.1.1026: . ack 22529 win 43008 <nop,nop,timestamp 112512 112512> (DF)
.708375 127.0.0.1.1026 > 127.0.0.1.5001: P 22529:24577(2048) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
.708508 127.0.0.1.1026 > 127.0.0.1.5001: P 24577:38913(14336) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
.708530 127.0.0.1.5001 > 127.0.0.1.1026: . ack 38913 win 43008 <nop,nop,timestamp 112512 112512> (DF)
.708549 127.0.0.1.1026 > 127.0.0.1.5001: P 38913:40961(2048) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
.708617 127.0.0.1.1026 > 127.0.0.1.5001: P 40961:55297(14336) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
.708638 127.0.0.1.5001 > 127.0.0.1.1026: . ack 55297 win 43008 <nop,nop,timestamp 112512 112512> (DF)
[...repeats...]

The same phenomenon happens with FTP, so I don't think there's any
voodoo going on with ttcp.  Interestingly enough, raising the MTU
on lo0 above 16k changes nothing.

Also, IPv6 (FTP) shows similar behavior:

.449842 ::1.1036 > ::1.49153: P 92161:93185(1024) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
.449992 ::1.1036 > ::1.49153: P 93185:107521(14336) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
.450018 ::1.49153 > ::1.1036: . ack 107521 win 43008 <nop,nop,timestamp 292774 292774> [flowlabel 0x645a7]
.450099 ::1.1036 > ::1.49153: P 107521:108545(1024) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
.450250 ::1.1036 > ::1.49153: P 108545:122881(14336) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
.450275 ::1.49153 > ::1.1036: . ack 122881 win 43008 <nop,nop,timestamp 292774 292774> [flowlabel 0x645a7]
.450354 ::1.1036 > ::1.49153: P 122881:123905(1024) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
.450504 ::1.1036 > ::1.49153: P 123905:138241(14336) ack 1 win 57344 <nop,nop,timestamp 292774 292774>

So the question is, why don't we get two full-sized 16344-byte (or
at least two 14336-byte) segments?  It seems it would be more
efficient that way, no?

4.9-PRERELEASE from yesterday, all sysctls are system defaults...

-Paul.


Re: delayed ACK

2002-10-15 Thread Paul Herman

On Mon, 14 Oct 2002, Steve Francis wrote:

> Kirill Ponomarew wrote:
>
> > is it recommended to use net.inet.tcp.delayed_ack=0 on the machines with
> > heavy network traffic ?
>
> If you want to increase your network traffic for no particular reason,
> and increase load on your server, then yes.
>
> Otherwise no.

Not true.  Although some bugs have been fixed in 4.3, FreeBSD's
delayed ACKs will still degrade your performance dramatically in
some cases.

For now, the best advice I could give is to benchmark your client
machine with and without delayed ACKs and see which works best for
your environment.

-Paul.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-net in the body of the message



Re: delayed ACK

2002-10-15 Thread Paul Herman

On Tue, 15 Oct 2002, Lars Eggert wrote:

> Paul Herman wrote:
>
> > Not true.  Although some bugs have been fixed in 4.3, FreeBSD's
> > delayed ACKs will still degrade your performance dramatically in
> > some cases.
>
> I'm sorry, but such statements without a packet trace that exhibits the
> problem are just not useful.

/me reels line back in

Aha! Another victim who is willing to take a look at this! :-)

It's an issue that was left unresolved in kern/24645.  Bruce Evans
brought this to my attention back during the unrelated "I have
delayed ACK problems" thread on -net in January of 2001 and I then
passed it on to jlemon.  If you need a packet trace, let me know,
but you should be able to reproduce it yourself.  Even today on my
4.7-PRERELEASE I still get:

  mammoth# sysctl net.inet.tcp.delayed_ack=0
  net.inet.tcp.delayed_ack: 1 -> 0
  mammoth# time tar cf 127.0.0.1:/tmp/foo /kernel
  0.000u 0.041s 0:00.33 12.1% 350+300k 0+0io 0pf+0w

  mammoth# sysctl net.inet.tcp.delayed_ack=1
  net.inet.tcp.delayed_ack: 0 -> 1
  mammoth# time tar cf 127.0.0.1:/tmp/foo /kernel
  0.014u 0.033s 0:45.90 0.0%  700+600k 0+0io 0pf+0w
                ^^^^^^^

It seems that lowering the lo0 MTU to 1500 makes this particular
problem go away.  The magic MTU size is 2100.  This makes me think
that this is a big problem across GigE using 8K jumbo frames, though
I'm not sure.  Also, tarring over the IPv6 lo0 interface seems to
work OK.

No idea what causes this.

-Paul.





Re: arp_rtrequest: bad gateway value

2001-11-26 Thread Paul Herman

On Fri, 23 Nov 2001, Paul Herman wrote:

> On Thu, 22 Nov 2001, Ruslan Ermilov wrote:
>
> > On Wed, Nov 21, 2001 at 05:32:27PM -0800, Paul Herman wrote:
> > > Hi,
> > >
> > > I'd like to pick some brains before I file a PR.
> >
> > There's already a PR open on this, kern/29170.
>
> > [...]
>

Here's a patch against 4.4-RELEASE that fixes this problem.  As
mentioned before, the problem happens when a gateway with the
RTF_LLINFO flag set gets polluted with non-link information.  routed
and route are both culprits.  BTW, KAME does this as well by putting
AF_INET6 data into gateways with the RTF_LLINFO flag set, which I
don't think is a good idea, but it calls rt_setgate() directly and
isn't affected by this patch.

I've decided to have the kernel leave the gateway untouched and
continue, rather than having the kernel return EINVAL.  This
produces the least astonishment :-)
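
The decision described above can be modelled outside the kernel as a
small, self-contained sketch.  Everything prefixed x_/X_ below is a
hypothetical stand-in invented for illustration (the real types and
constants live in the kernel headers), and choose_gateway() is not a
function in rtsock.c; it only mirrors the condition the patch adds in
front of rt_setgate().

```c
#include <stddef.h>

/* Illustrative stand-ins only; not the kernel's definitions. */
#define X_RTF_LLINFO	0x400
#define X_AF_INET	2
#define X_AF_LINK	18

struct x_sockaddr {
	unsigned char sa_family;
};

struct x_rtentry {
	int rt_flags;
	const struct x_sockaddr *rt_gateway;
};

/*
 * Return the gateway that would be passed on to rt_setgate():
 * a non-link gateway offered for an RTF_LLINFO route is silently
 * replaced by the route's existing gateway instead of failing.
 */
static const struct x_sockaddr *
choose_gateway(const struct x_rtentry *rt, const struct x_sockaddr *gate)
{
	if (gate != NULL && (rt->rt_flags & X_RTF_LLINFO) &&
	    gate->sa_family != X_AF_LINK)
		return (rt->rt_gateway);	/* leave the gateway alone */
	return (gate);
}
```

Returning the existing gateway rather than an error is what keeps
callers such as routed and route working unchanged, which is the
"least astonishment" choice made here.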

Please review and if it's OK, I'll send it to gnats for the audit
trail.  Thanks,

-Paul.

Index: sys/net/rtsock.c
===================================================================
RCS file: /mnt/ncvs/src/sys/net/rtsock.c,v
retrieving revision 1.44.2.4
diff -u -r1.44.2.4 rtsock.c
--- sys/net/rtsock.c	2001/07/11 09:37:37	1.44.2.4
+++ sys/net/rtsock.c	2001/11/27 01:33:03
@@ -399,6 +399,14 @@
 			break;

 		case RTM_CHANGE:
+			/* Don't let the user specify non-link information
+			 * for a gateway if the RTF_LLINFO flag is set.
+			 * We'll just leave the gateway alone.
+			 */
+			if (gate && (rt->rt_flags & RTF_LLINFO) &&
+			    gate->sa_family != AF_LINK)
+				gate = rt->rt_gateway;
+
 			if (gate && (error = rt_setgate(rt, rt_key(rt), gate)))
 				senderr(error);








Re: t/tcp behaviour has changed in -current and -stable

2001-05-12 Thread Paul Herman

On Thu, 10 May 2001, jayanth wrote:

> I would like to back out the DELAY_ACK macro changes for now. The
> ttcp code behavior has changed because of the addition of the
> DELAY_ACK macro.
>
> The macro is forcing the ttcp code to send an immediate SYN, ACK
> for an initial ttcp segment which has the SYN/FIN/PSH flag set.
> Instead the SYN,ACK should be delayed such that next segment
> should be SYN/FIN/PSH from the server side.

I'm hesitant to comment, because I don't really have a patch :-) but
I'll give it a whirl anyway.

Thing is, delack is still broken.  It just isn't nearly as broken now
as it was before the patch.  I'm pretty sure that what you are seeing
now worked before only because of the brokenness.  The way to go here
is to fix this problem.  Besides, if you back out the latest delack
change, ttcp will be affected by the old delack problems anyway.

I don't have a testbed for ttcp, so I'm flying blind, but how about
where you wrote:

> if (DELAY_ACK(tp) && (tp->t_flags & TF_NEEDSYN))
>
> At this point the DELAY_ACK macro returns false because there is a
> callout_pending(). Hence the TF_ACKNOW will be set.

If the new ttcp SYN,ACKs *always* get delayed, having something like:

  if ( (DELAY_ACK(tp) || this_is_a_ttcp_connection) &&
       (tp->t_flags & TF_NEEDSYN))

 - or -

  if ( tcp_delack_enabled &&
       (!callout_pending(...) || this_is_a_ttcp_connection) &&
       (tp->t_flags & TF_NEEDSYN))

...depending on what The Right Thing is when tcp_delack_enabled = 0.
I don't know ttcp, so of course this_is_a_ttcp_connection should be
replaced with the corresponding boolean.
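
To make the difference between the two variants concrete, here is a
stand-alone model of both conditions.  It is only a sketch: struct
delack_state and its fields are hypothetical stand-ins for kernel
state, and delay_ack() assumes that DELAY_ACK(tp) amounts to
"tcp_delack_enabled and no callout pending", which is inferred from
this thread rather than copied from the kernel source.

```c
#include <stdbool.h>

/* Inputs to the decision; every field is a stand-in for kernel state. */
struct delack_state {
	bool delack_enabled;	/* tcp_delack_enabled */
	bool callout_pending;	/* delayed-ACK timer already armed */
	bool is_ttcp;		/* placeholder for this_is_a_ttcp_connection */
	bool needsyn;		/* tp->t_flags & TF_NEEDSYN */
};

/* Assumed shape of DELAY_ACK(tp); see the note above. */
static bool
delay_ack(const struct delack_state *s)
{
	return (s->delack_enabled && !s->callout_pending);
}

/* First variant: a ttcp connection always delays the SYN,ACK. */
static bool
variant_a(const struct delack_state *s)
{
	return ((delay_ack(s) || s->is_ttcp) && s->needsyn);
}

/* Second variant: nothing is delayed when delayed ACKs are disabled. */
static bool
variant_b(const struct delack_state *s)
{
	return (s->delack_enabled &&
	    (!s->callout_pending || s->is_ttcp) && s->needsyn);
}

/* Count the input combinations on which the two variants disagree. */
static int
count_disagreements(void)
{
	int n = 0;

	for (int bits = 0; bits < 16; bits++) {
		struct delack_state s = {
			(bits & 1) != 0, (bits & 2) != 0,
			(bits & 4) != 0, (bits & 8) != 0
		};
		if (variant_a(&s) != variant_b(&s))
			n++;
	}
	return (n);
}
```

Under that assumption the variants differ only for a ttcp connection
with TF_NEEDSYN set while tcp_delack_enabled is 0: the first still
delays the SYN,ACK, the second does not.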

-Paul.





Re: I have delayed ACK problems

2001-02-25 Thread Paul Herman

On Sat, 24 Feb 2001, Jonathan Lemon wrote:

> On Sat, Feb 24, 2001 at 11:19:02AM -0800, Mark Peek wrote:
> > Was there ever a final resolution to this problem?
>
> The patches are still sitting in my tree, as I've been unable
> to come up with a test case that actually makes a difference.
>
> The "tar cf host:..." example is bogus, as the problem here is

Jonathan is right, the patch doesn't solve the general "tar cf host:"
problem, but it was similar enough to what we were seeing in
production -- changing the MTU on lo0 to 1500 will make the
"tar cf host:" problem/solution more apparent, when host == localhost.

In any case, we are very happy with the patch on our production
servers, as it really did solve our problem.  I believe the patch is
100% correct; it just doesn't fix 100% of the delayed ACK problems.

-Paul.

