Re: kern/127050: [carp] ipv6 does not work on carp interfaces [regression]
On 8/21/2011 1:47 AM, Ask Bjørn Hansen wrote:
> On Aug 19, 2011, at 1:30, Paul Herman wrote:
>
>> --010305010708060807000808
>> Content-Type: application/gzip; name=carp_ip6_alias.patch.gz
>> Content-Transfer-Encoding: base64
>> Content-Disposition: attachment; filename=carp_ip6_alias.patch.gz
>
> I wanted to try it, but gzip doesn't seem to like that file …
> (downloaded from http://www.freebsd.org/cgi/query-pr.cgi?pr=127050cat= )

It's base64 encoded of course -- works for me when I pipe the text into

  openssl base64 -d | zcat

(zipped to preserve white spacing)

For those craving instant satisfaction, here it is in plain text.

-Paul.

--- sys/netinet/ip_carp.c.orig 2011-08-19 07:52:56.0 +
+++ sys/netinet/ip_carp.c 2011-08-19 07:15:03.0 +
@@ -1670,9 +1670,11 @@
 	struct carp_if *cif;
 	struct in6_ifaddr *ia, *ia_if;
 	struct ip6_moptions *im6o = &sc->sc_im6o;
+	struct in6_multi *in6m;
 	struct in6_addr in6;
 	int own, error;
+	error = 0;
 	if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) {
@@ -1729,8 +1731,6 @@
 	}
 	if (!sc->sc_naddrs6) {
-		struct in6_multi *in6m;
-
 		im6o->im6o_multicast_ifp = ifp;
 		/* join CARP multicast address */
@@ -1745,24 +1745,24 @@
 			goto cleanup;
 		im6o->im6o_membership[0] = in6m;
 		im6o->im6o_num_memberships++;
-
-		/* join solicited multicast address */
-		bzero(&in6, sizeof(in6));
-		in6.s6_addr16[0] = htons(0xff02);
-		in6.s6_addr32[1] = 0;
-		in6.s6_addr32[2] = htonl(1);
-		in6.s6_addr32[3] = sin6->sin6_addr.s6_addr32[3];
-		in6.s6_addr8[12] = 0xff;
-		if (in6_setscope(&in6, ifp, NULL) != 0)
-			goto cleanup;
-		in6m = NULL;
-		error = in6_mc_join(ifp, &in6, NULL, &in6m, 0);
-		if (error)
-			goto cleanup;
-		im6o->im6o_membership[1] = in6m;
-		im6o->im6o_num_memberships++;
 	}
+	/* join solicited multicast address */
+	bzero(&in6, sizeof(in6));
+	in6.s6_addr16[0] = htons(0xff02);
+	in6.s6_addr32[1] = 0;
+	in6.s6_addr32[2] = htonl(1);
+	in6.s6_addr32[3] = sin6->sin6_addr.s6_addr32[3];
+	in6.s6_addr8[12] = 0xff;
+	if (in6_setscope(&in6, ifp, NULL) != 0)
+		goto cleanup;
+	in6m = NULL;
+	error = in6_mc_join(ifp, &in6, NULL, &in6m, 0);
+	if (error)
+		goto cleanup;
+	im6o->im6o_membership[1] = in6m;
+	im6o->im6o_num_memberships++;
+
 	if (!ifp->if_carp) {
 		cif = malloc(sizeof(*cif), M_CARP, M_WAITOK|M_ZERO);

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
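The "join solicited multicast address" lines in the patch build the solicited-node multicast address (ff02::1:ffXX:XXXX) from the low 24 bits of the unicast address. The following is a minimal userland sketch of that same byte layout, not kernel code: the function names are hypothetical, it uses the portable s6_addr byte array instead of s6_addr16/s6_addr32, and it leaves out the in6_setscope() and in6_mc_join() steps.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Build the solicited-node multicast address for a unicast address:
 * ff02:0:0:0:0:1:ff00:0/104 plus the low 24 bits of the unicast address.
 * Hypothetical helper for illustration; the kernel patch additionally
 * scopes the address to the interface before joining the group.
 */
static void
solicited_node_addr(const struct in6_addr *unicast, struct in6_addr *out)
{
	assert(unicast != NULL && out != NULL);

	memset(out, 0, sizeof(*out));
	out->s6_addr[0] = 0xff;		/* ff02: link-local scope multicast */
	out->s6_addr[1] = 0x02;
	out->s6_addr[11] = 0x01;	/* bytes 8..11 hold htonl(1) */
	out->s6_addr[12] = 0xff;	/* fixed 0xff above the copied bits */
	memcpy(&out->s6_addr[13], &unicast->s6_addr[13], 3);
}

/*
 * Text-to-text wrapper (also hypothetical). Returns 0 on success and -1
 * if the input does not parse or the output buffer is too small.
 */
static int
solicited_node_str(const char *unicast, char *buf, socklen_t len)
{
	struct in6_addr in, out;

	if (inet_pton(AF_INET6, unicast, &in) != 1)
		return (-1);
	solicited_node_addr(&in, &out);
	return (inet_ntop(AF_INET6, &out, buf, len) != NULL ? 0 : -1);
}
```

Because the group depends on the low 24 bits of each address, an alias address generally maps to a different group than the first address, which is consistent with the patch moving the join out of the first-address-only branch.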
Re: kern/127050: [carp] ipv6 does not work on carp interfaces [regression]
The following reply was made to PR kern/127050; it has been noted by GNATS.

From: Paul Herman pher...@frenchfries.net
To: bug-follo...@freebsd.org
Cc: Wouter de Jong maddo...@maddog2k.net, Jacek Zapala ja...@it.pl
Subject: Re: kern/127050: [carp] ipv6 does not work on carp interfaces [regression]
Date: Fri, 19 Aug 2011 10:13:46 +0200

This is a multi-part message in MIME format.
--010305010708060807000808
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

This one's been bothering me too, so I set up a test machine yesterday to figure out what's going on.

I see two problems here. First of all, in6_ifinit() (called via in6_control() SIOCAIFADDR_IN6 -> in6_update_ifa()) only calls carp_ioctl() on the first IPv6 address. In the v4 case, by contrast, carp_ioctl() does get called by in_ifinit() every time.

Second of all, carp_set_addr6() (called via carp_ioctl()) only joins the CARP multicast group AND the solicitation group with the first address.

The attached patch against 8-STABLE fixes these issues; alias IPs are now pingable. I have not tested actual carp functionality with a 2nd BACKUP carp.

What this patch doesn't address is group membership removal when alias IPs are deleted (apparently also broken).

Try it out. If it works for you guys, maybe someone more familiar with in[6]_control() can chime in here and comment on a *real* way to fix this issue. :-)

Cheers,
-Paul.
--010305010708060807000808
Content-Type: application/gzip; name=carp_ip6_alias.patch.gz
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=carp_ip6_alias.patch.gz

H4sICDwXTk4CA2NhcnBfaXA2X2FsaWFzLnBhdGNoAJVVa0+jQBT9TH/F9UsDHbDQWvramhof
CVlfsZpN1hiCCDobYAhD42Pd/753mNIHtlZJyuvee+7pOXcGwzCAv/JmEuQUf3aTJvauv8sy
+qi0TMsyzJ5h9cHsDszeoG3umuUBRJxrhJA19dXS/qDT/lA6HoNhdffaehfI7Doe16AGCvWM
feq53sNDBiNocMQcYsBQaAgqDT2fTZMcfozAgnodaJhieuhS5ueRBn9rpMirJL6/zxPz1zSA
0Qick2v38ODqUluLAooSZBkTDNTGahShUx0mzsXhxDk5ODq60kH1BVs316inDUWtoFDUz7AU
nkYvKsegsSp5k6au72XpetU7rUHH3qb6AqJSbXUG5gbh7a6p91F4cbWsQnmF59nUz0FAuci/
4dNwuHiNLuDbwpQG9XRxclcTUtuNWZpTlnCMxjZD7ercN/a574rHIVqzBBZPo5xiYmLHlTZF
E+m6QtFB9pzoUKgp5oDg29IacygnBtV2zm3XmbjCD/fmfHJ5fOicOMdHal3MD3IogTXhiBy+
Vl/vieFrW7otJfg3h9uZEU9EDbdFkaFspG8UnuN/xDHBswz6Hs9RohR54lkyVZoN+MNoAmLy
YJ4GokvAUbdmuTA6emuvWBnFTcFOUR5ZzsCPAi+ZpsNqzyC+DzL+RNNb8070nAm7nJNM46U8
ToigbixYcRZRn+bBw3pqmHn/FmRMrSO4Dpy+BSxU8V4Tg42dcPVzKbNlSxZPOY6Dar6Eodn6
kNRu3Vp30sYPgVZZHanWusK2iFe8XQlXa3q3VoEpuMiY2E6wkgc591kazP5VsbrPb05PNdjB
bE2kVoSXwDGCibziuZzIYjR8V6gp9wkJKvLkfawj5ry/3CM2tFjrrbXw1tjmrRxo8lV7ySfu
ki+YSzZ4Sz63lnzfWbLZWPJlX0lVc7LqKvm2qWTZ03Xw2xwlWwwl5e5UfpHEZj37xOB2jTix
F0XMV2fmiT1c0+Gs+NDpIgvwOHN/HTjXFz/fz9zfx1cXSPw/VIIrJwgIAAA=
--010305010708060807000808--
Question on TCP segment sizes
Hi -net,

While pondering sending large files across a jumbo frame network using FreeBSD, I decided to first see how well the loopback interface does. Using ttcp on the same machine:

  ttcp -s -r
  ttcp -s -t 127.0.0.1

I noticed that although the MSS is 16k, I never see a full 16k segment. In fact, each 16k of data is sent as one 2k packet and one 14k packet, as the following tcpdump output shows:

  .706018 127.0.0.1.1026 > 127.0.0.1.5001: S 3254883299:3254883299(0) win 57344 <mss 16344,nop,wscale 0,nop,nop,timestamp 112512 0> (DF)
  .706108 127.0.0.1.5001 > 127.0.0.1.1026: S 2952251081:2952251081(0) ack 3254883300 win 57344 <mss 16344,nop,wscale 0,nop,nop,timestamp 112512 112512> (DF)
  .706140 127.0.0.1.1026 > 127.0.0.1.5001: . ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
  .707454 127.0.0.1.1026 > 127.0.0.1.5001: P 1:8193(8192) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
  .708308 127.0.0.1.1026 > 127.0.0.1.5001: P 8193:22529(14336) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
  .708346 127.0.0.1.5001 > 127.0.0.1.1026: . ack 22529 win 43008 <nop,nop,timestamp 112512 112512> (DF)
  .708375 127.0.0.1.1026 > 127.0.0.1.5001: P 22529:24577(2048) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
  .708508 127.0.0.1.1026 > 127.0.0.1.5001: P 24577:38913(14336) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
  .708530 127.0.0.1.5001 > 127.0.0.1.1026: . ack 38913 win 43008 <nop,nop,timestamp 112512 112512> (DF)
  .708549 127.0.0.1.1026 > 127.0.0.1.5001: P 38913:40961(2048) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
  .708617 127.0.0.1.1026 > 127.0.0.1.5001: P 40961:55297(14336) ack 1 win 57344 <nop,nop,timestamp 112512 112512> (DF)
  .708638 127.0.0.1.5001 > 127.0.0.1.1026: . ack 55297 win 43008 <nop,nop,timestamp 112512 112512> (DF)
  [...repeats...]

The same phenomenon happens with FTP, so I don't think there's any voodoo going on with ttcp. Interestingly enough, raising the MTU on lo0 above 16k changes nothing.

Also, IPv6 (FTP) shows similar behavior:

  .449842 ::1.1036 > ::1.49153: P 92161:93185(1024) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
  .449992 ::1.1036 > ::1.49153: P 93185:107521(14336) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
  .450018 ::1.49153 > ::1.1036: . ack 107521 win 43008 <nop,nop,timestamp 292774 292774> [flowlabel 0x645a7]
  .450099 ::1.1036 > ::1.49153: P 107521:108545(1024) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
  .450250 ::1.1036 > ::1.49153: P 108545:122881(14336) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
  .450275 ::1.49153 > ::1.1036: . ack 122881 win 43008 <nop,nop,timestamp 292774 292774> [flowlabel 0x645a7]
  .450354 ::1.1036 > ::1.49153: P 122881:123905(1024) ack 1 win 57344 <nop,nop,timestamp 292774 292774>
  .450504 ::1.1036 > ::1.49153: P 123905:138241(14336) ack 1 win 57344 <nop,nop,timestamp 292774 292774>

So the question is, why don't we get two full-sized 16344-byte (or at least two 14336-byte) segments? It seems it would be more efficient that way, no?

4.9-PRERELEASE from yesterday, all sysctls are system defaults...

-Paul.
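One way to read the numbers in the IPv4 trace, offered as an assumption rather than a confirmed diagnosis: 14336 is 7 x 2048, which is what the advertised MSS of 16344 becomes if it is rounded down to a multiple of a 2048-byte buffer size, and 2048 + 14336 is exactly 16384. The 2048-byte unit and the helper name below are illustrative assumptions; nothing in the trace itself says where the rounding would happen.

```c
#include <assert.h>

/*
 * Round value down to a multiple of unit, where unit is a power of two.
 * Hypothetical helper used only to check the arithmetic in the trace.
 */
static unsigned long
round_down_pow2(unsigned long value, unsigned long unit)
{
	assert(unit != 0 && (unit & (unit - 1)) == 0);
	return (value & ~(unit - 1));
}
```

With these assumptions, round_down_pow2(16344, 2048) gives the 14336-byte segment size seen in both the IPv4 and IPv6 traces.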
Re: delayed ACK
On Mon, 14 Oct 2002, Steve Francis wrote:
> Kirill Ponomarew wrote:
>> is it recommended to use net.inet.tcp.delayed_ack=0 on the machines with heavy network traffic ?
>
> If you want to increase your network traffic for no particular reason, and increase load on your server, then yes. Otherwise no.

Not true. Although some bugs have been fixed in 4.3, FreeBSD's delayed ACKs will still degrade your performance dramatically in some cases.

For now, the best advice I could give is to benchmark your client machine with and without delayed ACKs and see which works best for your environment.

-Paul.

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-net in the body of the message
Re: delayed ACK
On Tue, 15 Oct 2002, Lars Eggert wrote:
> Paul Herman wrote:
>> Not true. Although some bugs have been fixed in 4.3, FreeBSD's delayed ACKs will still degrade your performance dramatically in some cases.
>
> I'm sorry, but such statements without a packet trace that exhibits the problem are just not useful.

/me reels line back in

Aha! Another victim who is willing to take a look at this! :-)

It's an issue that was left unresolved in kern/24645. Bruce Evans brought this to my attention back during the unrelated "I have delayed ACK problems" thread on -net in January of 2001 and I then passed it on to jlemon.

If you need a packet trace, let me know, but you should be able to reproduce it yourself. Even today on my 4.7-PRERELEASE I still get:

  mammoth# sysctl net.inet.tcp.delayed_ack=0
  net.inet.tcp.delayed_ack: 1 -> 0
  mammoth# time tar cf 127.0.0.1:/tmp/foo /kernel
  0.000u 0.041s 0:00.33 12.1% 350+300k 0+0io 0pf+0w
  mammoth# sysctl net.inet.tcp.delayed_ack=1
  net.inet.tcp.delayed_ack: 0 -> 1
  mammoth# time tar cf 127.0.0.1:/tmp/foo /kernel
  0.014u 0.033s 0:45.90 0.0% 700+600k 0+0io 0pf+0w
                  ^^^

It seems that lowering lo0 mtu to 1500 makes this particular problem go away. The magic mtu size is 2100. This makes me think that this is a big problem across GigE using 8K jumbo frames, but I'm not sure.

Also, tarring over the IPv6 lo0 interface seems to work OK. No idea what causes this.

-Paul.
Re: arp_rtrequest: bad gateway value
On Fri, 23 Nov 2001, Paul Herman wrote:
> On Thu, 22 Nov 2001, Ruslan Ermilov wrote:
>> On Wed, Nov 21, 2001 at 05:32:27PM -0800, Paul Herman wrote:
>>> Hi, I'd like to pick some brains before I file a PR.
>>
>> There's already a PR open on this, kern/29170.
>
> [...]

Here's a patch against 4.4-RELEASE that fixes this problem. As mentioned before, the problem happens when a gateway with the RTF_LLINFO set gets polluted with non-link information. routed and route are both culprits.

BTW, KAME does this as well by putting AF_INET6 data into gateways with the RTF_LLINFO flag set, which I don't think is a good idea, but it calls rt_setgate() directly and isn't affected by this patch.

I've decided to have the kernel leave the gateway untouched and continue, rather than having the kernel return EINVAL. This produces the least astonishment :-)

Please review and if it's OK, I'll send it to gnats for the audit trail.

Thanks,
-Paul.

Index: sys/net/rtsock.c
===
RCS file: /mnt/ncvs/src/sys/net/rtsock.c,v
retrieving revision 1.44.2.4
diff -u -r1.44.2.4 rtsock.c
--- sys/net/rtsock.c	2001/07/11 09:37:37	1.44.2.4
+++ sys/net/rtsock.c	2001/11/27 01:33:03
@@ -399,6 +399,14 @@
 			break;
 		case RTM_CHANGE:
+			/* Don't let the user specify non-link information
+			 * for a gateway if the RTF_LLINFO flag is set.
+			 * We'll just leave the gateway alone.
+			 */
+			if (gate && (rt->rt_flags & RTF_LLINFO) &&
+			    gate->sa_family != AF_LINK)
+				gate = rt->rt_gateway;
+
 			if (gate && (error = rt_setgate(rt, rt_key(rt), gate)))
 				senderr(error);
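The decision the patch makes for RTM_CHANGE can be modelled outside the kernel. The sketch below is a userland illustration only: struct model_sockaddr, the MODEL_* constants and choose_gateway() are hypothetical stand-ins for struct sockaddr, AF_INET/AF_LINK, RTF_LLINFO and the inline check in the patch, so that the rule can be compiled and exercised anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel definitions used by the patch. */
enum { MODEL_AF_INET = 2, MODEL_AF_LINK = 18 };
#define MODEL_RTF_LLINFO 0x400

struct model_sockaddr {
	int sa_family;
};

/*
 * Return the gateway an RTM_CHANGE request should end up with.
 * If the route carries link-layer info and the requested gateway is not
 * a link-level address, keep the route's current gateway instead of
 * failing the request. Otherwise pass the request through unchanged
 * (including NULL, meaning "no gateway supplied").
 */
static const struct model_sockaddr *
choose_gateway(int rt_flags, const struct model_sockaddr *current,
    const struct model_sockaddr *requested)
{
	/* In this model a route being changed always has a gateway. */
	assert(current != NULL);

	if (requested != NULL && (rt_flags & MODEL_RTF_LLINFO) &&
	    requested->sa_family != MODEL_AF_LINK)
		return (current);
	return (requested);
}
```

Returning the current gateway rather than an error mirrors the stated design choice in the message: the request continues and the polluting data is simply ignored.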
Re: t/tcp behaviour has changed in -current and -stable
On Thu, 10 May 2001, jayanth wrote:
> I would like to back out the DELAY_ACK macro changes for now. The ttcp code behavior has changed because of the addition of the DELAY_ACK macro. The macro is forcing the ttcp code to send an immediate SYN, ACK for an initial ttcp segment which has the SYN/FIN/PSH flag set. Instead the SYN,ACK should be delayed such that next segment should be SYN/FIN/PSH from the server side.

I'm hesitant to comment, because I don't really have a patch :-) but I'll give it a whirl anyway.

Thing is, delack is still broken. It just isn't nearly as broken now as it was before the patch. I'm pretty sure what you are seeing now, worked before only because of the brokenness. The way to go here is to fix this problem. Besides, if you back out the latest delack change, ttcp will be affected by the old delack problems anyway.

I don't have a testbed for ttcp, so I'm flying blind, but how about where you wrote:

>   if (DELAY_ACK(tp) && (tp->t_flags & TF_NEEDSYN))
>
> At this point the DELAY_ACK macro returns false because there is a callout_pending(). Hence the TF_ACKNOW will be set.

If the new ttcp SYN,ACKs *always* get delayed, having something like:

  if ( (DELAY_ACK(tp) || this_is_a_ttcp_connection) &&
       (tp->t_flags & TF_NEEDSYN))

- or -

  if ( tcp_delack_enabled &&
       (!callout_pending(...) || this_is_a_ttcp_connection) &&
       (tp->t_flags & TF_NEEDSYN))

...depending on what The Right Thing is when tcp_delack_enabled = 0.

I don't know ttcp, so of course this_is_a_ttcp_connection should be replaced with the corresponding boolean.

-Paul.
Re: I have delayed ACK problems
On Sat, 24 Feb 2001, Jonathan Lemon wrote:
> On Sat, Feb 24, 2001 at 11:19:02AM -0800, Mark Peek wrote:
>> Was there ever a final resolution to this problem?
>
> The patches are still sitting in my tree, as I've been unable to come up with a test case that actually makes a difference. The "tar cf host:..." example is bogus, as the problem here is

Jonathan is right, the patch doesn't solve the general "tar cf host:" problem, but it was similar enough to what we were seeing in production -- changing the MTU on lo0 to 1500 will make the "tar cf host:" problem/solution more apparent, when host == localhost.

In any case, we are very happy with the patch on our production servers, as it really did solve our problem. I believe the patch is 100% correct; it just doesn't fix 100% of the delayed ACK problems.

-Paul.