Re: Rename of MSIZE kernel option..

2002-10-15 Thread Giorgos Keramidas

On 2002-10-15 00:12, Nicolas Christin [EMAIL PROTECTED] wrote:
 On Mon, 14 Oct 2002, Andrew Gallatin wrote:
Would people be open to renaming the 'MSIZE' kernel option to something
more specific such as 'MBUF_SIZE' or 'MBUFSIZE'?  Using 'MSIZE' can
 
  No.
 
  MSIZE is a traditional BSDism.  Everybody else still uses it.
  Even AIX and MacOS.  I really don't like the idea of changing this.

 True, but John is right, it's too generic a name. The argument "it's
 been forever so we can't change it" seems a bit fallacious to me:

True.  But that sort of reasoning might lead us one day to rename
macros and functions like m_get() to mbuf_get() or similar.  That
doesn't seem like a good idea :-/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-net in the body of the message



Re: How to add bpf support to if_atmsubr.c?

2002-10-15 Thread Harti Brandt

On Tue, 15 Oct 2002, Bruce M Simpson wrote:

BMS>On Mon, Oct 14, 2002 at 11:13:05PM -0700, Guy Harris wrote:
BMS>> The current CVS versions of libpcap and tcpdump, and the current
BMS>> released version of Ethereal, support a DLT_SUNATM DLT_ type.  SunATM's
BMS>> DLPI interface supplies packets with a 4-byte pseudo-header, consisting of:
BMS>[snip]
BMS>
BMS>Just FYI...
BMS>
BMS>This sounds very similar to the promiscuous cell receive option on ENI's
BMS>SpeedStream 5861 router. I found the raw hex cell output was essentially
BMS>a 4 byte ATM UNI header omitting the CRC byte, and the 48 bytes of the raw
BMS>AAL5 cell payload.

The Marconi HE cards have the same format, although they have no promiscuous
mode (it would be easy to configure all unused connections to
receive to a free receive group; the question is whether you want this
(35/packets per second for OC3)). My driver allows you to receive
cells (i.e. AAL0) on any of the supported connections.

BMS>Is there any open source support for the SunATM PCI cards? I see a few of
BMS>them cropping up on eBay from time to time. It might be worth finding out
BMS>which ASICs they use, I doubt Sun would engineer their own.

Does Sun still make ATM cards? As far as I remember I saw the last SBUS
cards a couple of years ago.

harti
-- 
harti brandt, http://www.fokus.gmd.de/research/cc/cats/employees/hartmut.brandt/private
  [EMAIL PROTECTED], [EMAIL PROTECTED]





which L2TP server ?

2002-10-15 Thread Alessandro de Manzano

Hello!

I'm looking for a good L2TP server for FreeBSD; does anyone know of one?

If I'm right MPD does not (yet?) support L2TP.


Thanks in advance!


-- 

bye!

Ale





RFR: ping(8) patches: do not fragment, TOS, maximum payload

2002-10-15 Thread Maxim Konovalov


Hello,

I have made a patch set for ping(8). I'll appreciate your comments.

I did not include patches #3 and #4, they are stylistic mostly (based
on BDE's style patch).

A cumulative patch is there:

http://people.freebsd.org/~maxim/p.cumulative

#1, Print strict source routing option. Requested by David Wang
[EMAIL PROTECTED].

Index: ping.c
===
RCS file: /home/maxim/cvs/ping/ping.c,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -r1.1 -r1.2
--- ping.c  15 Oct 2002 11:56:58 -  1.1
+++ ping.c  15 Oct 2002 11:57:53 -  1.2
@@ -953,7 +953,9 @@
hlen = 0;
break;
case IPOPT_LSRR:
-   (void)printf("\nLSRR: ");
+   case IPOPT_SSRR:
+   (void)printf(*cp == IPOPT_LSRR ?
+   "\nLSRR: " : "\nSSRR: ");
j = cp[IPOPT_OLEN] - IPOPT_MINOFF + 1;
hlen -= 2;
cp += 2;

%%%
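
Read on its own, the fall-through in patch #1 is a small lookup from option type to label. The sketch below is not part of the patch: route_opt_label is a hypothetical helper name, and the constants are local copies of the RFC 791 option numbers (131 for loose, 137 for strict source routing) rather than the IPOPT_* macros the patch uses.

```c
#include <stddef.h>

/* RFC 791 option type values; local names, hypothetical for this sketch. */
enum { OPT_LSRR = 131, OPT_SSRR = 137 };

/*
 * Return the label to print for a source-route option, or NULL if the
 * option type is not a source-route option.
 */
const char *
route_opt_label(unsigned char type)
{
	switch (type) {
	case OPT_LSRR:
		return "LSRR";
	case OPT_SSRR:
		return "SSRR";
	default:
		return NULL;
	}
}
```

Both option types share the same address-list layout, which is why the patch can let the two cases share one body.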

#2, Implement -D (do not fragment) and -z (TOS) options. Obtained from
OpenBSD, bin/35843.

Index: ping.c
===
RCS file: /home/maxim/cvs/ping/ping.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- ping.c  15 Oct 2002 11:57:53 -  1.2
+++ ping.c  15 Oct 2002 12:04:10 -  1.3
@@ -67,6 +67,7 @@
  */

 #include <sys/param.h> /* NB: we rely on this for <sys/types.h> */
+#include <sys/sysctl.h>

 #include <ctype.h>
 #include <err.h>
@@ -107,6 +108,7 @@
 #define MAXPAYLOAD  (IP_MAXPACKET - MAXIPLEN - MINICMPLEN)
 #define MAXWAIT 10  /* max seconds to wait for response */
 #define MAXALARM (60 * 60)   /* max seconds for alarm timeout */
+#define MAXTOS  255

 #define A(bit)  rcvd_tbl[(bit)>>3]  /* identify byte in array */
 #define B(bit)  (1 << ((bit) & 0x07))   /* identify bit in byte */
@@ -138,6 +140,7 @@
 #define F_TTL   0x8000
 #define F_MISSED 0x10000
 #define F_ONCE  0x20000
+#define F_HDRINCL   0x40000

 /*
  * MAX_DUP_CHK is the number of bits in received table, i.e. the maximum
@@ -151,7 +154,7 @@
 struct sockaddr_in whereto;/* who to ping */
 int datalen = DEFDATALEN;
 int s; /* socket file descriptor */
-u_char outpack[MINICMPLEN + MAXPAYLOAD];
+u_char outpackhdr[IP_MAXPACKET], *outpack;
 char BSPACE = '\b';/* characters written for flood */
 char BBELL = '\a'; /* characters written for MISSED and AUDIBLE */
 char DOT = '.';
@@ -201,6 +204,7 @@
 {
struct in_addr ifaddr;
struct iovec iov;
+   struct ip *ip;
struct msghdr msg;
struct sigaction si_sa;
struct sockaddr_in from, sin;
@@ -209,13 +213,15 @@
struct hostent *hp;
struct sockaddr_in *to;
double t;
+   size_t sz;
u_char *datap, packet[IP_MAXPACKET];
char *ep, *source, *target;
 #ifdef IPSEC_POLICY_IPSEC
char *policy_in, *policy_out;
 #endif
u_long alarmtimeout, ultmp;
-   int ch, hold, i, packlen, preload, sockerrno, almost_done = 0, ttl;
+   int ch, df, hold, i, mib[4], packlen, preload, sockerrno,
+   almost_done = 0, tos, ttl;
char ctrl[CMSG_SPACE(sizeof(struct timeval))];
char hnamebuf[MAXHOSTNAMELEN], snamebuf[MAXHOSTNAMELEN];
 #ifdef IP_OPTIONS
@@ -239,11 +245,12 @@
setuid(getuid());
uid = getuid();

-   alarmtimeout = preload = 0;
+   alarmtimeout = df = preload = tos = 0;

+   outpack = outpackhdr + sizeof(struct ip);
datap = &outpack[MINICMPLEN + PHDR_LEN];
while ((ch = getopt(argc, argv,
-   "AI:LQRS:T:c:adfi:l:m:nop:qrs:t:v"
+   "ADI:LQRS:T:c:adfi:l:m:nop:qrs:t:vz:"
 #ifdef IPSEC
 #ifdef IPSEC_POLICY_IPSEC
"P:"
@@ -266,6 +273,10 @@
optarg);
npackets = ultmp;
break;
+   case 'D':
+   options |= F_HDRINCL;
+   df = 1;
+   break;
case 'd':
options |= F_SO_DEBUG;
break;
@@ -390,6 +401,13 @@
else
errx(1, "invalid security policy");
break;
+   case 'z':
+   options |= F_HDRINCL;
+   ultmp = strtoul(optarg, &ep, 0);
+   if (*ep || ep == optarg || ultmp > MAXTOS)
+   errx(EX_USAGE, "invalid TOS: `%s'", optarg);
+   tos = ultmp;
+   break;
 #endif /*IPSEC_POLICY_IPSEC*/
 #endif /*IPSEC*/
default:
@@ -509,6 +527,28 @@
 #endif 
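
The -z validation in patch #2 can be tried outside ping as a standalone function. This is a minimal sketch under stated assumptions: parse_tos is a hypothetical name, the function returns an error code where the patch calls errx(), and the errno check is an addition that the patch does not have.

```c
#include <errno.h>
#include <stdlib.h>

#define MAXTOS 255	/* same limit the patch defines */

/*
 * Parse a TOS argument in any base strtoul() understands.  Reject
 * empty input, trailing junk and values above MAXTOS.  Returns 0 and
 * stores the value on success, -1 on failure.
 */
int
parse_tos(const char *arg, int *tos)
{
	char *ep;
	unsigned long ultmp;

	errno = 0;
	ultmp = strtoul(arg, &ep, 0);
	if (errno != 0 || ep == arg || *ep != '\0' || ultmp > MAXTOS)
		return -1;
	*tos = (int)ultmp;
	return 0;
}
```

Because the base is 0, an argument such as 0x10 is accepted as hexadecimal, which is convenient for TOS values that are usually quoted that way.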

Re: which L2TP server ?

2002-10-15 Thread Michael Sierchio

Alessandro de Manzano wrote:
 Hello!
 
 I'm looking for a good L2TP server for FreeBSD, someone knows it ?
 
 If I'm right MPD does not (yet?) support L2TP.
 
 
 Thanks in advance!
 
 

man ng_l2tp

DESCRIPTION
  The ng_l2tp node type implements the encapsulation layer of the L2TP pro-
  tocol as described in RFC 2661.  This includes adding the L2TP packet
  header for outgoing packets and verifying and removing it for incoming
  packets.  The node maintains the L2TP sequence number state and handles
  control session packet acknowledgment and retransmission.





Re: which L2TP server ?

2002-10-15 Thread Alessandro de Manzano

On Tue, Oct 15, 2002 at 07:10:29AM -0700, Michael Sierchio wrote:

 man ng_l2tp
 
 DESCRIPTION
   The ng_l2tp node type implements the encapsulation layer of the L2TP pro-
   tocol as described in RFC 2661.  This includes adding the L2TP packet

Thanks, but I'm looking for something at a higher level that is also
easier to set up.

Like MPD (which actually uses ng_ppp and others), for example.

tnx!

-- 

bye!

Ale





Re: which L2TP server ?

2002-10-15 Thread Tilman Linneweh

In arved.freebsd.net, you wrote:
 On Tue, Oct 15, 2002 at 07:10:29AM -0700, Michael Sierchio wrote:
 
 man ng_l2tp
 
 DESCRIPTION
   The ng_l2tp node type implements the encapsulation layer of the L2TP pro-
   tocol as described in RFC 2661.  This includes adding the L2TP packet
 
 thanks, but I'm looking for something at higher level, also easier to
 setup.
 
 As MPD (actually it use ng_ppp and others), for example.

I once compiled the Linux one from www.l2tpd.org (port at
http://stud3.tuwien.ac.at/~e0025974/bsdsrc/l2tpd.shar), but never
tested whether it really worked on FreeBSD.

regards
arved





Re: How to add bpf support to if_atmsubr.c?

2002-10-15 Thread Guy Harris

On Tue, Oct 15, 2002 at 11:54:52AM +0100, Bruce M Simpson wrote:
 This sounds very similar to the promiscuous cell receive option on ENI's
 SpeedStream 5861 router. I found the raw hex cell output was essentially
 a 4 byte ATM UNI header omitting the CRC byte, and the 48 bytes of the raw
 AAL5 cell payload.

Similar, but not the same; I doubt there's any hardware significance to
the VPI/VCI part of the header, and the type field is probably put there
by the driver.  (Also, the DLPI interface supplies reassembled AAL5
PDUs, not raw cells; I don't know what it does for other AALs, except
for the signalling AAL where it again supplies reassembled packets.)





Re: How to add bpf support to if_atmsubr.c?

2002-10-15 Thread Guy Harris

On Tue, Oct 15, 2002 at 01:01:05PM +0200, Harti Brandt wrote:
 Does Sun still make ATM cards? As far as I remember I saw the last SBUS
 cards a couple of years ago.

They still have a Web page for SunATM:


http://www.sun.com/products-n-solutions/hw/networking/connectivity/sunatm/index.html

and say that they've introduced a 4.0 version of SunATM (which runs in
64-bit mode on Solaris 7) and also list PCI adapters in addition to the
SBus adapters.




Re: which L2TP server ?

2002-10-15 Thread Bill Baumann


A year & a half ago, the l2tpd interface and code were still in their
infancy.  If all you seek is to create tunnels/sessions, and you don't care
about security or other more complex l2tp issues, it should work ok.

I developed my own L2TP stack for Linux with a much higher level of
functionality.  It would take some porting effort.  Only a small effort
was made on usability, so that could be an issue too.

http://sourceforge.net/projects/l2tp/

I would also suggest going to http://sourceforge.net and searching for l2tp.
There are a few other projects out there besides these two.

Regards,
Bill Baumann


On Tue, 15 Oct 2002, Tilman Linneweh wrote:

 In arved.freebsd.net, you wrote:
  On Tue, Oct 15, 2002 at 07:10:29AM -0700, Michael Sierchio wrote:
  
  man ng_l2tp
  
  DESCRIPTION
The ng_l2tp node type implements the encapsulation layer of the L2TP pro-
tocol as described in RFC 2661.  This includes adding the L2TP packet
  
  thanks, but I'm looking for something at higher level, also easier to
  setup.
  
  As MPD (actually it use ng_ppp and others), for example.
 
 I once compiled the Linux one from www.l2tpd.org (port at
 http://stud3.tuwien.ac.at/~e0025974/bsdsrc/l2tpd.shar), but never
 tested, if it really worked on FreeBSD.
 
 regards
 arved
 





Re: delayed ACK

2002-10-15 Thread Paul Herman

On Mon, 14 Oct 2002, Steve Francis wrote:

 Kirill Ponomarew wrote:
 
  is it recommended to use net.inet.tcp.delayed_ack=0 on the machines with
  heavy network traffic ?
 
 If you want to increase your network traffic for no particular reason,
 and increase load on your server, then yes.

 Otherwise no.

Not true.  Although some bugs have been fixed in 4.3, FreeBSD's
delayed ACKs will still degrade your performance dramatically in
some cases.

For now, the best advice I could give is to benchmark your client
machine with and without delayed ACKs and see which works best for
your environment.

-Paul.





Re: which L2TP server ?

2002-10-15 Thread Vincent Jardin

There is a new L2TP project from Roaring Penguin. It supports both LAC and 
LNS features:
http://sourceforge.net/projects/rp-l2tp

It requires pppd. It has been written for Linux, however it should support 
FreeBSD easily.

Vincent

Le Mardi 15 Octobre 2002 14:15, Alessandro de Manzano a écrit :
 Hello!

 I'm looking for a good L2TP server for FreeBSD, someone knows it ?

 If I'm right MPD does not (yet?) support L2TP.


 Thanks in advance!




dynamic load of em/fxp/bge

2002-10-15 Thread Don Bowman


I am trying to load the if_em, if_fxp, if_bge drivers
via /boot/loader.conf.

I've added

if_fxp_load="YES"
if_bge_load="YES"
if_em_load="YES"

The problem is that the bge driver doesn't load. It will
if I manually load it after startup with kldload. The issue
seems to be a dependency on miibus: both fxp and bge want
to load it, and bge gets an error that it's already loaded.

I tried putting 'miibus_load="YES"' in loader.conf, but the
same effect is seen.

I've tried from the boot prompt doing an explicit load of these
manually in each order, but to no avail.

As a work-around, I've placed a kldload if_bge in rc.network
before the 'ifconfig -l'.

Any suggestions on why the fxp/bge don't play nice when loaded
automatically, but will work if run manually? Is there a timing
thing that the fxp hasn't initialised its miibus yet?

I have:

fxp0
fxp1
bge0

in this particular machine. The bge will get miibus2 (eventually),
leaving fxp0 to have miibus0, fxp1 to have miibus1 I think.

Suggestions?

--don




Re: RFC: eliminating the _IP_VHL hack.

2002-10-15 Thread Luigi Rizzo

On Wed, Oct 16, 2002 at 12:17:13AM +0200, Poul-Henning Kamp wrote:
...
 I would therefore propose to eliminate the _IP_VHL hack from the kernel

yes, go for it.

cheers
luigi




RFC: eliminating the _IP_VHL hack.

2002-10-15 Thread Garrett Wollman

On Wed, 16 Oct 2002 00:17:13 +0200, Poul-Henning Kamp [EMAIL PROTECTED] said:

 In the meantime absolutely no code has picked up on this idea,

It was copied in spirit from OSF/1.

 The side effect of having some source-files using the _IP_VHL hack and
 some not is that sizeof(struct ip) varies from file to file,

Not so.  Any compiler which allocates different amounts of storage to
one eight-bit member versus two four-bit bitfield members is seriously
broken (and would defeat the whole purpose).

 I would therefore propose to eliminate the _IP_VHL hack from the kernel
 to end this state of (potential) confusion, and invite comments to the
 following patch:

Much better to delete the bogus BYTE_ORDER kluge from ip.h.  (Note
that the definition of the bitfields in question has nothing
whatsoever to do with the actual byte order in use; it simply relies
on the historical behavior of compilers which allocated space for
bitfields in BYTE_ORDER order.)

-GAWollman





ENOBUFS

2002-10-15 Thread Garrett Wollman

On Wed, 16 Oct 2002 00:53:46 +0300, Petri Helenius [EMAIL PROTECTED] said:

 My processes writing to SOCK_DGRAM sockets are getting ENOBUFS 

Probably means that your outgoing interface queue is filling up.
ENOBUFS is the only way the kernel has to tell you ``slow down!''.

-GAWollman
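
One way a sender can act on "slow down" is to pause and retry when send() fails with ENOBUFS. The following is only a sketch under stated assumptions: send_with_backoff, max_tries and the 100 microsecond starting delay are illustrative choices that do not come from this thread, and whether retrying or dropping the datagram is right depends on the application.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/*
 * Send one datagram, sleeping and retrying while the kernel reports
 * ENOBUFS.  The delay doubles after each failed attempt, up to roughly
 * 100 ms.  Returns the send() result; after the last failed attempt it
 * returns -1 with errno set to ENOBUFS.
 */
ssize_t
send_with_backoff(int fd, const void *buf, size_t len, int max_tries)
{
	struct timespec delay = { 0, 100000L };	/* start at 100 us */
	ssize_t n;
	int i;

	assert(max_tries > 0);
	for (i = 0; i < max_tries; i++) {
		n = send(fd, buf, len, 0);
		if (n >= 0 || errno != ENOBUFS)
			return n;	/* success, or an error not handled here */
		if (i + 1 < max_tries) {
			nanosleep(&delay, NULL);
			if (delay.tv_nsec < 100000000L)
				delay.tv_nsec *= 2;
		}
	}
	errno = ENOBUFS;
	return -1;
}
```

Any other error is passed straight back to the caller, so the function only changes behaviour in the queue-full case described above.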





Re: delayed ACK

2002-10-15 Thread Steve Francis

Lars Eggert wrote:

Paul Herman wrote:
  

Not true.  Although some bugs have been fixed in 4.3, FreeBSD's
delayed ACKs will still degrade your performance dramatically in
some cases.



I'm sorry, but such statements without a packet trace that exhibits the 
problem are just not useful.

Lars
  

He's probably referring to poorly behaved Windows clients, on certain
applications, if you leave net.inet.tcp.slowstart_flightsize at its default.

Incidentally, why isn't the default for
net.inet.tcp.slowstart_flightsize higher?
RFC 2414 seems to indicate it should be higher. Solaris 8 and
later default to 4 for this value.
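
For reference, RFC 2414 gives the upper bound on the initial window in bytes as min(4*MSS, max(2*MSS, 4380)). The sketch below just evaluates that expression; rfc2414_initial_window is a hypothetical helper name, and nothing here describes how FreeBSD or Solaris actually apply the setting.

```c
/*
 * Upper bound on the initial congestion window, in bytes, from
 * RFC 2414 equation (1): min(4*MSS, max(2*MSS, 4380)).
 */
unsigned long
rfc2414_initial_window(unsigned long mss)
{
	unsigned long lower = 2 * mss > 4380 ? 2 * mss : 4380;
	unsigned long upper = 4 * mss;

	return upper < lower ? upper : lower;
}
```

For an Ethernet-sized MSS of 1460 bytes this yields 4380 bytes, i.e. three segments; for a 536-byte MSS it yields 2144 bytes, i.e. four segments.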






Re: delayed ACK

2002-10-15 Thread Lars Eggert

Steve Francis wrote:

 He's probably referring to poorly behaved windows clients, on certain 
 applications, if you leave net.inet.tcp.slowstart_flightsize at default.

Ah. Well, that's a Windows problem :-)

 Incidentally, why are not the defaults on 
 net.inet.tcp.slowstart_flightsize higher?
 RFC2414 seems to indicate it should be higher. Solaris in version 8 and 
 later default to 4 for this value.

I've been running with 4 for years w/o problems. so i'm all for the change.

Lars
-- 
Lars Eggert [EMAIL PROTECTED]   USC Information Sciences Institute





Re: ENOBUFS

2002-10-15 Thread Petri Helenius


 What rate are you sending these packets at? A standard interface queue
 length is 50 packets, you get ENOBUFS when it's full.

This might explain the phenomenon. (Packets are going out bursty, with the average
hovering at ~500Mbps-ish.) I recompiled the kernel with an IFQ_MAXLEN of 5000
but there seems to be no change in the behaviour. How do I make sure that the
em interface is running 66/64, and is there a way to see the interface queue depth?
em0: Intel(R) PRO/1000 Network Connection, Version - 1.3.14 port 0x3040-0x307f
mem 0xfc22-0xfc23 irq 17 at device 3.0 on pci2
em0:  Speed:1000 Mbps  Duplex:Full
pcib2: PCI to PCI bridge (vendor=8086 device=1460) at device 29.0 on pci1
IOAPIC #2 intpin 0 -> irq 16
IOAPIC #2 intpin 6 -> irq 17
IOAPIC #2 intpin 7 -> irq 18
pci2: PCI bus on pcib2

The OS is 4.7-RELEASE.

Pete






Re: ENOBUFS

2002-10-15 Thread Petri Helenius

 
 Probably means that your outgoing interface queue is filling up.
 ENOBUFS is the only way the kernel has to tell you ``slow down!''.
 
How much should I be able to send to two em interfaces on one
66/64 PCI ?

Pete






Re: ENOBUFS

2002-10-15 Thread Luigi Rizzo

On Wed, Oct 16, 2002 at 02:04:11AM +0300, Petri Helenius wrote:
 
  What rate are you sending these packets at? A standard interface queue
  length is 50 packets, you get ENOBUFS when it's full.
 
 This might explain the phenomenan. (packets are going out bursty, with average
 hovering at ~500Mbps:ish) I recomplied kernel with IFQ_MAXLEN of 5000
 but there seems to be no change in the behaviour. How do I make sure that

how large are the packets and how fast is the box?
on a fast box you should be able to generate packets faster than wire
speed for sizes around 500 bytes, meaning that you are going to saturate
the queue no matter how large it is.

cheers
luigi
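
To put numbers on the queue argument, the time a full interface queue takes to drain onto the wire is just queue length times packet size over link speed. This is a back-of-the-envelope sketch: queue_drain_seconds is a hypothetical helper, and it ignores framing overhead below the packet size given.

```c
/*
 * Time in seconds for a full interface queue to drain onto the wire,
 * counting only the packet bytes given (no preamble, inter-frame gap
 * or link-layer headers).
 */
double
queue_drain_seconds(unsigned int queue_len, unsigned int pkt_bytes,
    double link_bps)
{
	return (double)queue_len * pkt_bytes * 8.0 / link_bps;
}
```

With 500-byte packets at 1 Gbit/s, a 50-packet queue drains in about 200 microseconds and a 5000-packet queue in about 20 milliseconds, so a sender that outpaces the wire fills either one almost immediately.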

 em-interface is running 66/64 and is there a way to see interface queue depth?
 em0: Intel(R) PRO/1000 Network Connection, Version - 1.3.14 port 0x3040-0x307f
 mem 0xfc22-0xfc23 irq 17 at device 3.0 on pci2
 em0:  Speed:1000 Mbps  Duplex:Full
 pcib2: PCI to PCI bridge (vendor=8086 device=1460) at device 29.0 on pci1
 IOAPIC #2 intpin 0 - irq 16
 IOAPIC #2 intpin 6 - irq 17
 IOAPIC #2 intpin 7 - irq 18
 pci2: PCI bus on pcib2
 
 The OS is 4.7-RELEASE.
 
 Pete
 
 
 



Re: ENOBUFS

2002-10-15 Thread Lars Eggert

Petri Helenius wrote:
Probably means that your outgoing interface queue is filling up.
ENOBUFS is the only way the kernel has to tell you ``slow down!''.

 
 How much should I be able to send to two em interfaces on one
 66/64 PCI ?

I've seen netperf UDP throughputs of ~950Mbps with a fiber em card and
4K datagrams on a 2.4GHz P4.

Lars
-- 
Lars Eggert [EMAIL PROTECTED]   USC Information Sciences Institute





Re: RFC: eliminating the _IP_VHL hack.

2002-10-15 Thread Nate Lawson

On Wed, 16 Oct 2002, Poul-Henning Kamp wrote:
 almost 7 years ago, this commit introduced the _IP_VHL hack in our
 IP-stack:
 
 ] revision 1.7
 ] date: 1995/12/21 21:20:27;  author: wollman;  state: Exp;  lines: +5 -1
 ] If _IP_VHL is defined, declare a single ip_vhl member in struct ip rather
 ] than separate ip_v and ip_hl members.  Should have no effect on current code,
 ] but I'd eventually like to get rid of those obnoxious bitfields completely.
 
 We can argue a lot about how long time we should wait for eventually,
 but I would say that 7 years is far too long, considering the status:

Fine by me.

 RCS file: /home/ncvs/src/sys/netinet/ip_icmp.c,v
 retrieving revision 1.70
 diff -u -r1.70 ip_icmp.c
 --- ip_icmp.c 1 Aug 2002 03:53:04 -   1.70
 +++ ip_icmp.c 15 Oct 2002 22:05:23 -
 @@ -51,7 +51,6 @@
  #include <net/if_types.h>
  #include <net/route.h>
  
 -#define _IP_VHL
  #include <netinet/in.h>
  #include <netinet/in_systm.h>
  #include <netinet/in_var.h>
 @@ -128,7 +127,7 @@
   struct ifnet *destifp;
  {
   register struct ip *oip = mtod(n, struct ip *), *nip;
 - register unsigned oiplen = IP_VHL_HL(oip->ip_vhl) << 2;
 + register unsigned oiplen = oip->ip_hl << 2;
   register struct icmp *icp;
   register struct mbuf *m;
   unsigned icmplen;
 @@ -214,7 +213,8 @@
   nip = mtod(m, struct ip *);
   bcopy((caddr_t)oip, (caddr_t)nip, sizeof(struct ip));
   nip->ip_len = m->m_len;
 - nip->ip_vhl = IP_VHL_BORING;
 + nip->ip_v = IPVERSION;
 + nip->ip_hl = 5;

I think there is a manifest constant for the default ipv4 header size but
can't remember it right now.  

-Nate





Re: delayed ACK

2002-10-15 Thread Paul Herman

On Tue, 15 Oct 2002, Lars Eggert wrote:

 Paul Herman wrote:
 
  Not true.  Although some bugs have been fixed in 4.3, FreeBSD's
  delayed ACKs will still degrade your performance dramatically in
  some cases.

 I'm sorry, but such statements without a packet trace that exhibits the
 problem are just not useful.

/me reels line back in

Aha! Another victim who is willing to take a look at this! :-)

It's an issue that was left unresolved in kern/24645.  Bruce Evans
brought this to my attention back during the unrelated "I have
delayed ACK problems" thread on -net in January of 2001 and I then
passed it on to jlemon.  If you need a packet trace, let me know,
but you should be able to reproduce it yourself.  Even today on my
4.7-PRERELEASE I still get:

  mammoth# sysctl net.inet.tcp.delayed_ack=0
  net.inet.tcp.delayed_ack: 1 -> 0
  mammoth# time tar cf 127.0.0.1:/tmp/foo /kernel
  0.000u 0.041s 0:00.33 12.1% 350+300k 0+0io 0pf+0w

  mammoth# sysctl net.inet.tcp.delayed_ack=1
  net.inet.tcp.delayed_ack: 0 -> 1
  mammoth# time tar cf 127.0.0.1:/tmp/foo /kernel
  0.014u 0.033s 0:45.90 0.0%  700+600k 0+0io 0pf+0w
^^^

It seems that lowering lo0 mtu to 1500 makes this particular
problem go away.  The magic mtu size is 2100.  This makes me think
that this is a big problem across GigE using 8K jumbo frames, but I'm not
sure.  Also, tarring over the IPv6 lo0 interface seems to work OK.

No idea what causes this.

-Paul.





Re: RFC: eliminating the _IP_VHL hack.

2002-10-15 Thread Jeffrey Hsu

   The side effect of having some source-files using the _IP_VHL hack and
   some not is that sizeof(struct ip) varies from file to file, which at
   best is confusing an at worst the source of some really evil bugs.

   I would therefore propose to eliminate the _IP_VHL hack from the kernel
   to end this state of (potential) confusion

This problem could be solved more easily by changing the u_int back
to an u_char, as it used to be before rev 1.15:

Index: ip.h
===
RCS file: /home/ncvs/src/sys/netinet/ip.h,v
retrieving revision 1.19
diff -u -r1.19 ip.h
--- ip.h14 Dec 2001 19:37:32 -  1.19
+++ ip.h16 Oct 2002 01:15:48 -
@@ -51,11 +51,11 @@
u_char  ip_vhl; /* version << 4 | header length >> 2 */
 #else
 #if BYTE_ORDER == LITTLE_ENDIAN
-   u_int   ip_hl:4,/* header length */
+   u_char  ip_hl:4,/* header length */
ip_v:4; /* version */
 #endif
 #if BYTE_ORDER == BIG_ENDIAN
-   u_int   ip_v:4, /* version */
+   u_char  ip_v:4, /* version */
ip_hl:4;/* header length */
 #endif
 #endif /* not _IP_VHL */

But if we were to pick one or the other to discard, I would keep the
IP_VHL, because that field really is a byte in the IP header.




Re: delayed ACK

2002-10-15 Thread Luigi Rizzo

this smells a lot like a bad interaction between default window
size and mtu -- loopback has 16k default, maybe tar uses a
smallish window (32k is default now for net.inet.tcp.sendspace,
but used to be 16k at the time), which means only 1 or 2 packets in
flight at once, meaning that many times you get the 200ms delay
and your throughput goes way down.

cheers
luigi
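
The window/MTU interaction described above comes down to a division. The sketch below uses hypothetical helper names, and the 16344-byte segment size used in the note after it is an assumption (a 16384-byte loopback MTU minus 40 bytes of IP and TCP headers), not a figure from the thread.

```c
/*
 * Number of full-sized segments that fit in the send buffer at once.
 * With a result of 1 the sender can never have two segments
 * outstanding, so each one may wait for the receiver's delayed ACK.
 */
unsigned long
segments_in_flight(unsigned long sndbuf, unsigned long mss)
{
	return mss == 0 ? 0 : sndbuf / mss;
}

/*
 * Rough worst case: every segment of a transfer stalls for one
 * delayed-ACK interval.  Returns seconds.
 */
double
stalled_transfer_seconds(unsigned long total, unsigned long mss,
    double ack_delay)
{
	unsigned long segs;

	if (mss == 0)
		return 0.0;
	segs = (total + mss - 1) / mss;	/* round up to whole segments */
	return segs * ack_delay;
}
```

A 16k send buffer holds one 16344-byte segment but eleven 1460-byte ones, which would fit the report that dropping the lo0 MTU to 1500 makes the problem go away. At 200 ms per stall, about 230 stalls would account for the 45.9 seconds measured.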

On Tue, Oct 15, 2002 at 05:25:42PM -0700, Paul Herman wrote:
...
 Aha! Another victim who is willing to take a look at this! :-)
 
 It's an issue that was left unresolved in kern/24645.  Bruce Evans
 brought this to my attention back during the unrelated I have
 delayed ACK problems thread on -net in January of 2001 and I then
 passed it on to jlemon.  If you need a packet trace, let me know,
 but you should be able to reproduce it yourself.  Even today on my
 4.7-PRERELEASE I still get:
 
   mammoth# sysctl net.inet.tcp.delayed_ack=0
   net.inet.tcp.delayed_ack: 1 - 0
   mammoth# time tar cf 127.0.0.1:/tmp/foo /kernel
   0.000u 0.041s 0:00.33 12.1% 350+300k 0+0io 0pf+0w
 
   mammoth# sysctl net.inet.tcp.delayed_ack=1
   net.inet.tcp.delayed_ack: 0 - 1
   mammoth# time tar cf 127.0.0.1:/tmp/foo /kernel
   0.014u 0.033s 0:45.90 0.0%  700+600k 0+0io 0pf+0w
 ^^^
 
 It seems that lowering lo0 mtu to 1500 makes this particular
 problem go away.  The magic mtu size is 2100.  This makes me think
 that this is a big problem across GigE using 8K jumbo frames, not
 sure.  Also, taring over the IPv6 lo0 interface seems to work OK.
 
 No idea what causes this.
 
 -Paul.
 
 



Re: delayed ACK

2002-10-15 Thread Mike Silbersack


On Tue, 15 Oct 2002, Luigi Rizzo wrote:

 this smells a lot as a bad interaction between default window
 size and mtu -- loopback has 16k default, maybe tar uses a
 smallish window (32k is default now for net.inet.tcp.sendspace,
 but used to be 16k at the time), which means only 1 or 2 packets in
 flight at once, meaning that many times you get the 200ms delay
 and your throughput goes way down.

   cheers
   luigi

NetBSD introduced a fix for this recently, it seems sorta hackish, but
maybe we need to do something similar.

The diff reminds me why FreeBSD has a policy of separating style and
functional commits, fwiw. :)

http://cvsweb.netbsd.org/bsdweb.cgi/syssrc/sys/netinet/tcp_output.c.diff?r1=1.84&r2=1.85

Revision 1.85, Tue Aug 20 16:29:42 2002 UTC (8 weeks ago) by thorpej
Branch: MAIN
CVS Tags: gehenna-devsw-base
Changes since 1.84: +18 -4 lines

Never send more than half a socket buffer of data.  This insures that
we can always keep 2 packets on the wire, no matter what SO_SNDBUF is,
and therefore ACKs will never be delayed unless we run out of data to
transmit.  The problem is quite easy to tickle when the MTU of the
outgoing interface is larger than the socket buffer size (e.g. loopback).

Fix from Charles Hannum.





Re: delayed ACK

2002-10-15 Thread Luigi Rizzo

On Tue, Oct 15, 2002 at 08:52:49PM -0500, Mike Silbersack wrote:
...
 NetBSD introduced a fix for this recently, it seems sorta hackish, but
 maybe we need to do something similar.

this helps you if the other side has delayed acks, but halves the
throughput if you are being window limited and the other side does not
use delayed acks (can you force immediate acks by setting the PUSH flag
in the tcp header ?)

cheers
luigi

 
 
http://cvsweb.netbsd.org/bsdweb.cgi/syssrc/sys/netinet/tcp_output.c.diff?r1=1.84&r2=1.85
 
 Revision 1.85 / (download) - annotate - [select for diffs], Tue Aug 20
 16:29:42 2002 UTC (8 weeks ago) by thorpej
 Branch: MAIN
 CVS Tags: gehenna-devsw-base
 Changes since 1.84: +18 -4 lines
 Diff to previous 1.84 (colored)
 
 Never send more than half a socket buffer of data.  This insures that
 we can always keep 2 packets on the wire, no matter what SO_SNDBUF is,
 and therefore ACKs will never be delayed unless we run out of data to
 transmit.  The problem is quite easy to tickle when the MTU of the
 outgoing interface is larger than the socket buffer size (e.g. loopback).
 
 Fix from Charles Hannum.
 




Re: delayed ACK

2002-10-15 Thread Mike Silbersack


On Tue, 15 Oct 2002, Luigi Rizzo wrote:

 On Tue, Oct 15, 2002 at 08:52:49PM -0500, Mike Silbersack wrote:
 ...
  NetBSD introduced a fix for this recently, it seems sorta hackish, but
  maybe we need to do something similar.

 this helps you if the other side has delayed acks, but halves the
 throughput if you are being window limited and the other side does not
 use delayed acks (can you force immediate acks by setting the PUSH flag
 in the tcp header ?)

   cheers
   luigi

I think the comment is slightly misleading, and that it won't actually
cause any performance problems as you suggest.

From what I recall, immediate acking of PUSH packets varies... Linux
appears to have changed back and forth on whether it does so or not.  I
also seem to recall Windows making a change too.  Either way, we probably
shouldn't rely on that behavior alone.

  Never send more than half a socket buffer of data.  This insures that
  we can always keep 2 packets on the wire, no matter what SO_SNDBUF is,
  and therefore ACKs will never be delayed unless we run out of data to
  transmit.  The problem is quite easy to tickle when the MTU of the
  outgoing interface is larger than the socket buffer size (e.g. loopback).

If I'm reading the implementation correctly, what this means is that if
you have a single packet > .5*socketbuffer, you reduce the maximum
*segment* size, causing two smaller packets to be sent instead of one
large packet.  (Smaller still being 8K in size.)
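
That reading of the change can be written down as a one-line clamp. This is a sketch of the idea as described in this message, not the NetBSD code; clamp_segment is a hypothetical name, and the segment sizes in the note after it assume 40 bytes of IP and TCP headers.

```c
/*
 * Limit the segment size to half the send buffer so that two
 * segments can always be outstanding.
 */
unsigned long
clamp_segment(unsigned long mss, unsigned long sndbuf)
{
	unsigned long half = sndbuf / 2;

	return mss > half ? half : mss;
}
```

With a 16384-byte send buffer, a 16344-byte loopback segment is cut to 8192 bytes (the 8K mentioned above), while a 1460-byte Ethernet segment is untouched. An 8960-byte jumbo segment with a 32768-byte buffer is also left unchanged by this rule.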

While such a change might help with localhost, I have this sneaky
suspicion that it falls apart when applied to jumbo frames and 32K send
buffers.  Someone well motivated should be able to come up with a more
general heuristic.

Mike "Silby" Silbersack





Re: ENOBUFS

2002-10-15 Thread Petri Helenius


 how large are the packets and how fast is the box ?

Packets go out at an average size of 1024 bytes. The box is a dual
P4 Xeon 2400/400 so I think it should qualify as fast? I disabled
hyperthreading to figure out if it was causing problems. I seem to
be able to send packets at a rate of around 900Mbps when just sending
them out with a process. If I do similar sending on two interfaces at
the same time, it tops out at 600Mbps.

The information I'm looking for is how to instrument where the
bottleneck is, to either tune the parameters or report a bug in the PCI or
em code (or just simply swap the GE hardware for something that
works better).

Pete


 on a fast box you should be able to generate packets faster than wire
 speed for sizes around 500bytes, meaning that you are going to saturate
 the queue no matter how large it is.

 cheers
 luigi

  em-interface is running 66/64 and is there a way to see interface queue
depth?
  em0: Intel(R) PRO/1000 Network Connection, Version - 1.3.14 port
0x3040-0x307f
  mem 0xfc22-0xfc23 irq 17 at device 3.0 on pci2
  em0:  Speed:1000 Mbps  Duplex:Full
  pcib2: PCI to PCI bridge (vendor=8086 device=1460) at device 29.0 on pci1
  IOAPIC #2 intpin 0 - irq 16
  IOAPIC #2 intpin 6 - irq 17
  IOAPIC #2 intpin 7 - irq 18
  pci2: PCI bus on pcib2
 
  The OS is 4.7-RELEASE.
 
  Pete
 
 
 



Re: ENOBUFS

2002-10-15 Thread Lars Eggert

Petri Helenius wrote:
how large are the packets and how fast is the box ?
 
 
 Packets go out at an average size of 1024 bytes. The box is dual
 P4 Xeon 2400/400 so I think it should qualify as fast ? I disabled
 hyperthreading to figure out if it was causing problems. I seem to
 be able to send packets at a rate in the 900Mbps when just sending
 them out with a process. If I do similar sending on two interfaces at
 same time, it tops out at 600Mbps.

The 900Mbps are similar to what I see here on similar hardware.

For your two-interface setup, is the 600Mbps the aggregate send rate on
both interfaces, or do you see 600Mbps per interface? In the latter
case, is your CPU maxed out? Only one CPU can be in the kernel under
-stable, so the second one won't help much. With small packets like
that, you may be interrupt-bound. (Until Luigi releases polling for em
interfaces... :-)

Lars
-- 
Lars Eggert [EMAIL PROTECTED]   USC Information Sciences Institute





Re: ENOBUFS

2002-10-15 Thread Petri Helenius

 The 900Mbps are similar to what I see here on similar hardware.

What kind of receive performance do you observe? I haven't got that
far yet.

 For your two-interface setup, are the 600Mbps aggregate send rate on
 both interfaces, or do you see 600Mbps per interface? In the latter

600Mbps per interface. I'm going to try this out also on -CURRENT
to see if it changes anything. Interrupts do not seem to pose a big
problem because I'm seeing only a few thousand em interrupts
a second, but since every packet involves a write call there are 100k
syscalls a second.
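
The syscall figure can be sanity-checked from the send rate and packet size. A small sketch, with packets_per_second as a hypothetical helper that counts only the packet bytes given and ignores lower-layer overhead:

```c
/*
 * Packets per second needed to sustain a given bit rate at a given
 * packet size.
 */
double
packets_per_second(double rate_bps, unsigned int pkt_bytes)
{
	return rate_bps / (pkt_bytes * 8.0);
}
```

At 600 Mbit/s and 1024-byte packets this comes to roughly 73,000 packets per second per interface, so with one write call per packet two interfaces land in the same range as the 100k syscalls a second reported here.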

 case, is your CPU maxed out? Only one can be in the kernel under
 -stable, so the second one won't help much. With small packets like
 that, you may be interrupt-bound. (Until Luigi releases polling for em
 interfaces... :-)

I'll try changing the packet sizes to figure out the optimum.

Pete


