Re: Cas driver fails to load first time after boot.

2013-01-24 Thread Paul Keusemann


On 01/24/13 15:50, Marius Strobl wrote:

On Thu, Jan 24, 2013 at 12:39:44PM -0600, Paul Keusemann wrote:

On 01/24/13 09:09, Marius Strobl wrote:

On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:

Hi,

I've got a Dell R200 which I'm trying to build into a gateway with a Sun
QGE (501-6738-10).  The cas driver fails to load the first time I try to
load it but succeeds the second time.  Is this a problem with the card,
the driver, my karma?

Wrong phase of the moon, apparently :)
The MII setup of these chips is a bit tricky and I'm not sure whether
I've hit all code paths during development of the driver. I certainly
didn't test with a 501-6738, these have been reported as working before,
though. It also doesn't make much sense that attaching the devices
succeeds on the second attempt. Could you please use a if_cas.ko built
with the attached patch and report the debug output for one of the
interfaces in both the working and the non-working case?

I would love to give you output from the working and non-working case
but apparently the phase of the moon has changed, I can't get it to fail
now.  The messages output from the working case is attached.


Thanks but unfortunately this doesn't make any sense either. In general,
printf()s cause delays which can be relevant. In the locations I've put
them they hardly can make such a difference though.
If you haven't already done so, could you please power off the machine
before doing the test with the patched module? Is the problem still gone
if you revert to the original module?


OK, power-cycling makes a difference.  The driver fails to attach all of 
the devices after power-cycling most of the time if not all of the 
time.  The number of devices attached varies, the attached message file 
fragment is from my last test.  Three of the devices were attached on 
the first load attempt and all four of them on the second attempt.


In the interest of full disclosure, I did build a new kernel but it is 
just a copy of GENERIC.  This is a




Marius




--
Paul Keusemann    pkeu...@visi.com
4266 Joppa Court  (952) 894-7805
Savage, MN  55378

Jan 24 20:32:32 lucid kernel: Copyright (c) 1992-2012 The FreeBSD Project.
Jan 24 20:32:32 lucid kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jan 24 20:32:32 lucid kernel: The Regents of the University of California. All rights reserved.
Jan 24 20:32:32 lucid kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Jan 24 20:32:32 lucid kernel: FreeBSD 8.3-RELEASE #0: Thu Jan 24 11:15:13 CST 2013
Jan 24 20:32:32 lucid kernel: toor@lucid:/usr/obj/usr/src/sys/LUCID amd64
Jan 24 20:32:32 lucid kernel: Timecounter "i8254" frequency 1193182 Hz quality 0
Jan 24 20:32:32 lucid kernel: CPU: Intel(R) Xeon(R) CPU   X3210  @ 2.13GHz (2133.42-MHz K8-class CPU)
Jan 24 20:32:32 lucid kernel: Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
Jan 24 20:32:32 lucid kernel: Features=0xbfebfbff
Jan 24 20:32:32 lucid kernel: Features2=0xe3bd
Jan 24 20:32:32 lucid kernel: AMD Features=0x20100800
Jan 24 20:32:32 lucid kernel: AMD Features2=0x1
Jan 24 20:32:32 lucid kernel: TSC: P-state invariant
Jan 24 20:32:32 lucid kernel: real memory  = 4294967296 (4096 MB)
Jan 24 20:32:32 lucid kernel: avail memory = 4099231744 (3909 MB)
Jan 24 20:32:32 lucid kernel: ACPI APIC Table: 
Jan 24 20:32:32 lucid kernel: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
Jan 24 20:32:32 lucid kernel: FreeBSD/SMP: 1 package(s) x 4 core(s)
Jan 24 20:32:32 lucid kernel: cpu0 (BSP): APIC ID:  0
Jan 24 20:32:32 lucid kernel: cpu1 (AP): APIC ID:  1
Jan 24 20:32:32 lucid kernel: cpu2 (AP): APIC ID:  2
Jan 24 20:32:32 lucid kernel: cpu3 (AP): APIC ID:  3
Jan 24 20:32:32 lucid kernel: ioapic0: Changing APIC ID to 4
Jan 24 20:32:32 lucid kernel: ioapic1: Changing APIC ID to 5
Jan 24 20:32:32 lucid kernel: ioapic0  irqs 0-23 on motherboard
Jan 24 20:32:32 lucid kernel: ioapic1  irqs 32-55 on motherboard
Jan 24 20:32:32 lucid kernel: kbd1 at kbdmux0
Jan 24 20:32:32 lucid kernel: acpi0:  on motherboard
Jan 24 20:32:32 lucid kernel: acpi0: [ITHREAD]
Jan 24 20:32:32 lucid kernel: acpi0: Power Button (fixed)
Jan 24 20:32:32 lucid kernel: Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
Jan 24 20:32:32 lucid kernel: acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
Jan 24 20:32:32 lucid kernel: cpu0:  on acpi0
Jan 24 20:32:32 lucid kernel: cpu1:  on acpi0
Jan 24 20:32:32 lucid kernel: cpu2:  on acpi0
Jan 24 20:32:32 lucid kernel: cpu3:  on acpi0
Jan 24 20:32:32 lucid kernel: pcib0:  port 0xcf8-0xcff on acpi0
Jan 24 20:32:32 lucid kernel: pci0:  on pcib0
Jan 24 20:32:32 lucid kernel: pcib1:  irq 16 at device 1.0 on pci0
Jan 24 20:32:32 lucid kernel: pci1:  on pcib1
Jan 24 20:32:32 lucid kernel: pcib2:  irq 16 at device 28.0 on pci0
Jan 24 20:32:32 lucid kernel: pci2:  on pcib2
Jan 24 20:

Re: how to completely makes an interface down?

2013-01-24 Thread Warren Block

On Thu, 24 Jan 2013, h bagade wrote:


I'm searching for a method or configuration so that when I bring the
interface down, the LED goes off. Currently the LED still remains on when I
shut down the interface! Is there any way to do this?


em(4) mentions controlling the card LEDs.  I have not tried it, though.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: Továbbítás: [Ipsec-tools-users] freebsd & linux setup question

2013-01-24 Thread Richard Kojedzinszky

Dear Yvan,

I've found a strange line in racoon's output:

Either family (2 - 2), types (4 - 1) of ID from initiator differ or 
matching sainfo has no id_i defined for the peer. Not filling iph2->sa_src 
and iph2->sa_dst.


This is missing in linux's instance. Could this be a clue for my problem?

Thanks in advance,

Kojedzinszky Richard

On Tue, 22 Jan 2013, Richard Kojedzinszky wrote:


Dear Yvan,

I've recompiled racoon with NATT, but as you've said, only pure Internet is 
between A and B without NAT, and thus it did not solve my problem.


I've attached racoon's output from
# racoon -ddd -F
on the freebsd's side.

I can confirm, that setkey -D and -DP's output were full, so only the two 
entries existed for the SAs and policies.


I've tried a simple road-warrior setup, with transport mode, thus only 
traffic between A and B was protected, but that worked.

My server's racoon.conf is simple:
--
path certificate "/usr/local/etc/racoon/certs";

remote anonymous {
exchange_mode main,aggressive;
#   nat_traversal off;

certificate_type x509 "A.crt" "A.key";
ca_type x509 "ca.crt";
my_identifier asn1dn;
peers_identifier asn1dn;
proposal_check strict ;

lifetime time 24 hour;

proposal {
encryption_algorithm aes256;
hash_algorithm sha1;
authentication_method rsasig;
dh_group 2;
}

generate_policy on ;
passive on ;

dpd_delay 60;
}

sainfo anonymous {
lifetime time 4 hour;

encryption_algorithm aes128 ;
authentication_algorithm hmac_md5 ;
compression_algorithm deflate;
}

log debug ;
--

And the client's is the same except the generate_policy and passive 
statements.


Thanks in advance,

Kojedzinszky Richard

On Tue, 22 Jan 2013, VANHULLEBUS Yvan wrote:


Hi.


On Mon, Jan 21, 2013 at 05:53:49PM +0100, kri...@cflinux.hu wrote:

Dear users,

I've a working tunnel setup between two linux hosts.

One end (A) has a fixed address, while the other (B) has a dynamic one.
A is my server, B is my home router. Behind B, I've a private network.
What I've setup is that my private network reaches A through an IPSEC
tunnel.

[]

Now, I've decided to switch to freebsd on server side, and the same
configuration on the server simply does not work. It installs the
policies, and the tunnels, but it seems, that when a reply packet is
leaving the server, it tries to initiate a new tunnel. If I've "passive
on" on my server's remote section, then I've the following error:

Jan 21 16:06:11 pi racoon: ERROR: no configuration found for B.
Jan 21 16:06:11 pi racoon: ERROR: failed to begin ipsec sa negotication.

If I disable passive mode, then racoon tries to establish another tunnel,
but for some reason it does not succeed either. But I think it should
work with passive on, as it does on linux.

FreeBSD is 9.1-RELEASE, the linux side is a linux 3.5.4.

racoon on linux is:
# racoon -V
@(#)ipsec-tools 0.8.0 (http://ipsec-tools.sourceforge.net)

Compiled with:
- OpenSSL 1.0.0e 6 Sep 2011 (http://www.openssl.org/)
- Dead Peer Detection
- IKE fragmentation
- NAT Traversal
- Monotonic clock


racoon on freebsd is:
# racoon -V
@(#)ipsec-tools 0.8.0 (http://ipsec-tools.sourceforge.net)

Compiled with:
- OpenSSL 0.9.8x 10 May 2012 (http://www.openssl.org/)
- Dead Peer Detection
- IKE fragmentation
- Hybrid authentication
- Monotonic clock


You have NAT-T compiled/enabled on Linux side, but not on FreeBSD side
(probably because it is not activated as a kernel option).
If you have "something that does NAT" on the wire between A and B, it
is probably the origin of your problem.

However, as it seems that there is only "Internet" between A and B,
I'll suppose that the issue is somewhere else...



Unfortunately I've no idea.

Before the first packet, on the server:
# setkey -D
No SAD entries.

After an icmp packet sent from my private network to A:
# setkey -D
A B
esp mode=tunnel spi=76859998(0x0494ca5e) reqid=0(0x)
E: rijndael-cbc  1c80b80d b006e3a3 772c2a9b 5c475213
A: hmac-md5  d43ff29c 034c896a fb2e7d1c 95f73ff5
seq=0x replay=4 flags=0x state=mature
created: Jan 21 17:03:39 2013   current: Jan 21 17:05:54 2013
diff: 135(s)hard: 14400(s)  soft: 11520(s)
last:   hard: 0(s)  soft: 0(s)
current: 0(bytes)   hard: 0(bytes)  soft: 0(bytes)
allocated: 0hard: 0 soft: 0
sadb_seq=1 pid=93091 refcnt=1
B A
esp mode=tunnel spi=14479(0x08a151f0) reqid=0(0x)
E: rijndael-cbc  8bd59c29 9800d10f 8f9d7e84 a720aa9c
A: hmac-md5  188070e2 a3220772 78efcb06 3457db62
seq=0x0037 replay=4 flags=0x state=mature
created: Jan 21 17:03:39 2013   current: Jan 21 17:05:54 2013
diff: 135(s)hard: 14400(s)  soft: 11520(s)
last: Jan 21 17:04:50 2013  hard: 0(s) 

Re: Some questions about the new TCP congestion control code

2013-01-24 Thread Lawrence Stewart
On 01/25/13 01:12, Andre Oppermann wrote:
> On 24.01.2013 14:28, Lawrence Stewart wrote:
>> On 01/16/13 06:27, John Baldwin wrote:
>>> One other thing I noticed which may or may not be odd during this,
>>> is that
>>> if you have a connection with TCP_NODELAY enabled and you fill your
>>> cwnd and
>>> then you get an ACK back for an earlier small segment (less than
>>> MSS), TCP
>>> will not send out a "short" segment for the amount of window space
>>> released.
>>> Instead, it will wait until a full MSS of space is available before
>>> sending
>>> a packet.  I'm not sure if that is the correct behavior with
>>> TCP_NODELAY or
>>> if we should send "short" segments in that case.
>>
>> We try fairly hard not to send runt segments irrespective of NODELAY,
>> but I would be happy to see that change. I'm not aware of any "correct
>> behaviour" we have to adhere to - I think it would be perfectly
>> reasonable to have a sysctl set the lowest number of bytes we'd be
>> willing to send a runt segment for and then key off TCP_NODELAY as to
>> whether we try hard to send an MSS worth or send as soon as we have the
>> min number of bytes worth of window available.
> 
> This is classic silly window syndrome prevention applied to the CWND.

Yes, but I think we could provide knobs to relax the behaviour where the
latency vs header/payload overhead tradeoff swings in favour of latency.
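
As a rough illustration of the knob described above, the decision could be modelled as in the sketch below. This is not FreeBSD code: `segment_len_for_window()` and the `min_runt` parameter (standing in for the proposed sysctl) are hypothetical names, and the model assumes the sender always has at least one MSS of data queued, so only the free window limits the segment size.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical policy helper, not taken from the FreeBSD TCP stack.
 * window_free: bytes of congestion window currently available.
 * min_runt:    stand-in for the proposed sysctl; 0 disables runt segments.
 * Returns the number of bytes to send now, or 0 to keep waiting.
 */
static size_t
segment_len_for_window(size_t window_free, size_t mss, bool nodelay,
    size_t min_runt)
{
	if (window_free >= mss)
		return (mss);		/* a full segment always goes out */
	if (nodelay && min_runt > 0 && window_free >= min_runt)
		return (window_free);	/* latency wins: send a short segment */
	return (0);			/* wait for more window to open */
}
```

With `min_runt` set to 256 and a 1460-byte MSS, a TCP_NODELAY connection with 500 bytes of free window would send 500 bytes right away in this model, while a connection without TCP_NODELAY would keep waiting for a full MSS.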

I guess, John, I should first ask if you know why you were only getting
such small ACKs back? Were you sending full MSS segments in the first
place or doing some sort of PUSH to try and expedite getting some
smaller chunk of data to the other end which triggered a small segment
and corresponding small ACK?

> Sending a small segment when the window opens just a bit isn't going to help
> much and

I wouldn't be game to make such a blanket statement - that very much
depends on the situation. I think John's use case is relevant and we
currently aren't very helpful towards it.

> mostly clogs the network.

How so? We're not in the 80's any more. If I pay for X MBps of service,
I expect to be able to use it in any way I choose. Packet size is
irrelevant, but there are obvious efficiencies to be gained by
maximising the amount of payload in each segment.
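
To put a number on the efficiency point, the sketch below computes the payload share of each packet in parts per thousand. The helper name `payload_permille()` is made up for illustration, and the 40-byte header figure used in the note afterwards assumes IPv4 plus TCP without options and ignores link-layer framing.

```c
#include <stdint.h>

/*
 * Payload share of the bytes in a packet, in parts per thousand.
 * hdr_bytes is supplied by the caller (for example 40 for IPv4 + TCP
 * without options); link-layer framing is not modelled.
 */
static uint32_t
payload_permille(uint32_t payload_bytes, uint32_t hdr_bytes)
{
	uint64_t total = (uint64_t)payload_bytes + hdr_bytes;

	if (total == 0)
		return (0);
	return ((uint32_t)((uint64_t)payload_bytes * 1000 / total));
}
```

Under those assumptions a full 1460-byte segment is roughly 97% payload, while a 100-byte runt segment is roughly 71% payload.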

> This is actually a side effect of ABC (appropriate byte counting) where not
> the ACK's are counted but the bytes ACK'ed.  Disabling ABC will solve this
> problem.

I don't follow. How is what John described above related to ABC?

Cheers,
Lawrence


Re: Block ACK in Ralink RT2860

2013-01-24 Thread PseudoCylon
> Message: 6
> Date: Thu, 24 Jan 2013 12:23:55 -0500
> From: Ramanujan Seshadri 
> To: freebsd-net@freebsd.org
> Subject: Block ACK in Ralink RT2860
> Message-ID:
> 
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi all,
> I am trying  to read the contents of block ack's in a Ralink RT2860 driver.
> Can you please help me to know which function i should be looking into ?

By default, all BA packets are dropped by the hardware. Clear the RT2860_DROP_BA flag at
http://fxr.watson.org/fxr/source/dev/ral/rt2860.c#L3559

Then, the driver should receive BA packets, and you can read them.
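
Clearing one drop flag from a receive-filter value is a plain read-modify-write on a bit mask. The sketch below only shows that operation: the `DEMO_*` names and their bit positions are invented for illustration, and the real flag definitions and register write live in the driver source linked above.

```c
#include <stdint.h>

/*
 * Illustrative bit positions only; they are NOT the values used by the
 * rt2860 driver, whose register header defines the real flags.
 */
#define DEMO_DROP_CRC_ERR	(UINT32_C(1) << 0)
#define DEMO_DROP_BA		(UINT32_C(1) << 1)
#define DEMO_DROP_ACK		(UINT32_C(1) << 2)

/* Return the filter value with the "drop block ACK" bit cleared. */
static uint32_t
demo_filter_allow_ba(uint32_t filter)
{
	return (filter & ~DEMO_DROP_BA);
}
```

All other drop bits are left untouched, so only block ACK frames start reaching the driver.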


AK


Re: Cas driver fails to load first time after boot.

2013-01-24 Thread Marius Strobl
On Thu, Jan 24, 2013 at 12:39:44PM -0600, Paul Keusemann wrote:
> 
> On 01/24/13 09:09, Marius Strobl wrote:
> > On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
> >> Hi,
> >>
> >> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
> >> QGE (501-6738-10).  The cas driver fails to load the first time I try to
> >> load it but succeeds the second time.  Is this a problem with the card,
> >> the driver, my karma?
> > Wrong phase of the moon, apparently :)
> > The MII setup of these chips is a bit tricky and I'm not sure whether
> > I've hit all code paths during development of the driver. I certainly
> > didn't test with a 501-6738, these have been reported as working before,
> > though. It also doesn't make much sense that attaching the devices
> > succeeds on the second attempt. Could you please use a if_cas.ko built
> > with the attached patch and report the debug output for one of the
> > interfaces in both the working and the non-working case?
> 
> I would love to give you output from the working and non-working case 
> but apparently the phase of the moon has changed, I can't get it to fail 
> now.  The messages output from the working case is attached.
> 

Thanks but unfortunately this doesn't make any sense either. In general,
printf()s cause delays which can be relevant. In the locations I've put
them they hardly can make such a difference though.
If you haven't already done so, could you please power off the machine
before doing the test with the patched module? Is the problem still gone
if you revert to the original module?

Marius



Re: 9.1-stable crashes while copying data from a NFS mounted directory

2013-01-24 Thread Konstantin Belousov
On Thu, Jan 24, 2013 at 09:50:52PM +0100, Christian Gusenbauer wrote:
> On Thursday 24 January 2013 20:37:09 Konstantin Belousov wrote:
> > On Thu, Jan 24, 2013 at 07:50:49PM +0100, Christian Gusenbauer wrote:
> > > On Thursday 24 January 2013 19:07:23 Konstantin Belousov wrote:
> > > > On Thu, Jan 24, 2013 at 08:03:59PM +0200, Konstantin Belousov wrote:
> > > > > On Thu, Jan 24, 2013 at 06:05:57PM +0100, Christian Gusenbauer wrote:
> > > > > > Hi!
> > > > > > 
> > > > > > I'm using 9.1 stable svn revision 245605 and I get the panic below
> > > > > > if I execute the following commands (as single user):
> > > > > > 
> > > > > > # swapon -a
> > > > > > # dumpon /dev/ada0s3b
> > > > > > # mount -u /
> > > > > > # ifconfig age0 inet 192.168.2.2 mtu 6144 up
> > > > > > # mount -t nfs -o rsize=32768 data:/multimedia /mnt
> > > > > > # cp /mnt/Movies/test/a.m2ts /tmp
> > > > > > 
> > > > > > then the system panics almost immediately. I'll attach the stack
> > > > > > trace.
> > > > > > 
> > > > > > Note, that I'm using jumbo frames (6144 byte) on a 1Gbit network,
> > > > > > maybe that's the cause for the panic, because the bcopy (see stack
> > > > > > frame #15) fails.
> > > > > > 
> > > > > > Any clues?
> > > > > 
> > > > > I tried a similar operation with the nfs mount of rsize=32768 and mtu
> > > > > 6144, but the machine runs HEAD and em instead of age. I was unable
> > > > > to reproduce the panic on the copy of the 5GB file from nfs mount.
> > > 
> > > Hmmm, I did a quick test. If I do not change the MTU, so just configuring
> > > age0 with
> > > 
> > > # ifconfig age0 inet 192.168.2.2 up
> > > 
> > > then I can copy all files from the mounted directory without any
> > > problems, too. So it's probably age0 related?
> > 
> > From your backtrace and the buffer printout, I see a somewhat strange thing.
> > The buffer data address is 0xff8171418000, while the kernel faulted
> > at the attempt to write at 0xff8171413000, which is lower than
> > the buffer data pointer, at the attempt to bcopy to the buffer.
> > 
> > The other data suggests that there was no overflow of the data from the
> > server response. So it might be that mbuf_len(mp) returned a negative
> > number? I am not sure whether that is possible at all.
> > 
> > Try this debugging patch, please. You need to add INVARIANTS etc to the
> > kernel config.
> > 
> > diff --git a/sys/fs/nfs/nfs_commonsubs.c b/sys/fs/nfs/nfs_commonsubs.c
> > index efc0786..9a6bda5 100644
> > --- a/sys/fs/nfs/nfs_commonsubs.c
> > +++ b/sys/fs/nfs/nfs_commonsubs.c
> > @@ -218,6 +218,7 @@ nfsm_mbufuio(struct nfsrv_descript *nd, struct uio *uiop, int siz)
> > }
> > mbufcp = NFSMTOD(mp, caddr_t);
> > len = mbuf_len(mp);
> > +   KASSERT(len > 0, ("len %d", len));
> > }
> > xfer = (left > len) ? len : left;
> >  #ifdef notdef
> > @@ -239,6 +240,8 @@ nfsm_mbufuio(struct nfsrv_descript *nd, struct uio *uiop, int siz)
> > uiop->uio_resid -= xfer;
> > }
> > if (uiop->uio_iov->iov_len <= siz) {
> > +   KASSERT(uiop->uio_iovcnt > 1, ("uio_iovcnt %d",
> > +   uiop->uio_iovcnt));
> > uiop->uio_iovcnt--;
> > uiop->uio_iov++;
> > } else {
> > 
> > I thought that the server had returned too long a response, but from
> > your data that seems not to be the case. Still, I think the patch below
> > might be due.
> > 
> > diff --git a/sys/fs/nfsclient/nfs_clrpcops.c b/sys/fs/nfsclient/nfs_clrpcops.c
> > index be0476a..a89b907 100644
> > --- a/sys/fs/nfsclient/nfs_clrpcops.c
> > +++ b/sys/fs/nfsclient/nfs_clrpcops.c
> > @@ -1444,7 +1444,7 @@ nfsrpc_readrpc(vnode_t vp, struct uio *uiop, struct ucred *cred,
> > NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
> > eof = fxdr_unsigned(int, *tl);
> > }
> > -   NFSM_STRSIZ(retlen, rsize);
> > +   NFSM_STRSIZ(retlen, len);
> > error = nfsm_mbufuio(nd, uiop, retlen);
> > if (error)
> > goto nfsmout;
> 
> I applied your patches and now I get a
> 
> panic: len -4
> cpuid = 1
> KDB: enter: panic
> Dumping 377 out of 6116 MB:..5%..13%..22%..34%..43%..51%..64%..73%..81%..94%
> 
This means that the age driver either produced a corrupted mbuf chain
or stored a wrong, negative value in the mbuf len field. I am quite
certain that the issue is in the driver.
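
The kind of check the debugging patch performs can be sketched outside the kernel as follows. This is a simplified model and not the NFS client code: `struct demo_seg` is a hypothetical stand-in for an mbuf chain, and `demo_chain_len()` is an illustrative helper.

```c
#include <stddef.h>

/* Minimal stand-in for an mbuf chain; the real struct mbuf has many more fields. */
struct demo_seg {
	int			 len;	/* bytes of data in this segment */
	const struct demo_seg	*next;	/* next segment, or NULL at the end */
};

/*
 * Sum the segment lengths, returning -1 as soon as a negative length is
 * seen (such as the -4 reported by the panic message).
 */
static long
demo_chain_len(const struct demo_seg *seg)
{
	long total = 0;

	for (; seg != NULL; seg = seg->next) {
		if (seg->len < 0)
			return (-1);
		total += seg->len;
	}
	return (total);
}
```

A consumer that trusts the length field without such a check would pass the negative value on as a copy size, which fits the faulting bcopy in the backtrace.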

I added net@ to Cc:; hopefully you can get help there.
> 
> #0  doadump (textdump=0)
> at /spare/tmp/src-stable9/sys/kern/kern_shutdown.c:265
> 265 if (textdump && textdump_pending) {
> (kgdb) #0  doadump (textdump=0)
> at /spare/tmp/src-stable9/sys/kern/kern_shutdown.c:265
> #1  0x802a7490 in db_dump (dummy=,
> dummy2=, dummy3=,
> dummy4=)
> at /spare/tmp/src-stable9/sys/ddb/db_command.c:538
> #2  0x802a6a7e in db_command (last_cmdp=0x808ca140

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-24 Thread Alfred Perlstein

On 1/24/13 11:14 AM, John Baldwin wrote:

On Thursday, January 24, 2013 3:03:31 am Andre Oppermann wrote:

On 24.01.2013 03:31, Sepherosa Ziehau wrote:

On Thu, Jan 24, 2013 at 12:15 AM, John Baldwin  wrote:

On Wednesday, January 23, 2013 1:33:27 am Sepherosa Ziehau wrote:

On Wed, Jan 23, 2013 at 4:11 AM, John Baldwin  wrote:

As I mentioned in an earlier thread, I recently had to debug an issue we were
seeing across a link with a high bandwidth-delay product (both high bandwidth
and high RTT).  Our specific use case was to use a TCP connection to reliably
forward a latency-sensitive datagram stream across a WAN connection.  We would
often see spikes in the latency of individual datagrams.  I eventually tracked
this down to the connection entering slow start when it would transmit data
after being idle.  The data stream was quite bursty and would often attempt to
transmit a burst of data after being idle for far longer than a retransmit
timeout.

In 7.x we had worked around this in the past by disabling RFC 3390 and jacking
the slow start window size up via a sysctl.  On 8.x this no longer worked.
The solution I came up with was to add a new socket option to disable idle
handling completely.  That is, when an idle connection restarts with this new
option enabled, it keeps its current congestion window and doesn't enter slow
start.

There are only a few cases where such an option is useful, but if anyone else
thinks this might be useful I'd be happy to add the option to FreeBSD.
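
The effect of the proposed option can be modelled with a small sketch. It follows the restart rule from RFC 5681 section 4.1 (after an idle period longer than the RTO, cwnd is limited to min(IW, cwnd)) rather than FreeBSD's actual idle detection, and every name in it, including the `ignore_idle` field that models TCP_IGNOREIDLE, is illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified per-connection state; field names are illustrative. */
struct demo_conn {
	uint32_t cwnd;		/* current congestion window, bytes */
	uint32_t initial_cwnd;	/* initial window (IW), bytes */
	uint32_t idle_ms;	/* time since data was last sent */
	uint32_t rto_ms;	/* current retransmission timeout */
	bool	 ignore_idle;	/* models the proposed TCP_IGNOREIDLE */
};

/* Congestion window to use when the connection starts sending again. */
static uint32_t
demo_cwnd_after_idle(const struct demo_conn *c)
{
	if (c->ignore_idle || c->idle_ms <= c->rto_ms)
		return (c->cwnd);	/* keep the window as it is */
	/* Idle for longer than the RTO: fall back to min(IW, cwnd). */
	return (c->cwnd < c->initial_cwnd ? c->cwnd : c->initial_cwnd);
}
```

In this model a bursty sender that has been idle for several seconds restarts from the initial window unless the option is set, in which case it keeps the window it had built up.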

I think what you need is the RFC2861, however, you probably should
ignore the "application-limited period" part of RFC2861.

Hummm.  It appears btw, that Linux uses RFC 2861, but has a global knob to
disable it due to applications having problems.  When it is disabled,
it doesn't decay the congestion window at all during idle handling.  That is,
it appears to act the same as if TCP_IGNOREIDLE were enabled.

  From http://www.kernel.org/doc/man-pages/online/pages/man7/tcp.7.html:

 tcp_slow_start_after_idle (Boolean; default: enabled; since Linux 2.6.18)
        If enabled, provide RFC 2861 behavior and time out the congestion
        window after an idle period.  An idle period is defined as the
        current RTO (retransmission timeout).  If disabled, the congestion
        window will not be timed out after an idle period.
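
The "time out the congestion window" behaviour can be pictured as a decay: the window is halved once for every full RTO the connection has been idle, down to some floor. The sketch below is a simplified model in that spirit and does not reproduce the Linux implementation or the exact RFC 2861 pseudo-code; `demo_decay_cwnd()` and its `floor_bytes` parameter are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Halve cwnd once per full RTO of idle time, but never below floor_bytes.
 * A window that is already at or below the floor is left unchanged.
 */
static uint32_t
demo_decay_cwnd(uint32_t cwnd, uint32_t floor_bytes, uint32_t idle_ms,
    uint32_t rto_ms)
{
	uint32_t periods;

	if (rto_ms == 0 || cwnd <= floor_bytes)
		return (cwnd);
	periods = idle_ms / rto_ms;
	while (periods > 0 && cwnd > floor_bytes) {
		cwnd /= 2;
		periods--;
	}
	return (cwnd < floor_bytes ? floor_bytes : cwnd);
}
```

Because of the halving, a window of 64000 bytes with a 200 ms RTO reaches a 4000-byte floor after four idle RTOs, so any idle period of 800 ms or more gives the same result in this model.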

Also, in this thread on tcp-m it appears no one on that list realizes that
there are any implementations which follow the "SHOULD" in RFC 2581 for idle
handling (which is what we do currently):

Nah, I don't think the idle detection in FreeBSD follows the
RFC2581/RFC5681 4.1 (the paragraph before the "SHOULD").  IMHO, that's
probably why the author in the following email requestioned about the
implementation of "SHOULD" in RFC2581/RFC5681.


http://www.ietf.org/mail-archive/web/tcpm/current/msg02864.html

So if we were to implement RFC 2861, the new socket option would be equivalent
to setting Linux's 'tcp_slow_start_after_idle' to false, but on a per-socket
basis rather than globally.

Agree, per-socket option could be more useful than global sysctls under
certain situation.  However, in addition to the per-socket option,
could global sysctl nodes to disable idle_restart/idle_cwv help too?

No.  This is far too dangerous once it makes it into some tuning guide.
The threat of congestion breakdown is real.  The Internet, or any packet
network, can only survive in the long term if almost all follow the rules
and self-constrain to remain fair to the others.  What would happen if
nobody would respect the traffic lights anymore?

The problem with this argument is Linux has already had this as a tunable
option for years and the Internet hasn't melted as a result.
  

Besides that bursting into unknown network conditions is very likely to
result in burst losses as well.  TCP isn't good at recovering from it.
In the end you most likely come out ahead if you decay the restartCWND.

We have two cases primarily: a) long distance, medium to high RTT, and
wildly varying bandwidth (a.k.a. the Internet); b) short distance, low
RTT and mostly plenty of bandwidth (a.k.a. Datacenter).  The former
absolutely definitely requires a decayed restartCWND.  The latter less
so but even there bursting at 10Gig TSO assisted wirespeed isn't going
to end too happy more often than not.

You forgot my case: c) dedicated long distance links with high bandwidth.


Since this seems to be a burning issue I'll come up with a patch in the
next days to add a decaying restartCWND that'll be fair and allow a very
quick ramp up if no loss occurs.

I think this could be useful.  OTOH, I still think the TCP_IGNOREIDLE option
is useful both with and without a decaying restartCWND?

Linux seems to be doing just fine with it for what seems to be a long 
while.  Can we get this committed?


-Alfred

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-24 Thread John Baldwin
On Thursday, January 24, 2013 3:03:31 am Andre Oppermann wrote:
> On 24.01.2013 03:31, Sepherosa Ziehau wrote:
> > On Thu, Jan 24, 2013 at 12:15 AM, John Baldwin  wrote:
> >> On Wednesday, January 23, 2013 1:33:27 am Sepherosa Ziehau wrote:
> >>> On Wed, Jan 23, 2013 at 4:11 AM, John Baldwin  wrote:
> >>>> As I mentioned in an earlier thread, I recently had to debug an issue we
> >>>> were seeing across a link with a high bandwidth-delay product (both high
> >>>> bandwidth and high RTT).  Our specific use case was to use a TCP
> >>>> connection to reliably forward a latency-sensitive datagram stream across
> >>>> a WAN connection.  We would often see spikes in the latency of individual
> >>>> datagrams.  I eventually tracked this down to the connection entering slow
> >>>> start when it would transmit data after being idle.  The data stream was
> >>>> quite bursty and would often attempt to transmit a burst of data after
> >>>> being idle for far longer than a retransmit timeout.
> >>>>
> >>>> In 7.x we had worked around this in the past by disabling RFC 3390 and
> >>>> jacking the slow start window size up via a sysctl.  On 8.x this no longer
> >>>> worked.  The solution I came up with was to add a new socket option to
> >>>> disable idle handling completely.  That is, when an idle connection
> >>>> restarts with this new option enabled, it keeps its current congestion
> >>>> window and doesn't enter slow start.
> >>>>
> >>>> There are only a few cases where such an option is useful, but if anyone
> >>>> else thinks this might be useful I'd be happy to add the option to FreeBSD.
> >>>
> >>> I think what you need is the RFC2861, however, you probably should
> >>> ignore the "application-limited period" part of RFC2861.
> >>
> >> Hummm.  It appears btw, that Linux uses RFC 2861, but has a global knob to
> >> disable it due to applications having problems.  When it is disabled,
> >> it doesn't decay the congestion window at all during idle handling.  That
> >> is, it appears to act the same as if TCP_IGNOREIDLE were enabled.
> >>
> >>  From http://www.kernel.org/doc/man-pages/online/pages/man7/tcp.7.html:
> >>
> >> tcp_slow_start_after_idle (Boolean; default: enabled; since Linux 2.6.18)
> >>        If enabled, provide RFC 2861 behavior and time out the congestion
> >>        window after an idle period.  An idle period is defined as the
> >>        current RTO (retransmission timeout).  If disabled, the congestion
> >>        window will not be timed out after an idle period.
> >>
> >> Also, in this thread on tcp-m it appears no one on that list realizes that
> >> there are any implementations which follow the "SHOULD" in RFC 2581 for
> >> idle handling (which is what we do currently):
> >
> > Nah, I don't think the idle detection in FreeBSD follows the
> > RFC2581/RFC5681 4.1 (the paragraph before the "SHOULD").  IMHO, that's
> > probably why the author in the following email requestioned about the
> > implementation of "SHOULD" in RFC2581/RFC5681.
> >
> >>
> >> http://www.ietf.org/mail-archive/web/tcpm/current/msg02864.html
> >>
> >> So if we were to implement RFC 2861, the new socket option would be
> >> equivalent to setting Linux's 'tcp_slow_start_after_idle' to false, but on
> >> a per-socket basis rather than globally.
> >
> > Agree, per-socket option could be more useful than global sysctls under
> > certain situation.  However, in addition to the per-socket option,
> > could global sysctl nodes to disable idle_restart/idle_cwv help too?
> 
> No.  This is far too dangerous once it makes it into some tuning guide.
> The threat of congestion breakdown is real.  The Internet, or any packet
> network, can only survive in the long term if almost all follow the rules
> and self-constrain to remain fair to the others.  What would happen if
> nobody would respect the traffic lights anymore?

The problem with this argument is Linux has already had this as a tunable
option for years and the Internet hasn't melted as a result.
 
> Besides that bursting into unknown network conditions is very likely to
> result in burst losses as well.  TCP isn't good at recovering from it.
> In the end you most likely come out ahead if you decay the restartCWND.
> 
> We have two cases primarily: a) long distance, medium to high RTT, and
> wildly varying bandwidth (a.k.a. the Internet); b) short distance, low
> RTT and mostly plenty of bandwidth (a.k.a. Datacenter).  The former
> absolutely definitely requires a decayed restartCWND.  The latter less
> so but even there bursting at 10Gig TSO assisted wirespeed isn't going
> to end too happy more often than not.

You forgot my case: c) dedicated long distance links with high bandwidth.

> Since this seems to be a burning issue I'll come up with a patch in the
> next days to add a decaying r

Re: how to completely makes an interface down?

2013-01-24 Thread John-Mark Gurney
h bagade wrote this message on Thu, Jan 24, 2013 at 16:59 +0330:
> I'm searching for a method or configuration so that when I bring the
> interface down, the LED goes off. Currently the LED still remains on when I
> shut down the interface! Is there any way to do this?

Not all ethernet drivers disable the PHY when you down the interface...
You can try to use:
ifconfig  media none

to shutdown the PHY, but the em driver on 9.1 doesn't have it, but re
(7.2-R and -current) and msk (-current) seems to have it...

Also, why do you want the led to go off?  Remember, the led is just an
indication if there is a link established, not what will happen to the
packets that are received...

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 "All that I will do, has been done, All that I have, has not."


Re: Cas driver fails to load first time after boot.

2013-01-24 Thread Paul Keusemann


On 01/24/13 09:09, Marius Strobl wrote:
> On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
>> Hi,
>>
>> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
>> QGE (501-6738-10).  The cas driver fails to load the first time I try to
>> load it but succeeds the second time.  Is this a problem with the card,
>> the driver, my karma?
>
> Wrong phase of the moon, apparently :)
> The MII setup of these chips is a bit tricky and I'm not sure whether
> I've hit all code paths during development of the driver. I certainly
> didn't test with a 501-6738, these have been reported as working before,
> though. It also doesn't make much sense that attaching the devices
> succeeds on the second attempt. Could you please use a if_cas.ko built
> with the attached patch and report the debug output for one of the
> interfaces in both the working and the non-working case?

I would love to give you output from the working and non-working case
but apparently the phase of the moon has changed, I can't get it to fail
now.  The messages output from the working case is attached.

Let me know if there's anything else I can do.

> Marius



--
Paul Keusemann    pkeu...@visi.com
4266 Joppa Court  (952) 894-7805
Savage, MN  55378

Jan 24 11:00:01 lucid newsyslog[2087]: logfile turned over due to size>100K
Jan 24 11:47:39 lucid shutdown: reboot by toor: 
Jan 24 11:47:41 lucid syslogd: exiting on signal 15
Jan 24 11:48:51 lucid syslogd: kernel boot file is /boot/kernel/kernel
Jan 24 11:48:51 lucid kernel: Copyright (c) 1992-2012 The FreeBSD Project.
Jan 24 11:48:51 lucid kernel: Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jan 24 11:48:51 lucid kernel: The Regents of the University of California. All rights reserved.
Jan 24 11:48:51 lucid kernel: FreeBSD is a registered trademark of The FreeBSD Foundation.
Jan 24 11:48:51 lucid kernel: FreeBSD 8.3-RELEASE #0: Thu Jan 24 11:15:13 CST 2013
Jan 24 11:48:51 lucid kernel: toor@lucid:/usr/obj/usr/src/sys/LUCID amd64
Jan 24 11:48:51 lucid kernel: Timecounter "i8254" frequency 1193182 Hz quality 0
Jan 24 11:48:51 lucid kernel: CPU: Intel(R) Xeon(R) CPU   X3210  @ 2.13GHz (2133.42-MHz K8-class CPU)
Jan 24 11:48:51 lucid kernel: Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
Jan 24 11:48:51 lucid kernel: Features=0xbfebfbff
Jan 24 11:48:51 lucid kernel: Features2=0xe3bd
Jan 24 11:48:51 lucid kernel: AMD Features=0x20100800
Jan 24 11:48:51 lucid kernel: AMD Features2=0x1
Jan 24 11:48:51 lucid kernel: TSC: P-state invariant
Jan 24 11:48:51 lucid kernel: real memory  = 4294967296 (4096 MB)
Jan 24 11:48:51 lucid kernel: avail memory = 4099231744 (3909 MB)
Jan 24 11:48:51 lucid kernel: ACPI APIC Table: 
Jan 24 11:48:51 lucid kernel: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
Jan 24 11:48:51 lucid kernel: FreeBSD/SMP: 1 package(s) x 4 core(s)
Jan 24 11:48:51 lucid kernel: cpu0 (BSP): APIC ID:  0
Jan 24 11:48:51 lucid kernel: cpu1 (AP): APIC ID:  1
Jan 24 11:48:51 lucid kernel: cpu2 (AP): APIC ID:  2
Jan 24 11:48:51 lucid kernel: cpu3 (AP): APIC ID:  3
Jan 24 11:48:51 lucid kernel: ioapic0: Changing APIC ID to 4
Jan 24 11:48:51 lucid kernel: ioapic1: Changing APIC ID to 5
Jan 24 11:48:51 lucid kernel: ioapic0  irqs 0-23 on motherboard
Jan 24 11:48:51 lucid kernel: ioapic1  irqs 32-55 on motherboard
Jan 24 11:48:51 lucid kernel: kbd1 at kbdmux0
Jan 24 11:48:51 lucid kernel: acpi0:  on motherboard
Jan 24 11:48:51 lucid kernel: acpi0: [ITHREAD]
Jan 24 11:48:51 lucid kernel: acpi0: Power Button (fixed)
Jan 24 11:48:51 lucid kernel: Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
Jan 24 11:48:51 lucid kernel: acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
Jan 24 11:48:51 lucid kernel: cpu0:  on acpi0
Jan 24 11:48:51 lucid kernel: cpu1:  on acpi0
Jan 24 11:48:51 lucid kernel: cpu2:  on acpi0
Jan 24 11:48:51 lucid kernel: cpu3:  on acpi0
Jan 24 11:48:51 lucid kernel: pcib0:  port 0xcf8-0xcff on acpi0
Jan 24 11:48:51 lucid kernel: pci0:  on pcib0
Jan 24 11:48:51 lucid kernel: pcib1:  irq 16 at device 1.0 on pci0
Jan 24 11:48:51 lucid kernel: pci1:  on pcib1
Jan 24 11:48:51 lucid kernel: pcib2:  irq 16 at device 28.0 on pci0
Jan 24 11:48:51 lucid kernel: pci2:  on pcib2
Jan 24 11:48:51 lucid kernel: pcib3:  at device 0.0 on pci2
Jan 24 11:48:51 lucid kernel: pci3:  on pcib3
Jan 24 11:48:51 lucid kernel: pcib4:  at device 2.0 on pci3
Jan 24 11:48:51 lucid kernel: pci4:  on pcib4
Jan 24 11:48:51 lucid kernel: pci4:  at device 0.0 (no driver attached)
Jan 24 11:48:51 lucid kernel: pci4:  at device 1.0 (no driver attached)
Jan 24 11:48:51 lucid kernel: pci4:  at device 2.0 (no driver attached)
Jan 24 11:48:51 lucid kernel: pci4:  at device 3.0 (no driver attached)
Jan 24 11:48:51 lucid kernel: pcib5:  irq 16 at device 28.4 on pci0
Jan 24 11:48:51 lucid kernel: pci5:  on pcib5
Jan 24 11:48:51 lucid kernel: bge0:  mem 0xd

Re: how to completely makes an interface down?

2013-01-24 Thread Kevin Oberman
On Thu, Jan 24, 2013 at 5:29 AM, h bagade  wrote:
> Hi all,
>
> I'm looking for a method or configuration that turns the LED off when I
> bring the interface down. Currently the LED stays on when I shut down
> the interface! Is there any way to do this?

Depends on the interface, but on many devices the only way to turn off
the LED is to unplug the cable or turn off the device at the other
end. The LED is lit by the power on the receive pair, so it will
remain on even if the system is turned off and the power cord pulled,
as the remote end is really lighting the LED.
-- 
R. Kevin Oberman, Network Engineer
E-mail: kob6...@gmail.com


Carp configuration errors

2013-01-24 Thread Ask Bjørn Hansen
Hello,

After upgrading to 9.1 it seems like carp doesn't pay attention to advskew 
anymore.  I have two boxes each setup with carp0 and carp1; the intention is 
that in regular operation proxy1 is master for carp0 and proxy2 for carp1. 
However, whichever box comes up second is BACKUP for both.

To make IPv6 CARP work I am using the patch from 
http://www.freebsd.org/cgi/query-pr.cgi?pr=127050

I don't know if it is related, but when booting I see a lot of messages like

ifa_del_loopback_route: deletion failed
ifa_add_loopback_route: insertion failed
ifa_del_loopback_route: deletion failed
ifa_add_loopback_route: insertion failed
ifa_del_loopback_route: deletion failed
ifa_add_loopback_route: insertion failed
ifa_del_loopback_route: deletion failed
ifa_add_loopback_route: insertion failed

in dmesg.

I am including my rc.conf files for each host below. Any hints or suggestions 
will be appreciated.


Ask

# proxy1

sshd_enable="YES"
ntpd_enable="YES"
ntpd_flags="-p /var/run/ntpd.pid -f /etc/ntp/ntpd.drift -g"

hostname="proxy1.dev"

ifconfig_vr0="inet 10.0.100.31/24"
ifconfig_vr2="inet 207.171.7.31/24"
ifconfig_vr2_ipv6="inet6 2607:f238:3::1:1/64"

ifconfig_carp0="vhid 40 advskew 50 pass y4t8gwtgjkq4g 207.171.7.40"
ipv4_addrs_carp0="207.171.7.41-49/24"
ifconfig_carp0_ipv6="inet6 2607:f238:3::1:41/64"
ifconfig_carp0_alias0="inet6 2607:f238:3::1:40/64"
ifconfig_carp0_alias1="inet6 2607:f238:3::1:42/64"
ifconfig_carp0_alias2="inet6 2607:f238:3::1:43/64"
ifconfig_carp0_alias3="inet6 2607:f238:3::1:44/64"
ifconfig_carp0_alias4="inet6 2607:f238:3::1:45/64"
ifconfig_carp0_alias5="inet6 2607:f238:3::1:46/64"
ifconfig_carp0_alias6="inet6 2607:f238:3::1:47/64"
ifconfig_carp0_alias7="inet6 2607:f238:3::1:48/64"
ifconfig_carp0_alias8="inet6 2607:f238:3::1:49/64"

ifconfig_carp1="vhid 50 advskew 250 pass hsjrthvruwybwt 207.171.7.50"
ipv4_addrs_carp1="207.171.7.51-59/24"
ifconfig_carp1_ipv6="inet6 2607:f238:3::1:51/64"
ifconfig_carp1_alias0="inet6 2607:f238:3::1:50/64"
ifconfig_carp1_alias1="inet6 2607:f238:3::1:52/64"
ifconfig_carp1_alias2="inet6 2607:f238:3::1:53/64"
ifconfig_carp1_alias3="inet6 2607:f238:3::1:54/64"
ifconfig_carp1_alias4="inet6 2607:f238:3::1:55/64"
ifconfig_carp1_alias5="inet6 2607:f238:3::1:56/64"
ifconfig_carp1_alias6="inet6 2607:f238:3::1:57/64"
ifconfig_carp1_alias7="inet6 2607:f238:3::1:58/64"
ifconfig_carp1_alias8="inet6 2607:f238:3::1:59/64"

ifconfig_vr1="down"

defaultrouter="207.171.7.1"
ipv6_defaultrouter="2607:F238:3::1"

ifconfig_lo0_alias0="inet 127.0.0.2"
ifconfig_lo0_alias1="inet 127.0.0.3"

cloned_interfaces="carp0 carp1"

static_routes="${static_routes} vpn"
route_vpn="-net 10.0.0.0/16 10.0.100.1"

pf_enable="NO"
pflog_enable="NO"

haproxy_enable="YES"
haproxy_config="/etc/haproxy.conf"



###

# proxy2

sshd_enable="YES"
ntpd_enable="YES"
ntpd_flags="-p /var/run/ntpd.pid -f /etc/ntp/ntpd.drift -g"

hostname="proxy2.dev"

ifconfig_vr0="inet 10.0.100.32/24"
ifconfig_vr2="inet 207.171.7.32/24"
ifconfig_vr2_ipv6="inet6 2607:f238:3::1:2/64"

ifconfig_carp0="vhid 40 advskew 150 pass y4t8gwtgjkq4g 207.171.7.40"
ipv4_addrs_carp0="207.171.7.41-49/24"
ifconfig_carp0_ipv6="inet6 2607:f238:3::1:41/64"
ifconfig_carp0_alias0="inet6 2607:f238:3::1:40/64"
ifconfig_carp0_alias1="inet6 2607:f238:3::1:42/64"
ifconfig_carp0_alias2="inet6 2607:f238:3::1:43/64"
ifconfig_carp0_alias3="inet6 2607:f238:3::1:44/64"
ifconfig_carp0_alias4="inet6 2607:f238:3::1:45/64"
ifconfig_carp0_alias5="inet6 2607:f238:3::1:46/64"
ifconfig_carp0_alias6="inet6 2607:f238:3::1:47/64"
ifconfig_carp0_alias7="inet6 2607:f238:3::1:48/64"
ifconfig_carp0_alias8="inet6 2607:f238:3::1:49/64"

ifconfig_carp1="vhid 50 advskew 100 pass hsjrthvruwybwt 207.171.7.50"
ipv4_addrs_carp1="207.171.7.51-59/24"
ifconfig_carp1_ipv6="inet6 2607:f238:3::1:51/64"
ifconfig_carp1_alias0="inet6 2607:f238:3::1:50/64"
ifconfig_carp1_alias1="inet6 2607:f238:3::1:52/64"
ifconfig_carp1_alias2="inet6 2607:f238:3::1:53/64"
ifconfig_carp1_alias3="inet6 2607:f238:3::1:54/64"
ifconfig_carp1_alias4="inet6 2607:f238:3::1:55/64"
ifconfig_carp1_alias5="inet6 2607:f238:3::1:56/64"
ifconfig_carp1_alias6="inet6 2607:f238:3::1:57/64"
ifconfig_carp1_alias7="inet6 2607:f238:3::1:58/64"
ifconfig_carp1_alias8="inet6 2607:f238:3::1:59/64"

ifconfig_vr1="down"

defaultrouter="207.171.7.1"
ipv6_defaultrouter="2607:F238:3::1"

ifconfig_lo0_alias0="inet 127.0.0.2"
ifconfig_lo0_alias1="inet 127.0.0.3"

cloned_interfaces="carp0 carp1"

static_routes="${static_routes} vpn"
route_vpn="-net 10.0.0.0/16 10.0.100.1"

pf_enable="NO"
pflog_enable="NO"

haproxy_enable="YES"
haproxy_config="/etc/haproxy.conf"

svscan_enable="NO"
svscan_servicedir="/etc/svscan"
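
For reference, a minimal sketch of the election the advskew values above are meant to drive. It assumes the usual CARP rule that a host advertises every advbase + advskew/256 seconds and that the host advertising more frequently becomes master. The struct and function names are hypothetical, advbase is assumed to be the default of 1 second, and this is not the kernel's code.

```c
/* Hypothetical per-host CARP settings for a single vhid. */
struct carp_cfg {
	const char *host;
	int advbase;	/* seconds; assumed to be the default of 1 */
	int advskew;	/* skew in 1/256 second units */
};

/* Advertisement interval in 1/256 s units: advbase + advskew/256 seconds. */
static int
carp_interval_256(const struct carp_cfg *c)
{
	return (c->advbase * 256 + c->advskew);
}

/* The host with the shorter interval is the one expected to be MASTER. */
static const struct carp_cfg *
carp_expected_master(const struct carp_cfg *a, const struct carp_cfg *b)
{
	return (carp_interval_256(a) <= carp_interval_256(b) ? a : b);
}
```

With the values above, vhid 40 (advskew 50 vs 150) should settle on proxy1 and vhid 50 (advskew 250 vs 100) on proxy2, which is the intended layout rather than what is being observed.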




Block ACK in Ralink RT2860

2013-01-24 Thread Ramanujan Seshadri
Hi all,
I am trying to read the contents of block ACKs in the Ralink RT2860 driver.
Could you please tell me which function I should be looking at?

Thanks
ram


Re: Cas driver fails to load first time after boot.

2013-01-24 Thread Marius Strobl
On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
> Hi,
> 
> I've got a Dell R200 which I'm trying to build into a gateway with a Sun 
> QGE (501-6738-10).  The cas driver fails to load the first time I try to 
> load it but succeeds the second time.  Is this a problem with the card, 
> the driver, my karma?

Wrong phase of the moon, apparently :)
The MII setup of these chips is a bit tricky and I'm not sure whether
I've hit all code paths during development of the driver. I certainly
didn't test with a 501-6738, these have been reported as working before,
though. It also doesn't make much sense that attaching the devices
succeeds on the second attempt. Could you please use a if_cas.ko built
with the attached patch and report the debug output for one of the
interfaces in both the working and the non-working case?

Marius

Index: if_cas.c
===
--- if_cas.c	(revision 245046)
+++ if_cas.c	(working copy)
@@ -332,6 +332,8 @@ cas_attach(struct cas_softc *sc)
 		 */
 		error = ENXIO;
 		v = CAS_READ_4(sc, CAS_MIF_CONF);
+device_printf(sc->sc_dev, "MIF=0x%x PCFG=0x%x\n", v,
+CAS_READ_4(sc, CAS_SATURN_PCFG));
 		if ((v & CAS_MIF_CONF_MDI1) != 0) {
 			v |= CAS_MIF_CONF_PHY_SELECT;
 			CAS_WRITE_4(sc, CAS_MIF_CONF, v);
@@ -347,6 +349,8 @@ cas_attach(struct cas_softc *sc)
 			error = mii_attach(sc->sc_dev, &sc->sc_miibus, ifp,
 			cas_mediachange, cas_mediastatus, BMSR_DEFCAPMASK,
 			MII_PHY_ANY, MII_OFFSET_ANY, MIIF_DOPAUSE);
+if (error == 0)
+device_printf(sc->sc_dev, "external PHY\n");
 		}
 		/*
 		 * Fall back on an internal PHY if no external PHY was found.
@@ -367,6 +371,8 @@ cas_attach(struct cas_softc *sc)
 			error = mii_attach(sc->sc_dev, &sc->sc_miibus, ifp,
 			cas_mediachange, cas_mediastatus, BMSR_DEFCAPMASK,
 			MII_PHY_ANY, MII_OFFSET_ANY, MIIF_DOPAUSE);
+if (error == 0)
+device_printf(sc->sc_dev, "internal PHY\n");
 		}
 	} else {
 		/*

Re: Some questions about the new TCP congestion control code

2013-01-24 Thread Andre Oppermann

On 24.01.2013 14:28, Lawrence Stewart wrote:
> On 01/16/13 06:27, John Baldwin wrote:
>> One other thing I noticed during this, which may or may not be odd, is that
>> if you have a connection with TCP_NODELAY enabled and you fill your cwnd and
>> then you get an ACK back for an earlier small segment (less than MSS), TCP
>> will not send out a "short" segment for the amount of window space released.
>> Instead, it will wait until a full MSS of space is available before sending
>> a packet.  I'm not sure if that is the correct behavior with TCP_NODELAY or
>> if we should send "short" segments in that case.
>
> We try fairly hard not to send runt segments irrespective of NODELAY,
> but I would be happy to see that change. I'm not aware of any "correct
> behaviour" we have to adhere to - I think it would be perfectly
> reasonable to have a sysctl set the lowest number of bytes we'd be
> willing to send a runt segment for and then key off TCP_NODELAY as to
> whether we try hard to send an MSS worth or send as soon as we have the
> min number of bytes worth of window available.


This is classic silly window syndrome prevention applied to the CWND.  Sending
a small segment when the window opens just a bit isn't going to help much and
mostly clogs the network.

This is actually a side effect of ABC (appropriate byte counting) where not
the ACK's are counted but the bytes ACK'ed.  Disabling ABC will solve this
problem.
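
To make the trade-off concrete, here is a minimal sketch of the send decision under discussion: the current "wait for a full MSS of window" behaviour, plus the minimum-runt threshold keyed off TCP_NODELAY that Lawrence suggests. The function name and the min_runt parameter are hypothetical and not taken from tcp_output.c; it assumes more than one MSS of data is already queued.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * space:    bytes of congestion window just released by an ACK
 * mss:      maximum segment size
 * nodelay:  whether TCP_NODELAY is set on the socket
 * min_runt: hypothetical tunable, smallest runt we would send (0 = never)
 */
static bool
send_on_window_open(uint32_t space, uint32_t mss, bool nodelay,
    uint32_t min_runt)
{
	if (space >= mss)
		return (true);		/* a full segment fits */
	if (nodelay && min_runt > 0 && space >= min_runt)
		return (true);		/* short segment allowed */
	return (false);			/* wait for more window */
}
```

With min_runt set to 0 this reduces to the silly-window-avoidance behaviour described above, so the threshold only changes things for sockets that opt in via TCP_NODELAY.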

--
Andre



how to completely makes an interface down?

2013-01-24 Thread h bagade
Hi all,

I'm looking for a method or configuration that turns the LED off when I
bring the interface down. Currently the LED stays on when I shut down
the interface! Is there any way to do this?


Re: Some questions about the new TCP congestion control code

2013-01-24 Thread Lawrence Stewart
On 01/16/13 06:27, John Baldwin wrote:
> On Tuesday, January 15, 2013 3:29:51 am Lawrence Stewart wrote:
>> Hi John,
>>
>> On 01/15/13 08:04, John Baldwin wrote:
>>> I was looking at TCP congestion control at work recently and noticed a few 
>>
>> Poor you ;)
>>
>>> "odd" things in the current code.  First, there is this chunk of code in 
>>> cc_ack_received() in tcp_input.c:
>>>
>>> static void inline
>>> cc_ack_received(struct tcpcb *tp, struct tcphdr *th, uint16_t type)
>>> {
>>>         INP_WLOCK_ASSERT(tp->t_inpcb);
>>>
>>>         tp->ccv->bytes_this_ack = BYTES_THIS_ACK(tp, th);
>>>         if (tp->snd_cwnd == min(tp->snd_cwnd, tp->snd_wnd))
>>>                 tp->ccv->flags |= CCF_CWND_LIMITED;
>>>         else
>>>                 tp->ccv->flags &= ~CCF_CWND_LIMITED;
>>>
>>>
>>> Due to hysterical raisins, snd_cwnd and snd_wnd are u_long values, not 
>>> integers, so the call to min() results in truncation on 64-bit hosts.
>>
>> Good catch, but I don't think it matters in practice as neither snd_cwnd
>> or snd_wnd will grow past the 32-bit boundary.
> 
> I have a psychotic case using cc_cubic where it seems to grow without bound,
> though that is a bug in and of itself (and this change did not fix that
> issue).  I ended up not using cc_cubic (more below) and haven't been able
> to track down the root cause of the delay.  I can probably provide a test case
> to reproduce this if you are interested.

Hmm, I'd certainly be interested in hearing more about this issue with
cubic. If you think a test case is easy to come up with, please shoot it
through to me when you have the chance.

>>> It should probably be ulmin() instead.  However, this line seems to be a
>>> really obfuscated way to just write:
>>>
>>> if (tp->snd_cwnd <= tp->snd_wnd)
>>
>> You are correct, though I'd argue the meaning of the existing code as
>> written is clearer compared to your suggested change.
>>
>>> If that is correct, I would vote for changing this to use the much simpler 
>>> logic.
>>
>> Agreed. While I find the existing code slightly clearer in meaning, it's
>> not significant enough to warrant keeping it as is when your suggested
>> change is simpler, fixes a bug and achieves the same thing. Happy for
>> you to change it or I can do it if you prefer.
> 
> I'll leave that to you, thanks.

Committed as r245783.
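
For anyone following along, a small standalone sketch of why the old comparison can give the wrong answer once either value exceeds 32 bits. It assumes a min() that takes 32-bit unsigned arguments, which is what the truncation described above implies; min32() and both helper names are stand-ins for illustration, not the kernel functions.

```c
#include <stdint.h>

/* Stand-in for a min() whose arguments are 32-bit unsigned. */
static uint32_t
min32(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

/* Old form: the 64-bit values are truncated on the way into min32(). */
static int
cwnd_limited_old(uint64_t cwnd, uint64_t wnd)
{
	return (cwnd == min32((uint32_t)cwnd, (uint32_t)wnd));
}

/* Simplified form: a plain comparison with no narrowing. */
static int
cwnd_limited_new(uint64_t cwnd, uint64_t wnd)
{
	return (cwnd <= wnd);
}
```

For ordinary window sizes the two agree. With cwnd = 5 and wnd = 2^32 + 1, the old form truncates wnd to 1 and reports "not limited" even though cwnd <= wnd.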

>>> Secondly, in the particular case I was investigating at work (restart of an
>>> idle connection), the newreno congestion control code in 8.x and later uses a
>>> different algorithm than in 7.  Specifically, in 7 TCP would reuse the same
>>> logic used for an initial cwnd (honoring ss_fltsz).  In 8 this no longer
>>> happens (instead, 2 is hardcoded).  A guess at a possible fix might look
>>> something like this:
>>>
>>> Index: cc_newreno.c
>>> ===
>>> --- cc_newreno.c(revision 243660)
>>> +++ cc_newreno.c(working copy)
>>> @@ -169,8 +169,21 @@ newreno_after_idle(struct cc_var *ccv)
>>> if (V_tcp_do_rfc3390)
>>> rw = min(4 * CCV(ccv, t_maxseg),
>>> max(2 * CCV(ccv, t_maxseg), 4380));
>>> +#if 1
>>> else
>>> rw = CCV(ccv, t_maxseg) * 2;
>>> +#else
>>> +   /* XXX: This is missing a lot of stuff that used to be in 7. */
>>> +#ifdef INET6
>>> +   else if ((isipv6 ? in6_localaddr(&CCV(ccv, t_inpcb->in6p_faddr)) :
>>> +   in_localaddr(CCV(ccv, t_inpcb->inp_faddr
>>> +#else
>>> +   else if (in_localaddr(CCV(ccv, t_inpcb->inp_faddr)))
>>> +#endif
>>> +   rw = V_ss_fltsz_local * CCV(ccv, t_maxseg);
>>> +   else
>>> +   rw = V_ss_fltsz * CCV(ccv, t_maxseg);
>>> +#endif
>>>  
>>> CCV(ccv, snd_cwnd) = min(rw, CCV(ccv, snd_cwnd));
>>>  }
>>>
>>> (But using the #else clause instead of the current #if 1 code).  Was this 
>>> change in 8 intentional?
>>
>> It was. Unlike connection initialisation which still honours ss_fltsz in
>> cc_conn_init(), restarting an idle connection based on ss_fltsz seemed
>> particularly dubious and as such was omitted from the refactored code.
>>
>> The ultimate goal was to remove the ss_fltsz hack completely and
>> implement a smarter mechanism, but that hasn't quite happened yet. The
>> removal of ss_fltsz from 10.x without providing a replacement mechanism
>> is not ideal and should probably be addressed.
>>
>> I'm guessing you're not using rfc3390 because you want to override the
>> initial window based on specific local knowledge of the path between
>> sender and receiver?
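
As a quick illustration of the numbers involved, here is a standalone sketch of the restart window computed by the "#if 1" branch of the diff quoted above. The function names are hypothetical and the CCV()/V_ macros are replaced by plain parameters, so this is a model of the arithmetic only, not the kernel code.

```c
#include <stdint.h>

static uint32_t
u32_min(uint32_t a, uint32_t b)
{
	return (a < b ? a : b);
}

static uint32_t
u32_max(uint32_t a, uint32_t b)
{
	return (a > b ? a : b);
}

/* Restart window: RFC 3390 formula if enabled, otherwise 2 segments. */
static uint32_t
restart_window(uint32_t maxseg, int do_rfc3390)
{
	if (do_rfc3390)
		return (u32_min(4 * maxseg, u32_max(2 * maxseg, 4380)));
	return (2 * maxseg);
}

/* After idle, cwnd is clamped down to the restart window, never raised. */
static uint32_t
cwnd_after_idle(uint32_t cwnd, uint32_t maxseg, int do_rfc3390)
{
	return (u32_min(restart_window(maxseg, do_rfc3390), cwnd));
}
```

With a 1460-byte MSS this gives 4380 bytes with RFC 3390 enabled and 2920 bytes without, regardless of how large cwnd had grown before the idle period.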
> 
> Correct, in 7.x we had cranked ss_fltsz up to a really high number to prevent
> the congestion window from collapsing when the connection was idle.  We have
> a bit of a unique workload in that we are using TCP to reliably forward a
> latency-sensitive datagram stream across a WAN connection with high bandwidth
> and high RTT.  Most of congestion control seems tuned to bulk transfers rather
> than this sort of use case.  The solution we have settled on here is to add a
> 

Re: [PATCH] Don't imply TCP and UDP socket options are bitmasks

2013-01-24 Thread Lawrence Stewart
On 01/23/13 07:28, John Baldwin wrote:
> On Tuesday, January 22, 2013 3:57:23 am Lawrence Stewart wrote:
>> On 01/16/13 06:16, John Baldwin wrote:
>>> On Tuesday, January 15, 2013 3:49:33 am Lawrence Stewart wrote:
>>>> On 01/15/13 07:50, John Baldwin wrote:
>>>>> The constants used for TCP and UDP socket options (TCP_NODELAY, etc.) are
>>>>> currently defined as hex values that are individual bits.  However, socket
>>>>> options are never masked together, they are used as a simple enumeration of
>>>>> discrete values.  Using a bitmask forces us to run out of bits and makes it
>>>>> harder for vendors to try to use a high range of values for local custom
>>>>> options (hoping that they never conflict with a new option value added in
>>>>> stock FreeBSD).
>>>>
>>>> Yup. Should we be explicitly #defining the boundary between "bits
>>>> reserved for FreeBSD" and "bits for private vendor use"?
>>>
>>> Oh, we could if you wanted.  I'm using 0x1000 locally for both TCP and UDP,
>>> but those are completely arbitrary values.  Saner ones might be 0x800 if
>>> we want to do that explicitly.  We could perhaps just say that is true for all
>>> socket option levels (that is, just define one SO_VENDOR constant or some such
>>> but say it applies to all levels)?
>>
>> A single SO_VENDOR applied to all levels sounds good to me.
> 
> Ok, how about this for wording:
> 
> Index: sys/socket.h
> ===
> --- socket.h  (revision 245742)
> +++ socket.h  (working copy)
> @@ -143,6 +143,15 @@ typedef  __uid_t uid_t;
>  #endif
>  
>  /*
> + * Space reserved for new socket options added by third-party vendors.
> + * This range applies to all socket option levels.  New socket options
> + * in FreeBSD should always use an option value less than SO_VENDOR.
> + */
> +#if __BSD_VISIBLE
> +#define  SO_VENDOR   0x8000
> +#endif
> +
> +/*
>   * Structure used for manipulating linger option.
>   */
>  struct linger {

Two thumbs up from me.

We might also want to

#define TCP_VENDOR SO_VENDOR /* FreeBSD TCP socket options must be
numerically less than this. */

and so on in each file that defines option levels to provide some hint
to people that SO_VENDOR exists? Maybe we don't need the define and just
need to put the one line comment at the end of each set of options in
each file where a particular level's options are specified.
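
A rough sketch of how a vendor might use such a boundary, treating option names as a plain enumeration rather than bits. Everything prefixed EXAMPLE_ is hypothetical, and the boundary value is simply copied from the patch text as quoted above; a real build would take SO_VENDOR from the system header, whatever value it ends up with.

```c
#include <stdbool.h>

/* Boundary value as it appears in the quoted patch; placeholder only. */
#define EXAMPLE_SO_VENDOR	0x8000

/* Hypothetical vendor-private options, numbered upward from the boundary. */
#define EXAMPLE_TCP_VENDOR_FOO	(EXAMPLE_SO_VENDOR + 1)
#define EXAMPLE_TCP_VENDOR_BAR	(EXAMPLE_SO_VENDOR + 2)

/* True if optname falls in the range reserved for third-party vendors. */
static bool
is_vendor_option(int optname)
{
	return ((unsigned int)optname >= EXAMPLE_SO_VENDOR);
}
```

Stock options stay below the boundary and vendor options sit at or above it, so a new stock option can never collide with a local one as long as both sides respect the split.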

Cheers,
Lawrence


Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-24 Thread Andre Oppermann

On 24.01.2013 03:31, Sepherosa Ziehau wrote:
> On Thu, Jan 24, 2013 at 12:15 AM, John Baldwin wrote:
>> On Wednesday, January 23, 2013 1:33:27 am Sepherosa Ziehau wrote:
>>> On Wed, Jan 23, 2013 at 4:11 AM, John Baldwin wrote:
>>>> As I mentioned in an earlier thread, I recently had to debug an issue we were
>>>> seeing across a link with a high bandwidth-delay product (both high bandwidth
>>>> and high RTT).  Our specific use case was to use a TCP connection to reliably
>>>> forward a latency-sensitive datagram stream across a WAN connection.  We would
>>>> often see spikes in the latency of individual datagrams.  I eventually tracked
>>>> this down to the connection entering slow start when it would transmit data
>>>> after being idle.  The data stream was quite bursty and would often attempt to
>>>> transmit a burst of data after being idle for far longer than a retransmit
>>>> timeout.
>>>>
>>>> In 7.x we had worked around this in the past by disabling RFC 3390 and jacking
>>>> the slow start window size up via a sysctl.  On 8.x this no longer worked.
>>>> The solution I came up with was to add a new socket option to disable idle
>>>> handling completely.  That is, when an idle connection restarts with this new
>>>> option enabled, it keeps its current congestion window and doesn't enter slow
>>>> start.
>>>>
>>>> There are only a few cases where such an option is useful, but if anyone else
>>>> thinks this might be useful I'd be happy to add the option to FreeBSD.
>>>
>>> I think what you need is the RFC2861, however, you probably should
>>> ignore the "application-limited period" part of RFC2861.
>>
>> Hummm.  It appears btw, that Linux uses RFC 2861, but has a global knob to
>> disable it due to applications having problems.  When it is disabled,
>> it doesn't decay the congestion window at all during idle handling.  That is,
>> it appears to act the same as if TCP_IGNOREIDLE were enabled.
>>
>> From http://www.kernel.org/doc/man-pages/online/pages/man7/tcp.7.html:
>>
>> tcp_slow_start_after_idle (Boolean; default: enabled; since Linux 2.6.18)
>>    If enabled, provide RFC 2861 behavior and time out the congestion
>>    window after an idle period.  An idle period is defined as the current
>>    RTO (retransmission timeout).  If disabled, the congestion window will
>>    not be timed out after an idle period.
>>
>> Also, in this thread on tcp-m it appears no one on that list realizes that
>> there are any implementations which follow the "SHOULD" in RFC 2581 for idle
>> handling (which is what we do currently):
>
> Nah, I don't think the idle detection in FreeBSD follows the
> RFC2581/RFC5681 4.1 (the paragraph before the "SHOULD").  IMHO, that's
> probably why the author of the following email asked again about the
> implementation of the "SHOULD" in RFC2581/RFC5681.
>
>> http://www.ietf.org/mail-archive/web/tcpm/current/msg02864.html
>>
>> So if we were to implement RFC 2861, the new socket option would be equivalent
>> to setting Linux's 'tcp_slow_start_after_idle' to false, but on a per-socket
>> basis rather than globally.
>
> Agreed, a per-socket option could be more useful than global sysctls in
> certain situations.  However, in addition to the per-socket option,
> could global sysctl nodes to disable idle_restart/idle_cwv help too?

No.  This is far too dangerous once it makes it into some tuning guide.
The threat of congestion breakdown is real.  The Internet, or any packet
network, can only survive in the long term if almost all follow the rules
and self-constrain to remain fair to the others.  What would happen if
nobody would respect the traffic lights anymore?

Besides that bursting into unknown network conditions is very likely to
result in burst losses as well.  TCP isn't good at recovering from it.
In the end you most likely come out ahead if you decay the restartCWND.

We have two cases primarily: a) long distance, medium to high RTT, and
wildly varying bandwidth (a.k.a. the Internet); b) short distance, low
RTT and mostly plenty of bandwidth (a.k.a. Datacenter).  The former
absolutely, definitely requires a decayed restartCWND.  The latter less
so, but even there bursting at 10Gig TSO-assisted wirespeed isn't going
to end happily more often than not.

Since this seems to be a burning issue I'll come up with a patch in the
next days to add a decaying restartCWND that'll be fair and allow a very
quick ramp up if no loss occurs.
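
To illustrate what "decaying" could mean here, a minimal sketch that halves cwnd once per idle RTO and never goes below the restart window, loosely in the spirit of RFC 2861. This is an assumption made for illustration, not the patch referred to above; the function name, the millisecond units and the halving rule are all hypothetical.

```c
#include <stdint.h>

/*
 * cwnd:        congestion window before the idle period (bytes)
 * restart_wnd: floor for the decayed window (bytes)
 * idle_ms:     how long the connection has been idle
 * rto_ms:      current retransmission timeout
 */
static uint32_t
decay_cwnd_after_idle(uint32_t cwnd, uint32_t restart_wnd,
    uint32_t idle_ms, uint32_t rto_ms)
{
	uint32_t periods;

	if (rto_ms == 0 || idle_ms < rto_ms)
		return (cwnd);		/* not idle for a full RTO */
	if (cwnd <= restart_wnd)
		return (cwnd);		/* decay never raises cwnd */

	/* Halve once per elapsed RTO until the floor is reached. */
	periods = idle_ms / rto_ms;
	while (periods-- > 0 && cwnd > restart_wnd)
		cwnd /= 2;

	return (cwnd < restart_wnd ? restart_wnd : cwnd);
}
```

A short pause costs only one halving, while a long idle period converges on the restart window, which is roughly the middle ground between keeping cwnd untouched and collapsing it immediately.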

--
Andre
