e1000 serdes link flap

2013-01-22 Thread Neel Natu
Hi,

I am running into a problem in head with the e1000 link state
detection logic attached to an 82571EB serdes controller.

The symptom is that the link state keeps flapping between "up" and "down".

After I enabled the debug output in
'e1000_check_for_serdes_link_82571()' this is what I see:

e1000_check_for_serdes_link_82571
ctrl = 0x4c0241, status = 0x803a7, rxcw = 0x4400
FORCED_UP -> AN_PROG
em6: link state changed to DOWN
e1000_check_for_serdes_link_82571
ctrl = 0x4c0201, status = 0x803a4, rxcw = 0x4400
AN_PROG   -> FORCED_UP
em6: link state changed to UP
e1000_check_for_serdes_link_82571
ctrl = 0x4c0241, status = 0x803a7, rxcw = 0x4400
FORCED_UP -> AN_PROG
em6: link state changed to DOWN


The problem goes away if I apply the following patch to bring the link
state detection logic in line with the e1000e driver in Linux:

Index: e1000_82571.c
===
--- e1000_82571.c   (revision 245766)
+++ e1000_82571.c   (working copy)
@@ -1712,10 +1712,8 @@
 			 * auto-negotiation in the TXCW register and disable
 			 * forced link in the Device Control register in an
 			 * attempt to auto-negotiate with our link partner.
-			 * If the partner code word is null, stop forcing
-			 * and restart auto negotiation.
 			 */
-			if ((rxcw & E1000_RXCW_C) || !(rxcw & E1000_RXCW_CW))  {
+			if ((rxcw & E1000_RXCW_C) != 0) {
 				/* Enable autoneg, and unforce link up */
 				E1000_WRITE_REG(hw, E1000_TXCW, mac->txcw);
 				E1000_WRITE_REG(hw, E1000_CTRL,

I am not sure why the !(rxcw & E1000_RXCW_CW) check was added and the
e1000 SDM does not have any more information.
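
To make the difference between the two conditions concrete, here is a
minimal C sketch of both predicates. The bit values for E1000_RXCW_CW and
E1000_RXCW_C are assumptions recalled from e1000_defines.h and should be
checked against the tree; the helper names are hypothetical and are not
part of the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed register bits; verify against e1000_defines.h before relying on them. */
#define E1000_RXCW_CW 0x0000ffffu /* received config word (mask) */
#define E1000_RXCW_C  0x20000000u /* /C/ ordered sets are being received */

/* Hypothetical helper: condition before the patch.
 * Restart auto-negotiation on /C/ or when the partner code word is null. */
static bool
restart_autoneg_old(uint32_t rxcw)
{
    return (rxcw & E1000_RXCW_C) != 0 || (rxcw & E1000_RXCW_CW) == 0;
}

/* Hypothetical helper: condition after the patch.
 * Restart auto-negotiation only when /C/ ordered sets are seen. */
static bool
restart_autoneg_new(uint32_t rxcw)
{
    return (rxcw & E1000_RXCW_C) != 0;
}
```

Under these assumed definitions, a null code word without /C/ (rxcw = 0)
restarts auto-negotiation only under the old condition, while the
rxcw = 0x4400 value from the debug output satisfies neither predicate.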

Jack, can you take a look at the patch and commit if it looks alright?

best
Neel
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: kern/173475: [tun] tun(4) stays opened by PID after process is terminated

2013-01-22 Thread Emanuel Haupt
The following reply was made to PR kern/173475; it has been noted by GNATS.

From: Emanuel Haupt 
To: bug-follo...@freebsd.org, iz-freebsd0...@hs-karlsruhe.de
Cc:  
Subject: Re: kern/173475: [tun] tun(4) stays opened by PID after process is
 terminated
Date: Wed, 23 Jan 2013 07:59:07 +0100

 Could you please try the following vpnc patch? It tries to work around
 this deadlock situation:
 
 http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/175067
 
 However, the underlying problem with if_tun should be looked at
 separately in this PR.
 
 Emanuel


Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-22 Thread Sepherosa Ziehau
On Wed, Jan 23, 2013 at 4:11 AM, John Baldwin  wrote:
> As I mentioned in an earlier thread, I recently had to debug an issue we were
> seeing across a link with a high bandwidth-delay product (both high bandwidth
> and high RTT).  Our specific use case was to use a TCP connection to reliably
> forward a latency-sensitive datagram stream across a WAN connection.  We would
> often see spikes in the latency of individual datagrams.  I eventually tracked
> this down to the connection entering slow start when it would transmit data
> after being idle.  The data stream was quite bursty and would often attempt to
> transmit a burst of data after being idle for far longer than a retransmit
> timeout.
>
> In 7.x we had worked around this in the past by disabling RFC 3390 and jacking
> the slow start window size up via a sysctl.  On 8.x this no longer worked.
> The solution I came up with was to add a new socket option to disable idle
> handling completely.  That is, when an idle connection restarts with this new
> option enabled, it keeps its current congestion window and doesn't enter slow
> start.
>
> There are only a few cases where such an option is useful, but if anyone else
> thinks this might be useful I'd be happy to add the option to FreeBSD.

I think what you need is RFC 2861; however, you should probably
ignore its "application-limited period" part.
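
For readers unfamiliar with it, the idle-period part of RFC 2861 can be
sketched roughly as follows. This is a simplified illustration written
from the RFC's description, not FreeBSD code: the struct, the function
name, and the tick-based units are all hypothetical, and the
application-limited handling is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-connection state, in bytes. */
struct cwv_state {
    uint32_t cwnd;      /* congestion window */
    uint32_t ssthresh;  /* slow start threshold */
};

/*
 * Decay the congestion window after an idle period instead of
 * collapsing it at once: halve cwnd for each RTO that elapsed while
 * idle, and remember 3/4 of the old window in ssthresh.
 */
static void
cwv_after_idle(struct cwv_state *s, uint32_t idle_ticks, uint32_t rto_ticks,
    uint32_t rcv_wnd, uint32_t mss)
{
    uint32_t periods, win, keep;

    assert(rto_ticks > 0 && mss > 0);
    if (idle_ticks < rto_ticks)
        return;     /* not idle for a full RTO, nothing to do */

    /* ssthresh = max(ssthresh, 3 * cwnd / 4), computed without overflow */
    keep = (uint32_t)(((uint64_t)s->cwnd * 3) / 4);
    if (s->ssthresh < keep)
        s->ssthresh = keep;

    /* Halve once per elapsed RTO, never going below one MSS. */
    for (periods = idle_ticks / rto_ticks;
        periods > 0 && s->cwnd > mss; periods--) {
        win = s->cwnd < rcv_wnd ? s->cwnd : rcv_wnd;
        s->cwnd = win / 2 > mss ? win / 2 : mss;
    }
}
```

With a 64000-byte window and three RTOs of idle time, the sketch leaves
cwnd at 8000 bytes and raises ssthresh to 48000, so the connection can
slow-start back towards its previous rate.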

Best Regards,
sephe

>
> Index: share/man/man4/tcp.4
> ===
> --- share/man/man4/tcp.4(revision 245742)
> +++ share/man/man4/tcp.4(working copy)
> @@ -205,6 +205,18 @@
>  in the
>  .Sx MIB Variables
>  section further down.
> +.It Dv TCP_IGNOREIDLE
> +If a TCP connection is idle for more than one retransmit timeout,
> +it enters slow start when new data is available to transmit.
> +This avoids flooding the network with a full window of traffic at line rate.
> +It also allows the connection to adjust to changes to network conditions
> +that occurred while the connection was idle.  A connection that sends
> +bursts of data separated by large idle periods can be permanently stuck in
> +slow start as a result.
> +The boolean option
> +.Dv TCP_IGNOREIDLE
> +disables the idle connection handling allowing connections to maintain the
> +existing congestion window when restarting after an idle period.
>  .It Dv TCP_NODELAY
>  Under most circumstances,
>  .Tn TCP
> Index: sys/netinet/tcp_var.h
> ===
> --- sys/netinet/tcp_var.h   (revision 245742)
> +++ sys/netinet/tcp_var.h   (working copy)
> @@ -230,6 +230,7 @@
>  #define TF_NEEDFIN      0x000800        /* send FIN (implicit state) */
>  #define TF_NOPUSH       0x001000        /* don't push */
>  #define TF_PREVVALID    0x002000        /* saved values for bad rxmit valid */
> +#define TF_IGNOREIDLE   0x004000        /* connection is never idle */
>  #define TF_MORETOCOME   0x01    /* More data to be appended to sock */
>  #define TF_LQ_OVERFLOW  0x02    /* listen queue overflow */
>  #define TF_LASTIDLE     0x04    /* connection was previously idle */
> Index: sys/netinet/tcp_output.c
> ===
> --- sys/netinet/tcp_output.c(revision 245742)
> +++ sys/netinet/tcp_output.c(working copy)
> @@ -206,7 +206,8 @@
>  * to send, then transmit; otherwise, investigate further.
>  */
> idle = (tp->t_flags & TF_LASTIDLE) || (tp->snd_max == tp->snd_una);
> -   if (idle && ticks - tp->t_rcvtime >= tp->t_rxtcur)
> +   if (!(tp->t_flags & TF_IGNOREIDLE) &&
> +   idle && ticks - tp->t_rcvtime >= tp->t_rxtcur)
> cc_after_idle(tp);
> tp->t_flags &= ~TF_LASTIDLE;
> if (idle) {
> Index: sys/netinet/tcp.h
> ===
> --- sys/netinet/tcp.h   (revision 245823)
> +++ sys/netinet/tcp.h   (working copy)
> @@ -156,6 +156,7 @@
>  #define TCP_NODELAY     1       /* don't delay send to coalesce packets */
>  #if __BSD_VISIBLE
>  #define TCP_MAXSEG      2       /* set maximum segment size */
> +#define TCP_IGNOREIDLE  3       /* disable idle connection handling */
>  #define TCP_NOPUSH 4   /* don't push last block of write */
>  #define TCP_NOOPT  8   /* don't use TCP options */
>  #define TCP_MD5SIG 16  /* use MD5 digests (RFC2385) */
> Index: sys/netinet/tcp_usrreq.c
> ===
> --- sys/netinet/tcp_usrreq.c(revision 245742)
> +++ sys/netinet/tcp_usrreq.c(working copy)
> @@ -1354,6 +1354,7 @@
>
> case TCP_NODELAY:
> case TCP_NOOPT:
> +   case TCP_IGNOREIDLE:
> INP_WUNLOCK(inp);
> error = sooptcopyin(s

[lu...@freebsd.org: svn commit: r245836 - head/sys/dev/netmap]

2013-01-22 Thread Luigi Rizzo
this new netmap feature might be of interest

cheers
luigi

- Forwarded message from Luigi Rizzo  -

Date: Wed, 23 Jan 2013 05:37:46 +0000 (UTC)
From: Luigi Rizzo 
Subject: svn commit: r245836 - head/sys/dev/netmap
To: src-committ...@freebsd.org, svn-src-...@freebsd.org,
svn-src-h...@freebsd.org

Author: luigi
Date: Wed Jan 23 05:37:45 2013
New Revision: 245836
URL: http://svnweb.freebsd.org/changeset/base/245836

Log:
  Add support for transparent mode while in netmap.
  
  By setting dev.netmap.fwd=1 (or enabling the feature with a per-ring flag),
  packets are forwarded between the NIC and the host stack unless the
  netmap client clears the NS_FORWARD flag on the individual descriptors.
  
  This feature greatly simplifies applications where some traffic
  (think of ARP, control traffic, ssh sessions...) must be processed
  by the host stack, whereas the bulk is handled by the netmap process
  which simply (un)marks packets that should not be forwarded.
  The default is chosen so that now a netmap receiver operates
  in a mode very similar to bpf.
  
  Of course there is no free lunch: traffic to/from the host stack
  still operates at OS speed (or less, as there is one extra copy in
  one direction).
  HOWEVER, since traffic goes to the user process before being
  reinjected, and reinjection occurs in a user context, you get some
  form of livelock protection for free.
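
  As an illustration of the client side described in this log, the
  following C sketch walks a batch of received slots and clears
  NS_FORWARD on everything except ARP frames, so that only ARP would be
  passed on to the other side. It is a mock-up under stated assumptions:
  struct slot is a hypothetical stand-in for the real netmap slot, the
  NS_FORWARD value is assumed rather than taken from net/netmap.h, and
  the ARP-only policy is just an example.

```c
#include <stddef.h>
#include <stdint.h>

#define NS_FORWARD 0x0004 /* assumed value; the real one is in net/netmap.h */

/* Hypothetical stand-in for a netmap slot plus its buffer. */
struct slot {
    uint16_t len;        /* frame length in bytes */
    uint16_t flags;      /* NS_* flags */
    const uint8_t *buf;  /* start of the Ethernet frame */
};

/* Example policy: only ARP (ethertype 0x0806) should be forwarded. */
static int
wants_forwarding(const uint8_t *frame, uint16_t len)
{
    if (len < 14)
        return 0;
    return frame[12] == 0x08 && frame[13] == 0x06;
}

/*
 * Clear NS_FORWARD on slots the client keeps for itself.
 * Returns how many slots are still marked for forwarding.
 */
static size_t
filter_slots(struct slot *slots, size_t n)
{
    size_t i, forwarded = 0;

    for (i = 0; i < n; i++) {
        if (!wants_forwarding(slots[i].buf, slots[i].len))
            slots[i].flags &= (uint16_t)~NS_FORWARD;
        else if (slots[i].flags & NS_FORWARD)
            forwarded++;
    }
    return forwarded;
}
```

  In a real client the loop would run over the ring between the kernel's
  and the user's cursor before the next RXSYNC or poll(); that part
  depends on the netmap API and is not shown here.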

Modified:
  head/sys/dev/netmap/netmap.c

Modified: head/sys/dev/netmap/netmap.c
==
--- head/sys/dev/netmap/netmap.c	Wed Jan 23 03:51:47 2013	(r245835)
+++ head/sys/dev/netmap/netmap.c	Wed Jan 23 05:37:45 2013	(r245836)
@@ -120,10 +120,12 @@ SYSCTL_INT(_dev_netmap, OID_AUTO, no_pen
 
 int netmap_drop = 0;   /* debugging */
 int netmap_flags = 0;  /* debug flags */
+int netmap_fwd = 0;	/* force transparent mode */
 int netmap_copy = 0;   /* debugging, copy content */
 
 SYSCTL_INT(_dev_netmap, OID_AUTO, drop, CTLFLAG_RW, &netmap_drop, 0 , "");
 SYSCTL_INT(_dev_netmap, OID_AUTO, flags, CTLFLAG_RW, &netmap_flags, 0 , "");
+SYSCTL_INT(_dev_netmap, OID_AUTO, fwd, CTLFLAG_RW, &netmap_fwd, 0 , "");
 SYSCTL_INT(_dev_netmap, OID_AUTO, copy, CTLFLAG_RW, &netmap_copy, 0 , "");
 
 #ifdef NM_BRIDGE /* support for netmap bridge */
@@ -647,63 +649,170 @@ netmap_open(struct cdev *dev, int oflags
 
 /*
  * Handlers for synchronization of the queues from/to the host.
- *
- * netmap_sync_to_host() passes packets up. We are called from a
- * system call in user process context, and the only contention
- * can be among multiple user threads erroneously calling
- * this routine concurrently. In principle we should not even
- * need to lock.
+ * Netmap has two operating modes:
+ * - in the default mode, the rings connected to the host stack are
+ *   just another ring pair managed by userspace;
+ * - in transparent mode (XXX to be defined) incoming packets
+ *   (from the host or the NIC) are marked as NS_FORWARD upon
+ *   arrival, and the user application has a chance to reset the
+ *   flag for packets that should be dropped.
+ *   On the RXSYNC or poll(), packets in RX rings between
+ *   kring->nr_kcur and ring->cur with NS_FORWARD still set are moved
+ *   to the other side.
+ * The transfer NIC --> host is relatively easy, just encapsulate
+ * into mbufs and we are done. The host --> NIC side is slightly
+ * harder because there might not be room in the tx ring so it
+ * might take a while before releasing the buffer.
+ */
+
+/*
+ * pass a chain of buffers to the host stack as coming from 'dst'
  */
 static void
-netmap_sync_to_host(struct netmap_adapter *na)
+netmap_send_up(struct ifnet *dst, struct mbuf *head)
 {
-   struct netmap_kring *kring = &na->tx_rings[na->num_tx_rings];
-   struct netmap_ring *ring = kring->ring;
-   struct mbuf *head = NULL, *tail = NULL, *m;
-   u_int k, n, lim = kring->nkr_num_slots - 1;
+   struct mbuf *m;
 
-   k = ring->cur;
-   if (k > lim) {
-   netmap_ring_reinit(kring);
-   return;
+   /* send packets up, outside the lock */
+   while ((m = head) != NULL) {
+   head = head->m_nextpkt;
+   m->m_nextpkt = NULL;
+   if (netmap_verbose & NM_VERB_HOST)
+   D("sending up pkt %p size %d", m, MBUF_LEN(m));
+   NM_SEND_UP(dst, m);
}
-   // na->nm_lock(na->ifp, NETMAP_CORE_LOCK, 0);
+}
 
-   /* Take packets from hwcur to cur and pass them up.
+struct mbq {
+   struct mbuf *head;
+   struct mbuf *tail;
+   int count;
+};
+
+/*
+ * put a copy of the buffers marked NS_FORWARD into an mbuf chain.
+ * Run from hwcur to cur - reserved
+ */
+static void
+netmap_grab_packets(struct netmap_kring *kring, struct mbq *q, int force)
+{
+   /* Take packets from hwcur to cur-reserved and pass them up.
 * In case of no buffers we give up. At the end of 

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-22 Thread John Baldwin
On Tuesday, January 22, 2013 3:35:40 pm Alfred Perlstein wrote:
> On 1/22/13 12:11 PM, John Baldwin wrote:
> > As I mentioned in an earlier thread, I recently had to debug an issue we were
> > seeing across a link with a high bandwidth-delay product (both high bandwidth
> > and high RTT).  Our specific use case was to use a TCP connection to reliably
> > forward a latency-sensitive datagram stream across a WAN connection.  We would
> > often see spikes in the latency of individual datagrams.  I eventually tracked
> > this down to the connection entering slow start when it would transmit data
> > after being idle.  The data stream was quite bursty and would often attempt to
> > transmit a burst of data after being idle for far longer than a retransmit
> > timeout.
> >
> > In 7.x we had worked around this in the past by disabling RFC 3390 and jacking
> > the slow start window size up via a sysctl.  On 8.x this no longer worked.
> > The solution I came up with was to add a new socket option to disable idle
> > handling completely.  That is, when an idle connection restarts with this new
> > option enabled, it keeps its current congestion window and doesn't enter slow
> > start.
> >
> > There are only a few cases where such an option is useful, but if anyone else
> > thinks this might be useful I'd be happy to add the option to FreeBSD.
> 
> This looks good, but it almost sounds like a bug for TCP to be doing
> this anyhow.
> 
> Why would one want this behavior?
> 
> Wouldn't it make sense to keep the window large until there was a
> problem rather than unconditionally chop it down?  I almost think TCP is
> afraid that you might wind up swapping out a 10gig interface for a
> modem?  I'm just not getting it.  (probably simple oversight on my part).
> 
> What do you think about also making this a sysctl for global on/off by
> default?

No, I think this is the proper default and RFC 5681 makes this a SHOULD.  The
burst at line rate argument is a very good one.  Normally if you have a stream
of data your data rate is clocked by the arrival of return ACKs (once you have
filled the window), and slow start keeps you throttled at the beginning from
flooding the pipe.  However, if your connection becomes idle then you will
accumulate a large number of ACKs and be able to "spend" them all at once when
you get a burst of data to send.  This burst can then use a higher effective
bandwidth than the normal flow of traffic and could overwhelm a switch.

Also, for the cases where this is most useful (high RTT), it is not at all
unimaginable for network conditions to change dramatically.  In my use case we
have dedicated lines and control what goes across them so we don't have to
worry about that, but the general use case certainly needs to take that into
account.
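
For reference, the decision the patch changes can be modelled in a few
lines of C. This is a simplified sketch of the check in tcp_output(),
not the kernel code itself: the struct, the FL_* flag values, and the
function name are hypothetical stand-ins for the tcpcb fields and TF_*
flags used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for TF_LASTIDLE and TF_IGNOREIDLE. */
#define FL_LASTIDLE   0x1u
#define FL_IGNOREIDLE 0x2u

/* Hypothetical subset of the per-connection state. */
struct conn {
    uint32_t flags;
    uint32_t snd_max;  /* highest sequence number sent */
    uint32_t snd_una;  /* oldest unacknowledged sequence number */
    int rcvtime;       /* tick count when the last segment was received */
    int rxtcur;        /* current retransmit timeout, in ticks */
};

/*
 * Returns true when the connection would restart in slow start:
 * nothing is in flight (or it was idle on the previous call), more
 * than one retransmit timeout has passed, and the new option is off.
 */
static bool
restarts_in_slow_start(const struct conn *c, int ticks)
{
    bool idle = (c->flags & FL_LASTIDLE) != 0 || c->snd_max == c->snd_una;

    return (c->flags & FL_IGNOREIDLE) == 0 && idle &&
        ticks - c->rcvtime >= c->rxtcur;
}
```

With the option set, the function is false regardless of how long the
connection sat idle, which is exactly the line-rate burst risk described
above.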

-- 
John Baldwin


Re: [PATCH] Don't imply TCP and UDP socket options are bitmasks

2013-01-22 Thread John Baldwin
On Tuesday, January 22, 2013 3:57:23 am Lawrence Stewart wrote:
> On 01/16/13 06:16, John Baldwin wrote:
> > On Tuesday, January 15, 2013 3:49:33 am Lawrence Stewart wrote:
> >> On 01/15/13 07:50, John Baldwin wrote:
> >>> The constants used for TCP and UDP socket options (TCP_NODELAY, etc.) are
> >>> currently defined as hex values that are individual bits.  However, socket
> >>> options are never masked together, they are used as a simple enumeration of
> >>> discrete values.  Using a bitmask forces us to run out of bits and makes it
> >>> harder for vendors to try to use a high range of values for local custom
> >>> options (hoping that they never conflict with a new option value added in
> >>> stock FreeBSD).
> >>
> >> Yup. Should we be explicitly #defining the boundary between "bits
> >> reserved for FreeBSD" and "bits for private vendor use"?
> > 
> > Oh, we could if you wanted.  I'm using 0x1000 locally for both TCP and UDP,
> > but those are completely arbitrary values.  Saner ones might be 0x800 if
> > we want to do that explicitly.  We could perhaps just say that is true for all
> > socket option levels (that is, just define one SO_VENDOR constant or some such
> > but say it applies to all levels)?
> 
> A single SO_VENDOR applied to all levels sounds good to me.

Ok, how about this for wording:

Index: sys/socket.h
===
--- socket.h(revision 245742)
+++ socket.h(working copy)
@@ -143,6 +143,15 @@ typedef __uid_t uid_t;
 #endif
 
 /*
+ * Space reserved for new socket options added by third-party vendors.
+ * This range applies to all socket option levels.  New socket options
+ * in FreeBSD should always use an option value less than SO_VENDOR.
+ */
+#if __BSD_VISIBLE
+#define SO_VENDOR       0x8000
+#endif
+
+/*
  * Structure used for manipulating linger option.
  */
 struct linger {
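
A vendor could then number its private options relative to SO_VENDOR
instead of picking an arbitrary high value. The C sketch below shows the
idea; the fallback value is simply the one that appears in this archived
copy of the patch and should be checked against sys/socket.h, and
ACME_TCP_FASTPATH and is_vendor_option() are hypothetical names invented
for the example.

```c
#include <stdbool.h>

#ifndef SO_VENDOR
#define SO_VENDOR 0x8000u /* value as shown in the excerpt; verify in sys/socket.h */
#endif

/* Hypothetical vendor-private TCP option, numbered from SO_VENDOR upwards. */
#define ACME_TCP_FASTPATH (SO_VENDOR + 1u)

/* True when an option name falls in the range reserved for vendors. */
static bool
is_vendor_option(unsigned int optname)
{
    return optname >= SO_VENDOR;
}
```

Stock options such as TCP_NODELAY (1) or TCP_MD5SIG (16) fall below the
boundary, so a check like this can reject or route vendor options
without a per-option table.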


-- 
John Baldwin


Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-22 Thread Andre Oppermann

On 22.01.2013 21:35, Alfred Perlstein wrote:
> On 1/22/13 12:11 PM, John Baldwin wrote:
>> As I mentioned in an earlier thread, I recently had to debug an issue we were
>> seeing across a link with a high bandwidth-delay product (both high bandwidth
>> and high RTT).  Our specific use case was to use a TCP connection to reliably
>> forward a latency-sensitive datagram stream across a WAN connection.  We would
>> often see spikes in the latency of individual datagrams.  I eventually tracked
>> this down to the connection entering slow start when it would transmit data
>> after being idle.  The data stream was quite bursty and would often attempt to
>> transmit a burst of data after being idle for far longer than a retransmit
>> timeout.
>>
>> In 7.x we had worked around this in the past by disabling RFC 3390 and jacking
>> the slow start window size up via a sysctl.  On 8.x this no longer worked.
>> The solution I came up with was to add a new socket option to disable idle
>> handling completely.  That is, when an idle connection restarts with this new
>> option enabled, it keeps its current congestion window and doesn't enter slow
>> start.
>>
>> There are only a few cases where such an option is useful, but if anyone else
>> thinks this might be useful I'd be happy to add the option to FreeBSD.
>
> This looks good, but it almost sounds like a bug for TCP to be doing this
> anyhow.

It's not a bug.  It's by design.  It's required by the RFC.

> Why would one want this behavior?

Network conditions change all the time.  Traffic and congestion come and go.
Connections can go idle for milliseconds to minutes to hours.  Whenever "enough"
time has passed, network capacity probing has to start anew.

> Wouldn't it make sense to keep the window large until there was a problem
> rather than unconditionally chop it down?  I almost think TCP is afraid that
> you might wind up swapping out a 10gig interface for a modem?  I'm just not
> getting it.  (probably simple oversight on my part).

The very real fear is congestion meltdown.  That is the reason we ended up with
TCP's AIMD mechanism in the first place.  If everybody were to blast into the
network, everyone would suffer.  The bufferbloat issue identified recently makes
things even worse.

> What do you think about also making this a sysctl for global on/off by default?

Please don't.  The correct fix is either a) to use the initial window as the
restart window (up to 10 MSS nowadays); b) to use a decay mechanism based on the
time since the last network condition probe.  Even the latter must decay to
initCWND within at most 1 MSL.
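
Option a) amounts to clamping the congestion window to the initial
window when sending resumes. A minimal C sketch, assuming byte-based
windows; restart_cwnd() and its parameters are hypothetical names, and
the number of segments in the initial window is passed in rather than
fixed, since it depends on the stack's configuration:

```c
#include <stdint.h>

/*
 * Restart window after an idle period: never larger than the initial
 * window (iw_segments * mss), and never larger than the window the
 * connection had before going idle.
 */
static uint32_t
restart_cwnd(uint32_t cwnd, uint32_t mss, uint32_t iw_segments)
{
    uint32_t iw = mss * iw_segments;

    return cwnd < iw ? cwnd : iw;
}
```

For a connection that went idle with a window of 100 segments of 1460
bytes, a 10-segment initial window gives a restart window of 14600
bytes; a connection whose window was already smaller keeps it unchanged.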

--
Andre



Re: [JNPR] Proposal for changes to network device drivers and network stack (RFC)

2013-01-22 Thread Steve Kiernan
On Sun, 20 Jan 2013 10:56:24 +1100
Peter Jeremy  wrote:

> On 2013-Jan-17 14:38:06 -0500, "Stephen J. Kiernan" wrote:
> >The patch also includes moving zlib.[ch] and zlibutil.h out of net and 
> >into sys/libkern (for the .c) and sys/sys (for the .h).
> 
> Good.
> 
> >It really doesn't make much sense for that code to live in net, 
> >especially when so many things which are not the network stack utilize 
> >it.
> 
> One thing that currently doesn't is ZFS - which has its own copy.  It
> would be nice if ZFS could use the common copy.

I'll take a look at that.

> 
> >Is that going to be a problem? Should simple stubs be added in the 
> >original locations in net/ to include the one in sys/ now?
> 
> IMHO, no.  zlib wasn't an advertised API so nothing outside the base
> OS should be using it.  If you've moved all the kernel code to use
> the new location, that should be enough.

Okay, then I won't worry about adding stubs.

Thanks.

--
Stephen J. Kiernan
Juniper Networks, Inc.
stevek_at_juniper.net




Cas driver fails to load first time after boot.

2013-01-22 Thread Paul Keusemann

Hi,

I've got a Dell R200 which I'm trying to build into a gateway with a Sun 
QGE (501-6738-10).  The cas driver fails to load the first time I try to 
load it but succeeds the second time.  Is this a problem with the card, 
the driver, my karma?



Initially, I tried to install FreeBSD-9.1-Release, but booting the 
installation DVD hangs after failing to attach cas0.  I was able to 
successfully install FreeBSD-8.3-Release which apparently does not have 
the cas driver built into the installer kernel.


The first time I try to load the cas module after booting results in the 
following output:


# kldload -v if_cas
Loaded if_cas, id=2

with the following logged to /var/log/messages:

Jan 22 13:48:27 lucid kernel: cas0:  
mem 0xdf80-0xdf9f irq 35 at device 0.0 on pci4

Jan 22 13:48:27 lucid kernel: cas0: attaching PHYs failed
Jan 22 13:48:27 lucid kernel: cas0: could not be attached
Jan 22 13:48:27 lucid kernel: device_attach: cas0 attach returned 6
Jan 22 13:48:27 lucid kernel: cas1:  
mem 0xdfa0-0xdfbf irq 34 at device 1.0 on pci4

Jan 22 13:48:27 lucid kernel: cas1: attaching PHYs failed
Jan 22 13:48:28 lucid kernel: cas1: could not be attached
Jan 22 13:48:28 lucid kernel: device_attach: cas1 attach returned 6
Jan 22 13:48:28 lucid kernel: cas2:  
mem 0xdfc0-0xdfdf irq 33 at device 2.0 on pci4

Jan 22 13:48:28 lucid kernel: miibus2:  on cas2
Jan 22 13:48:28 lucid kernel: nsgphy0: interface> PHY 1 on miibus2
Jan 22 13:48:28 lucid kernel: nsgphy0:  none, 10baseT, 10baseT-FDX, 
100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 
1000baseT-FDX-master,auto, auto-flow

Jan 22 13:48:28 lucid kernel: cas2: 16kB RX FIFO, 9kB TX FIFO
Jan 22 13:48:28 lucid kernel: cas2: Ethernet address: 00:14:4f:25:ca:12
Jan 22 13:48:28 lucid kernel: cas2: [FILTER]
Jan 22 13:48:28 lucid kernel: cas3:  
mem 0xdfe0-0xdfff irq 32 at device 3.0 on pci4

Jan 22 13:48:28 lucid kernel: cas3: attaching PHYs failed
Jan 22 13:48:28 lucid kernel: cas3: could not be attached
Jan 22 13:48:28 lucid kernel: device_attach: cas3 attach returned 6


If I unload the cas driver, I get the following in /var/log/messages:

Jan 22 14:03:42 lucid kernel: nsgphy0: detached
Jan 22 14:03:42 lucid kernel: miibus2: detached
Jan 22 14:03:42 lucid kernel: cas2: detached
Jan 22 14:03:42 lucid kernel: pci4:  at device 2.0 
(no driver attached)



The second time I try to load the cas kernel module after booting 
results in the following output:


# kldload -v if_cas
Loaded if_cas, id=2

and the following logged to /var/log/messages:

Jan 22 14:04:33 lucid kernel: cas0:  
mem 0xdf80-0xdf9f irq 35 at device 0.0 on pci4

Jan 22 14:04:33 lucid kernel: miibus2:  on cas0
Jan 22 14:04:33 lucid kernel: nsgphy0: interface> PHY 1 on miibus2
Jan 22 14:04:33 lucid kernel: nsgphy0:  none, 10baseT, 10baseT-FDX, 
100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 
1000baseT-FDX-master,auto, auto-flow

Jan 22 14:04:33 lucid kernel: cas0: 16kB RX FIFO, 9kB TX FIFO
Jan 22 14:04:33 lucid kernel: cas0: Ethernet address: 00:14:4f:25:ca:10
Jan 22 14:04:33 lucid kernel: cas0: [FILTER]
Jan 22 14:04:33 lucid kernel: cas1:  
mem 0xdfa0-0xdfbf irq 34 at device 1.0 on pci4

Jan 22 14:04:33 lucid kernel: miibus3:  on cas1
Jan 22 14:04:33 lucid kernel: nsgphy1: interface> PHY 1 on miibus3
Jan 22 14:04:33 lucid kernel: nsgphy1:  none, 10baseT, 10baseT-FDX, 
100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 
1000baseT-FDX-master,auto, auto-flow

Jan 22 14:04:33 lucid kernel: cas1: 16kB RX FIFO, 9kB TX FIFO
Jan 22 14:04:33 lucid kernel: cas1: Ethernet address: 00:14:4f:25:ca:11
Jan 22 14:04:33 lucid kernel: cas1: [FILTER]
Jan 22 14:04:33 lucid kernel: cas2:  
mem 0xdfc0-0xdfdf irq 33 at device 2.0 on pci4

Jan 22 14:04:33 lucid kernel: miibus4:  on cas2
Jan 22 14:04:33 lucid kernel: nsgphy2: interface> PHY 1 on miibus4
Jan 22 14:04:33 lucid kernel: nsgphy2:  none, 10baseT, 10baseT-FDX, 
100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 
1000baseT-FDX-master,auto, auto-flow

Jan 22 14:04:33 lucid kernel: cas2: 16kB RX FIFO, 9kB TX FIFO
Jan 22 14:04:33 lucid kernel: cas2: Ethernet address: 00:14:4f:25:ca:12
Jan 22 14:04:33 lucid kernel: cas2: [FILTER]
Jan 22 14:04:33 lucid kernel: cas3:  
mem 0xdfe0-0xdfff irq 32 at device 3.0 on pci4

Jan 22 14:04:33 lucid kernel: miibus5:  on cas3
Jan 22 14:04:33 lucid kernel: nsgphy3: interface> PHY 1 on miibus5
Jan 22 14:04:33 lucid kernel: nsgphy3:  none, 10baseT, 10baseT-FDX, 
100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 
1000baseT-FDX-master,auto, auto-flow

Jan 22 14:04:33 lucid kernel: cas3: 16kB RX FIFO, 9kB TX FIFO
Jan 22 14:04:33 lucid kernel: cas3: Ethernet address: 00:14:4f:25:ca:13
Jan 22 14:04:33 lucid kernel: cas3: [FILTER]


The following are attached:
/var/run/dmesg.boot
dmesg output after the second attempt to load the cas driver.
/var/log/messages after the second a

Re: [PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-22 Thread Alfred Perlstein

On 1/22/13 12:11 PM, John Baldwin wrote:

> As I mentioned in an earlier thread, I recently had to debug an issue we were
> seeing across a link with a high bandwidth-delay product (both high bandwidth
> and high RTT).  Our specific use case was to use a TCP connection to reliably
> forward a latency-sensitive datagram stream across a WAN connection.  We would
> often see spikes in the latency of individual datagrams.  I eventually tracked
> this down to the connection entering slow start when it would transmit data
> after being idle.  The data stream was quite bursty and would often attempt to
> transmit a burst of data after being idle for far longer than a retransmit
> timeout.
>
> In 7.x we had worked around this in the past by disabling RFC 3390 and jacking
> the slow start window size up via a sysctl.  On 8.x this no longer worked.
> The solution I came up with was to add a new socket option to disable idle
> handling completely.  That is, when an idle connection restarts with this new
> option enabled, it keeps its current congestion window and doesn't enter slow
> start.
>
> There are only a few cases where such an option is useful, but if anyone else
> thinks this might be useful I'd be happy to add the option to FreeBSD.


This looks good, but it almost sounds like a bug for TCP to be doing 
this anyhow.


Why would one want this behavior?

Wouldn't it make sense to keep the window large until there was a 
problem rather than unconditionally chop it down?  I almost think TCP is 
afraid that you might wind up swapping out a 10gig interface for a 
modem?  I'm just not getting it.  (probably simple oversight on my part).


What do you think about also making this a sysctl for global on/off by 
default?


-Alfred



[PATCH] Add a new TCP_IGNOREIDLE socket option

2013-01-22 Thread John Baldwin
As I mentioned in an earlier thread, I recently had to debug an issue we were 
seeing across a link with a high bandwidth-delay product (both high bandwidth 
and high RTT).  Our specific use case was to use a TCP connection to reliably 
forward a latency-sensitive datagram stream across a WAN connection.  We would 
often see spikes in the latency of individual datagrams.  I eventually tracked 
this down to the connection entering slow start when it would transmit data 
after being idle.  The data stream was quite bursty and would often attempt to 
transmit a burst of data after being idle for far longer than a retransmit 
timeout.

In 7.x we had worked around this in the past by disabling RFC 3390 and jacking 
the slow start window size up via a sysctl.  On 8.x this no longer worked.  
The solution I came up with was to add a new socket option to disable idle 
handling completely.  That is, when an idle connection restarts with this new 
option enabled, it keeps its current congestion window and doesn't enter slow 
start.

There are only a few cases where such an option is useful, but if anyone else 
thinks this might be useful I'd be happy to add the option to FreeBSD.
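
From an application's point of view, using the option would look roughly
like the C sketch below. It assumes a kernel and headers built with the
patch that follows; on an unpatched system TCP_IGNOREIDLE is not defined
(and the number 3 may already be assigned to a different option on other
operating systems), so the sketch only issues the call when the header
provides the name. The helper enable_ignore_idle() is a hypothetical
wrapper, not part of the patch.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <errno.h>

/*
 * Ask the stack to keep the congestion window across idle periods on
 * this TCP socket.  Returns 0 on success, -1 with errno set otherwise.
 */
static int
enable_ignore_idle(int fd)
{
#ifdef TCP_IGNOREIDLE
    int on = 1;     /* boolean option: non-zero enables it */

    return setsockopt(fd, IPPROTO_TCP, TCP_IGNOREIDLE, &on, sizeof(on));
#else
    /* Headers without the patch: report the option as unsupported. */
    (void)fd;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

A caller would invoke this once after creating or accepting the socket,
and should treat failure as "option not available" and carry on with the
default idle handling.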

Index: share/man/man4/tcp.4
===
--- share/man/man4/tcp.4(revision 245742)
+++ share/man/man4/tcp.4(working copy)
@@ -205,6 +205,18 @@
 in the
 .Sx MIB Variables
 section further down.
+.It Dv TCP_IGNOREIDLE
+If a TCP connection is idle for more than one retransmit timeout,
+it enters slow start when new data is available to transmit.
+This avoids flooding the network with a full window of traffic at line rate.
+It also allows the connection to adjust to changes to network conditions
+that occurred while the connection was idle.  A connection that sends
+bursts of data separated by large idle periods can be permanently stuck in
+slow start as a result.
+The boolean option
+.Dv TCP_IGNOREIDLE
+disables the idle connection handling allowing connections to maintain the
+existing congestion window when restarting after an idle period.
 .It Dv TCP_NODELAY
 Under most circumstances,
 .Tn TCP
Index: sys/netinet/tcp_var.h
===
--- sys/netinet/tcp_var.h   (revision 245742)
+++ sys/netinet/tcp_var.h   (working copy)
@@ -230,6 +230,7 @@
 #define	TF_NEEDFIN	0x000800	/* send FIN (implicit state) */
 #define	TF_NOPUSH	0x001000	/* don't push */
 #define	TF_PREVVALID	0x002000	/* saved values for bad rxmit valid */
+#define	TF_IGNOREIDLE	0x004000	/* connection is never idle */
 #define	TF_MORETOCOME	0x010000	/* More data to be appended to sock */
 #define	TF_LQ_OVERFLOW	0x020000	/* listen queue overflow */
 #define	TF_LASTIDLE	0x040000	/* connection was previously idle */
Index: sys/netinet/tcp_output.c
===
--- sys/netinet/tcp_output.c(revision 245742)
+++ sys/netinet/tcp_output.c(working copy)
@@ -206,7 +206,8 @@
 	 * to send, then transmit; otherwise, investigate further.
 	 */
 	idle = (tp->t_flags & TF_LASTIDLE) || (tp->snd_max == tp->snd_una);
-	if (idle && ticks - tp->t_rcvtime >= tp->t_rxtcur)
+	if (!(tp->t_flags & TF_IGNOREIDLE) &&
+	    idle && ticks - tp->t_rcvtime >= tp->t_rxtcur)
 		cc_after_idle(tp);
 	tp->t_flags &= ~TF_LASTIDLE;
 	if (idle) {
Index: sys/netinet/tcp.h
===
--- sys/netinet/tcp.h   (revision 245823)
+++ sys/netinet/tcp.h   (working copy)
@@ -156,6 +156,7 @@
 #define	TCP_NODELAY	1	/* don't delay send to coalesce packets */
 #if __BSD_VISIBLE
 #define	TCP_MAXSEG	2	/* set maximum segment size */
+#define	TCP_IGNOREIDLE	3	/* disable idle connection handling */
 #define TCP_NOPUSH	4	/* don't push last block of write */
 #define TCP_NOOPT	8	/* don't use TCP options */
 #define TCP_MD5SIG	16	/* use MD5 digests (RFC2385) */
Index: sys/netinet/tcp_usrreq.c
===
--- sys/netinet/tcp_usrreq.c(revision 245742)
+++ sys/netinet/tcp_usrreq.c(working copy)
@@ -1354,6 +1354,7 @@
 
 		case TCP_NODELAY:
 		case TCP_NOOPT:
+		case TCP_IGNOREIDLE:
 			INP_WUNLOCK(inp);
 			error = sooptcopyin(sopt, &optval, sizeof optval,
 			    sizeof optval);
@@ -1368,6 +1369,9 @@
 			case TCP_NOOPT:
 				opt = TF_NOOPT;
 				break;
+			case TCP_IGNOREIDLE:
+				opt = TF_IGNOREIDLE;
+				break;
 

Re: Data Center Bridging?

2013-01-22 Thread Navdeep Parhar
On 01/22/13 07:43, Eggert, Lars wrote:
> Hi,
> 
> on Linux, various NICs (e.g., ixgbe) support Data Center Bridging. Is this 
> also available under FreeBSD? Do *any* NICs support DCB under FreeBSD?

cxgbe(4) hardware supports DCB/DCBX, but I haven't looked at what it
would take to add driver + OS support.

Regards,
Navdeep


Re: Processes' FIBs

2013-01-22 Thread Konstantin Belousov
On Tue, Jan 17, 2012 at 01:21:27PM +0100, Oliver Fromme wrote:
> Kostik Belousov  wrote:
>  > The patch misses compat32 bits and breaks compat32 ps/top.
> 
> Right, thank you for pointing it out!  I missed it because
> I only have i386 for testing.
> 
> I've created new patch sets for releng8 and current.  These
> include compat32 support and an entry for the manual page.
> 
> Would someone with amd64 please test the compat32 part?
> I've been using this code on i386 for a few days without
> any problems.
> 
> I've attached the patch for current below.  Both patch sets
> are also available from this URL:
> http://www.secnetix.de/olli/tmp/ki_fibnum/
> 
> Testing is easy:  Apply the patch, rebuild bin/ps and kernel.
> Make sure that your kernel config has "options ROUTETABLES=16"
> so multiple FIBs are supported.  Reboot.  Open a shell with
> setfib, e.g. "setfib 3 /bin/sh" (no root required), type
> "ps -ax -o user,pid,fib,command" or something similar, and
> verify that the shell process and its children are listed
> with the correct FIB.  When testing on amd64, use both the
> native ps and an i386 binary.
> 
> Thank you very much!
> 
> Best regards
>   Oliver
> 
> --- sys/sys/user.h.orig   2011-11-07 22:13:19.0 +0100
> +++ sys/sys/user.h2012-01-17 11:33:59.0 +0100
> @@ -83,7 +83,7 @@
>   * it in two places: function fill_kinfo_proc in sys/kern/kern_proc.c and
>   * function kvm_proclist in lib/libkvm/kvm_proc.c .
>   */
> -#define  KI_NSPARE_INT   9
> +#define  KI_NSPARE_INT   8
>  #define  KI_NSPARE_LONG  12
>  #define  KI_NSPARE_PTR   6
>  
> @@ -186,6 +186,7 @@
>*/
>   char    ki_sparestrings[50];            /* spare string space */
>   int     ki_spareints[KI_NSPARE_INT];    /* spare room for growth */
> + int     ki_fibnum;                      /* Default FIB number */
>   u_int   ki_cr_flags;                    /* Credential flags */
>   int     ki_jid;                         /* Process jail ID */
>   int     ki_numthreads;                  /* XXXKSE number of threads in total */
> --- sys/kern/kern_proc.c.orig 2012-01-15 19:47:24.0 +0100
> +++ sys/kern/kern_proc.c  2012-01-17 12:52:36.0 +0100
> @@ -836,6 +836,7 @@
>   kp->ki_swtime = (ticks - p->p_swtick) / hz;
>   kp->ki_pid = p->p_pid;
>   kp->ki_nice = p->p_nice;
> + kp->ki_fibnum = p->p_fibnum;
>   kp->ki_start = p->p_stats->p_start;
>   timevaladd(&kp->ki_start, &boottime);
>   PROC_SLOCK(p);
> @@ -1121,6 +1122,7 @@
>   bcopy(ki->ki_comm, ki32->ki_comm, COMMLEN + 1);
>   bcopy(ki->ki_emul, ki32->ki_emul, KI_EMULNAMELEN + 1);
>   bcopy(ki->ki_loginclass, ki32->ki_loginclass, LOGINCLASSLEN + 1);
> + CP(*ki, *ki32, ki_fibnum);
>   CP(*ki, *ki32, ki_cr_flags);
>   CP(*ki, *ki32, ki_jid);
>   CP(*ki, *ki32, ki_numthreads);
> --- sys/compat/freebsd32/freebsd32.h.orig 2011-11-11 08:17:00.0 +0100
> +++ sys/compat/freebsd32/freebsd32.h  2012-01-17 11:34:00.0 +0100
> @@ -319,6 +319,7 @@
>   char    ki_loginclass[LOGINCLASSLEN+1];
>   char    ki_sparestrings[50];
>   int ki_spareints[KI_NSPARE_INT];
> + int ki_fibnum;
>   u_int   ki_cr_flags;
>   int ki_jid;
>   int ki_numthreads;
> --- bin/ps/keyword.c.orig 2011-09-29 08:31:42.0 +0200
> +++ bin/ps/keyword.c  2012-01-17 12:54:49.0 +0100
> @@ -85,6 +85,7 @@
>   {"etimes", "ELAPSED", NULL, USER, elapseds, 0, CHAR, NULL, 0},
>   {"euid", "", "uid", 0, NULL, 0, CHAR, NULL, 0},
>   {"f", "F", NULL, 0, kvar, KOFF(ki_flag), INT, "x", 0},
> + {"fib", "FIB", NULL, 0, kvar, NULL, 2, KOFF(ki_fibnum), INT, "d", 0},
>   {"flags", "", "f", 0, NULL, 0, CHAR, NULL, 0},
>   {"gid", "GID", NULL, 0, kvar, KOFF(ki_groups), UINT, UIDFMT, 0},
>   {"group", "GROUP", NULL, LJUST, egroupname, 0, CHAR, NULL, 0},
> --- bin/ps/ps.1.orig  2011-11-22 22:53:06.0 +0100
> +++ bin/ps/ps.1   2012-01-17 12:56:17.0 +0100
> @@ -29,7 +29,7 @@
>  .\" @(#)ps.1 8.3 (Berkeley) 4/18/94
>  .\" $FreeBSD: src/bin/ps/ps.1,v 1.112 2011/11/22 21:53:06 trociny Exp $
>  .\"
> -.Dd November 22, 2011
> +.Dd January 17, 2012
>  .Dt PS 1
>  .Os
>  .Sh NAME
> @@ -506,6 +506,9 @@
>  minutes:seconds.
>  .It Cm etimes
>  elapsed running time, in decimal integer seconds
> +.It Cm fib
> +default FIB number, see
> +.Xr setfib 1
>  .It Cm flags
>  the process flags, in hexadecimal (alias
>  .Cm f )
Just reviving the recent thread after the ping.

The patch looks fine to me, and is still not committed.
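
One detail of the quoted patch worth spelling out is why `KI_NSPARE_INT` drops from 9 to 8: the new `ki_fibnum` field takes over one spare slot, so the structure size and the offsets of the fields that follow stay the same. The sketch below checks that with two heavily abridged, hypothetical structures (`kinfo_old`, `kinfo_new`); it is not the real `struct kinfo_proc`.

```c
#include <stddef.h>

#define OLD_NSPARE_INT 9	/* before the patch */
#define NEW_NSPARE_INT 8	/* after the patch */

/* Abridged model of the layout before the patch. */
struct kinfo_old {
	char		ki_sparestrings[50];
	int		ki_spareints[OLD_NSPARE_INT];
	unsigned int	ki_cr_flags;
	int		ki_jid;
};

/* Abridged model after the patch: one spare int becomes ki_fibnum. */
struct kinfo_new {
	char		ki_sparestrings[50];
	int		ki_spareints[NEW_NSPARE_INT];
	int		ki_fibnum;
	unsigned int	ki_cr_flags;
	int		ki_jid;
};

/* Compile-time checks: total size and later offsets are unchanged. */
_Static_assert(sizeof(struct kinfo_old) == sizeof(struct kinfo_new),
    "structure size must not change");
_Static_assert(offsetof(struct kinfo_old, ki_cr_flags) ==
    offsetof(struct kinfo_new, ki_cr_flags),
    "fields after the spare area must not move");

/* Runtime helper so callers can confirm the same property. */
int
layout_preserved(void)
{
	return (sizeof(struct kinfo_old) == sizeof(struct kinfo_new) &&
	    offsetof(struct kinfo_old, ki_jid) ==
	    offsetof(struct kinfo_new, ki_jid));
}
```

The same reasoning applies to the compat32 structure, which is why the quoted patch adds `ki_fibnum` at the matching position in freebsd32.h.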




Re: Data Center Bridging?

2013-01-22 Thread Jack Vogel
On Tue, Jan 22, 2013 at 8:40 AM, Julian Elischer  wrote:

> On 1/22/13 9:32 AM, Julian Elischer wrote:
>
>> On 1/22/13 8:43 AM, Eggert, Lars wrote:
>>
>>> Hi,
>>>
>>> on Linux, various NICs (e.g., ixgbe) support Data Center Bridging. Is
>>> this also available under FreeBSD? Do *any* NICs support DCB under FreeBSD?
>>>
>>> Thanks,
>>> Lars
>>>
>>>  that really depends on what you want to do
>> given a freebsd command line and a kernel with the right options I could
>> probably bridge two networks in about 15 minutes.
>> (I have done it in the past).
>> if DCB is a specific protocol then we may need to do some work to support
>> it. as I've not heard of it.
>>
>
> google to the answer.
> I have not seen any support for that.
> I would check with the driver writers from intel and/or broadcom etc.
> (they should be here somewhere).
>   Jack?
>
>
I have never implemented this in the FreeBSD drivers, primarily because the
motivation for it in, say, Linux was to handle multiple traffic classes, for
instance FCoE or iSCSI, and FreeBSD has not had these features to implement
this for.  Give me a reason to do it, and I can see about adding it :)

Jack


Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4): m_getjcl: invalid cluster type

2013-01-22 Thread John Baldwin
The following reply was made to PR kern/172113; it has been noted by GNATS.

From: John Baldwin 
To: Jack Vogel 
Cc: "George Neville-Neil" ,
 bug-follo...@freebsd.org,
 egrosb...@rdtc.ru,
 j...@freebsd.org
Subject: Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in 
igb(4): m_getjcl: invalid cluster type
Date: Tue, 22 Jan 2013 12:09:32 -0500

 On Monday, January 21, 2013 3:28:40 pm Jack Vogel wrote:
 > Well, do you have a more complete designation of the motherboard? We can
 > look into it, although if the one check stops the problem it may be a low
 > priority.
 
 It is a SuperMicro X8DTU-F.
  
 > Jack
 > 
 > 
 > On Mon, Jan 21, 2013 at 11:25 AM, George Neville-Neil 
 > wrote:
 > 
 > >
 > > On Jan 19, 2013, at 23:26 , John Baldwin  wrote:
 > >
 > > > I was able to finally reproduce this panic today.  It seems to require
 > > > a server configured for PXE but that receives no DHCP reply (and
 > > > possibly with the requisite SuperMicro X8 board).  I was able to
 > > > prevent the panic with a subset of the referenced patch by only adding
 > > > the 'if_drv_flags & IFF_DRV_RUNNING' check to the start of
 > > > igb_msix_que().  The rest of the patch was unnecessary.  I also added
 > > > some debugging to print out the ICR, EICR, IMS, and EIMS registers in
 > > > this case.  It does look like the hardware is sending an interrupt that
 > > > is not enabled in the interrupt mask (specifically LSC).  In fact, the
 > > > 82576 datasheet specifically mentions masking LSC until initialization
 > > > is complete to avoid spurious interrupts during boot and AFAICT igb(4)
 > > > does this since e1000_reset_hw() clears the interrupt mask via writes
 > > > to IMC and doesn't re-enable interrupts until igb_init_locked() is
 > > > invoked via 'ifconfig up'.  Here is my debug output:
 > > >
 > > > SMP: AP CPU #6 Launched!
 > > > SMP: AP CPU #4 Launched!
 > > > stray irq0
 > > > igb0: interrupt on que 0: icr 0x104 eicr 0
 > > > ims 0 eims 0x8000
 > > >
 > > > Hmmm.   Nothing clears EIMS.  After some more debugging, I determined
 > > > that e1000_reset_hw() always turns this bit in EIMS on, even if it is
 > > > off before e1000_reset_hw() is called(!).  I added explicit calls to
 > > > igb_disable_intr() to clear EIMS after each call to e1000_reset_hw().
 > > > This removes the 'stray irq0', but I still get a spurious interrupt
 > > > during boot (albeit with eims 0).  I can use the IFF_DRV_RUNNING hack
 > > > for now, but I think the real fix is something else.
 > > >
 > >
 > > I think Jack will have to chime in on this one.  Do you think it's all SM
 > > X8 boards or just the one we happen to have?  I wonder if Jack or Jeffrey
 > > (the testing guy he works with) have access to the right board.
 > >
 > > Best,
 > > George
 > >
 > >
 > >
 > 
 
 -- 
 John Baldwin
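
 The guard described in the quoted report (ignore queue interrupts until the
 interface is marked running) can be modelled as below. This is a sketch
 only: `ifnet_model`, `queue_model` and `msix_que_model()` are hypothetical
 stand-ins, not the igb(4) code, and the `IFF_DRV_RUNNING` value is defined
 locally for the model.

```c
#include <stdbool.h>

/* Model-local definition; treated as an assumption in this sketch. */
#define IFF_DRV_RUNNING 0x40

struct ifnet_model {
	int if_drv_flags;
};

struct queue_model {
	struct ifnet_model *ifp;
	unsigned long irqs;	/* interrupts seen */
	unsigned long handled;	/* interrupts actually processed */
};

/* Returns true if the interrupt was processed, false if it was ignored. */
bool
msix_que_model(struct queue_model *que)
{
	que->irqs++;

	/* Spurious interrupt before initialization: do not touch the rings. */
	if ((que->ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
		return (false);

	que->handled++;		/* stands in for the real RX/TX processing */
	return (true);
}
```

 As the report notes, this only hides the symptom; the interrupt is still
 delivered while it should be masked.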


Re: [PATCH] Properly handle Linux TCP socket options

2013-01-22 Thread John Baldwin
On Monday, January 21, 2013 2:55:22 pm Alexander Leidinger wrote:
> On Sat, 19 Jan 2013 11:26:13 -0500 John Baldwin  wrote:
> 
> > The current setsockopt() wrapper for the Linux ABI claims that Linux
> > and FreeBSD use the same values for TCP socket options.  This is true
> > for TCP_NODELAY and TCP_MAXSEG but not for any other options.  This
> > patch adds a mapping routine for TCP options similar to that used for
> > other socket option levels.  I believe this mapping to be correct in
> > terms of which FreeBSD options have the same semantics as Linux
> > options based on comparing code in the two kernels, but I'm not 100%
> > certain about TCP_MD5SIG since the Linux code that it maps to is not
> > as clear (it calls some function pointer and it is not clear if it is
> > accepting a simple boolean value similar to FreeBSD's).
> 
> What about a message for unknown options?

We do not do that now for any options (socket level or otherwise).  You could 
easily add that in linux_setsockopt(), but that should be a separate commit.
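
For readers who have not seen the patch, a mapping routine of the kind described in the quoted text might look roughly like this. It is a sketch under assumptions: `map_linux_tcp_opt()` is a hypothetical name, the choice of options is illustrative, and the option numbers are written out as local constants (believed to match the Linux and FreeBSD headers) so the example compiles anywhere.

```c
/* Linux TCP option numbers (assumed, from Linux's tcp.h). */
#define LINUX_TCP_NODELAY	1
#define LINUX_TCP_MAXSEG	2
#define LINUX_TCP_KEEPIDLE	4
#define LINUX_TCP_KEEPINTVL	5
#define LINUX_TCP_KEEPCNT	6
#define LINUX_TCP_MD5SIG	14

/* FreeBSD TCP option numbers (assumed, from netinet/tcp.h). */
#define BSD_TCP_NODELAY		1
#define BSD_TCP_MAXSEG		2
#define BSD_TCP_MD5SIG		16
#define BSD_TCP_KEEPIDLE	256
#define BSD_TCP_KEEPINTVL	512
#define BSD_TCP_KEEPCNT		1024

/*
 * Translate a Linux TCP-level option name to its FreeBSD equivalent.
 * Returns -1 for options without a known counterpart; that is the spot
 * where a diagnostic for unknown options could be emitted.
 */
int
map_linux_tcp_opt(int opt)
{
	switch (opt) {
	case LINUX_TCP_NODELAY:
		return (BSD_TCP_NODELAY);
	case LINUX_TCP_MAXSEG:
		return (BSD_TCP_MAXSEG);
	case LINUX_TCP_KEEPIDLE:
		return (BSD_TCP_KEEPIDLE);
	case LINUX_TCP_KEEPINTVL:
		return (BSD_TCP_KEEPINTVL);
	case LINUX_TCP_KEEPCNT:
		return (BSD_TCP_KEEPCNT);
	case LINUX_TCP_MD5SIG:
		return (BSD_TCP_MD5SIG);
	default:
		return (-1);
	}
}
```

Only the first two options share a number on both systems, which is the point made in the quoted description.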

-- 
John Baldwin


Re: two problems in dev/e1000/if_lem.c::lem_handle_rxtx()

2013-01-22 Thread John Baldwin
On Saturday, January 19, 2013 8:19:19 pm Adrian Chadd wrote:
> On 19 January 2013 08:14, John Baldwin  wrote:
> 
> > However, I did describe an alternate setup where you can fix this.  Part of
> > the key is to get various NICs to share a single logical queue of tasks.  You
> > could simulate this now by having all the deferred tasks share a single
> > taskqueue with a pool of tasks, but that will still not fully cooperate with
> > ithreads.  To do that you have to get the interrupt handlers themselves into
> > the shared taskqueue.  Some changes I have in a p4 branch allow you to do that
> > by letting interrupt handlers reschedule themselves (avoiding the need for a
> > separate task and preventing the task from running concurrently with the
> > interrupt handler) and providing some (but not yet all) of the framework to
> > allow multiple devices to share a single work queue backed by a shared pool of
> > threads.
> 
> How would that work when I want to pin devices to specific cores?

Note that the setup allows you to bind things however you want.  By default it
uses the current model (each IRQ uses a dedicated queue with a single thread).
The idea is to provide the flexibility so that you can glue things together in
whatever way makes the most sense.  In a router that tends to get into livelock,
using a shared queue may make more sense.  However, you are not forced to use
that for other workloads where it does not.

-- 
John Baldwin


Re: Data Center Bridging?

2013-01-22 Thread Jim Thompson

On Jan 22, 2013, at 10:32 AM, Julian Elischer  wrote:

> On 1/22/13 8:43 AM, Eggert, Lars wrote:
>> Hi,
>> 
>> on Linux, various NICs (e.g., ixgbe) support Data Center Bridging. Is this 
>> also available under FreeBSD? Do *any* NICs support DCB under FreeBSD?
>> 
>> Thanks,
>> Lars
>> 
> that really depends on what you want to do
> given a freebsd command line and a kernel with the right options I could 
> probably bridge two networks in about 15 minutes.
> (I have done it in the past).
> if DCB is a specific protocol then we may need to do some work to support it. 
> as I've not heard of it.

DCB is yet another attempt to 'fix' Ethernet, by eliminating queue overflow and 
providing bandwidth allocation on individual links.
It was intended to be mostly about storage.

Linux uses lldpad and dcbtool to manage the settings.

Jim



Re: Data Center Bridging?

2013-01-22 Thread Julian Elischer

On 1/22/13 9:32 AM, Julian Elischer wrote:
> On 1/22/13 8:43 AM, Eggert, Lars wrote:
>> Hi,
>>
>> on Linux, various NICs (e.g., ixgbe) support Data Center Bridging.
>> Is this also available under FreeBSD? Do *any* NICs support DCB
>> under FreeBSD?
>>
>> Thanks,
>> Lars
>
> that really depends on what you want to do
> given a freebsd command line and a kernel with the right options I
> could probably bridge two networks in about 15 minutes.
> (I have done it in the past).
> if DCB is a specific protocol then we may need to do some work to
> support it. as I've not heard of it.

google to the answer.
I have not seen any support for that.
I would check with the driver writers from intel and/or broadcom etc.
(they should be here somewhere).

  Jack?


Re: Data Center Bridging?

2013-01-22 Thread Julian Elischer

On 1/22/13 8:43 AM, Eggert, Lars wrote:
> Hi,
>
> on Linux, various NICs (e.g., ixgbe) support Data Center Bridging. Is this also
> available under FreeBSD? Do *any* NICs support DCB under FreeBSD?
>
> Thanks,
> Lars

that really depends on what you want to do
given a freebsd command line and a kernel with the right options I
could probably bridge two networks in about 15 minutes.
(I have done it in the past).
if DCB is a specific protocol then we may need to do some work to
support it. as I've not heard of it.


Data Center Bridging?

2013-01-22 Thread Eggert, Lars
Hi,

on Linux, various NICs (e.g., ixgbe) support Data Center Bridging. Is this also 
available under FreeBSD? Do *any* NICs support DCB under FreeBSD?

Thanks,
Lars


Re: Tov?bb?t?s: [Ipsec-tools-users] freebsd & linux setup question

2013-01-22 Thread Richard Kojedzinszky

Dear Yvan,

I've recompiled racoon with NATT, but as you've said, only pure Internet 
is between A and B without NAT, and thus it did not solve my problem.


I've attached racoon's output from
# racoon -ddd -F
on the freebsd's side.

I can confirm that the setkey -D and -DP outputs were complete, so only those
two entries existed for the SAs and policies.


I've tried a simple road-warrior setup, with transport mode, thus only 
traffic between A and B was protected, but that worked.

My server's racoon.conf is simple:
--
path certificate "/usr/local/etc/racoon/certs";

remote anonymous {
        exchange_mode main,aggressive;
#       nat_traversal off;

        certificate_type x509 "A.crt" "A.key";
        ca_type x509 "ca.crt";
        my_identifier asn1dn;
        peers_identifier asn1dn;
        proposal_check strict;

        lifetime time 24 hour;

        proposal {
                encryption_algorithm aes256;
                hash_algorithm sha1;
                authentication_method rsasig;
                dh_group 2;
        }

        generate_policy on;
        passive on;

        dpd_delay 60;
}

sainfo anonymous {
        lifetime time 4 hour;

        encryption_algorithm aes128;
        authentication_algorithm hmac_md5;
        compression_algorithm deflate;
}

log debug;
--

And the client's is the same except the generate_policy and passive 
statements.


Thanks in advance,

Kojedzinszky Richard


Re: Tov?bb?t?s: [Ipsec-tools-users] freebsd & linux setup question

2013-01-22 Thread VANHULLEBUS Yvan
Hi.


On Mon, Jan 21, 2013 at 05:53:49PM +0100, kri...@cflinux.hu wrote:
> Dear users,
> 
> I've a working tunnel setup between two linux hosts.
> 
> One end (A) has a fix address, while the other (B) has a dynamic one. 
> A is my server, B is my home router. Behind B, I've a private network. 
> What I've setup is that my private network reaches A through an IPSEC 
> tunnel.
[]
> Now, I've decided to switch to freebsd on server side, and the same 
> configuration on the server simply does not work. It installs the 
> policies, and the tunnels, but it seems, that when a reply packet is 
> leaving the server, it tries to initiate a new tunnel. If I've "passive 
> on" on my server's remote section, then I've the following error:
> 
> Jan 21 16:06:11 pi racoon: ERROR: no configuration found for B.
> Jan 21 16:06:11 pi racoon: ERROR: failed to begin ipsec sa negotication.
> 
> If I disable passive mode, then racoon tries to establish another tunnel, 
> but for some reason it does not succeed also. But I think, as in linux 
> it should work with passive on.
> 
> FreeBSD is 9.1-RELEASE, the linux side is a linux 3.5.4.
> 
> racoon on linux is:
> # racoon -V
> @(#)ipsec-tools 0.8.0 (http://ipsec-tools.sourceforge.net)
> 
> Compiled with:
> - OpenSSL 1.0.0e 6 Sep 2011 (http://www.openssl.org/)
> - Dead Peer Detection
> - IKE fragmentation
> - NAT Traversal
> - Monotonic clock
> 
> 
> racoon on freebsd is:
> # racoon -V
> @(#)ipsec-tools 0.8.0 (http://ipsec-tools.sourceforge.net)
> 
> Compiled with:
> - OpenSSL 0.9.8x 10 May 2012 (http://www.openssl.org/)
> - Dead Peer Detection
> - IKE fragmentation
> - Hybrid authentication
> - Monotonic clock

You have NAT-T compiled/enabled on Linux side, but not on FreeBSD side
(probably because it is not activated as a kernel option).
If you have "something that does NAT" on the wire between A and B, it
is probably the origin of your problem.

However, as it seems that there is only "Internet" between A and B,
I'll suppose that the issue is somewhere else...


> Unfortunately I've no idea.
> 
> Before the first packet, on the server:
> # setkey -D
> No SAD entries.
> 
> After an icmp packet sent from my private network to A:
> # setkey -D
> A B
>   esp mode=tunnel spi=76859998(0x0494ca5e) reqid=0(0x00000000)
>   E: rijndael-cbc  1c80b80d b006e3a3 772c2a9b 5c475213
>   A: hmac-md5  d43ff29c 034c896a fb2e7d1c 95f73ff5
>   seq=0x00000000 replay=4 flags=0x00000000 state=mature
>   created: Jan 21 17:03:39 2013   current: Jan 21 17:05:54 2013
>   diff: 135(s)    hard: 14400(s)  soft: 11520(s)
>   last:           hard: 0(s)      soft: 0(s)
>   current: 0(bytes)       hard: 0(bytes)  soft: 0(bytes)
>   allocated: 0    hard: 0 soft: 0
>   sadb_seq=1 pid=93091 refcnt=1
> B A
>   esp mode=tunnel spi=144790000(0x08a151f0) reqid=0(0x00000000)
>   E: rijndael-cbc  8bd59c29 9800d10f 8f9d7e84 a720aa9c
>   A: hmac-md5  188070e2 a3220772 78efcb06 3457db62
>   seq=0x00000037 replay=4 flags=0x00000000 state=mature
>   created: Jan 21 17:03:39 2013   current: Jan 21 17:05:54 2013
>   diff: 135(s)    hard: 14400(s)  soft: 11520(s)
>   last: Jan 21 17:04:50 2013      hard: 0(s)      soft: 0(s)
>   current: 5720(bytes)    hard: 0(bytes)  soft: 0(bytes)
>   allocated: 55   hard: 0 soft: 0
>   sadb_seq=0 pid=93091 refcnt=1
> # setkey -DP
> 10.0.0.0/24[any] A[any] any
>   in ipsec
>   esp/tunnel/B-A/require
>   created: Jan 21 17:03:39 2013  lastused: Jan 21 17:03:39 2013
>   lifetime: 14400(s) validtime: 0(s)
>   spid=25 seq=1 pid=5232
>   refcnt=1
> A[any] 10.0.0.0/24[any] any
>   out ipsec
>   esp/tunnel/A-B/require
>   created: Jan 21 17:03:39 2013  lastused: Jan 21 17:04:50 2013
>   lifetime: 14400(s) validtime: 0(s)
>   spid=26 seq=0 pid=5232
>   refcnt=1
> 
> Everything seems fine, as it is in linux; however, the attached log 
> shows that the kernel or racoon does not try to use the new tunnel, 
> instead it wants another one.

Looks good.

Could you run racoon (on the server's side) in debug mode (-dd) and send
the few lines that talk about trying to negotiate a new tunnel?
(Be careful, such racoon debug output contains sensitive information.)

What I'd like to have is the profile of the tunnel that the kernel asks
to negotiate.

Also, can you confirm that your setkey -DP output is the whole, full
output?


> Is it a bug in freebsd, or a feature in linux? Do somebody have experience 
> with such a setup?

AFAIK, neither: I use such a setup and it works.
The only difference in my configuration is that I have a network
behind both peers, but it should also work in your case.


Yvan.


Re: [PATCH] Don't imply TCP and UDP socket options are bitmasks

2013-01-22 Thread Lawrence Stewart
On 01/16/13 06:16, John Baldwin wrote:
> On Tuesday, January 15, 2013 3:49:33 am Lawrence Stewart wrote:
>> On 01/15/13 07:50, John Baldwin wrote:
>>> The constants used for TCP and UDP socket options (TCP_NODELAY, etc.) are 
>>> currently defined as hex values that are individual bits.  However, socket 
>>> options are never masked together, they are used as a simple enumeration of 
>>> discrete values.  Using a bitmask forces us to run out of bits and makes it 
>>> harder for vendors to try to use a high range of values for local custom 
>>> options (hoping that they never conflict with a new option value added in 
>>> stock FreeBSD).
>>
>> Yup. Should we be explicitly #defining the boundary between "bits
>> reserved for FreeBSD" and "bits for private vendor use"?
> 
> Oh, we could if you wanted.  I'm using 0x1000 locally for both TCP and UDP,
> but those are completely arbitrary values.  Saner ones might be 0x800 if
> we want to do that explicitly.  We could perhaps just say that is true for all
> socket option levels (that is, just define one SO_VENDOR constant or some such
> but say it applies to all levels)?

A single SO_VENDOR applied to all levels sounds good to me.
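
As a rough illustration of what such a boundary could look like in a header, here is a sketch. Everything in it is hypothetical: the names `SO_VENDOR_SKETCH`, `VENDOR_OPT()` and `is_vendor_sockopt()` are made up for the example, and 0x1000 is simply the arbitrary local value mentioned in the quoted text, not a value agreed on in this thread.

```c
#include <stdbool.h>

/* Hypothetical boundary: option names at or above this are vendor-private. */
#define SO_VENDOR_SKETCH	0x1000

/* Build the n-th vendor-private option name for any socket option level. */
#define VENDOR_OPT(n)		(SO_VENDOR_SKETCH + (n))

/* True if optname falls in the range set aside for local vendor options. */
bool
is_vendor_sockopt(int optname)
{
	return (optname >= SO_VENDOR_SKETCH);
}
```

Because option names are a plain enumeration rather than a bitmask, a single threshold like this is enough to keep stock and vendor options from colliding at every level.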

Cheers,
Lawrence