Re: move setfib(1)

2010-05-27 Thread M. Warner Losh
In message: 4bfe04e5.1010...@semmy.ru
Sergey Matveychuk s...@semmy.ru writes:
: 26.05.2010 20:38, Julian Elischer wrote:
:  On 5/26/10 9:32 AM, M. Warner Losh wrote:
:  In message: 4bfd158d.7020...@freebsd.org
:  Sergey Matveychuk s...@freebsd.org writes:
:  : Is it possible to move setfib(1) to /sbin so that it can be used
:  : smoothly in rc.d scripts?
: 
:  Can you tell us why you need it so early?
: 
:  We could do it, but eventually everything ends up moving to /sbin or
:  /bin unless we require a good reason.
: 
: 
: I'm thinking about this after Doug's message:
: http://lists.freebsd.org/pipermail/freebsd-rc/2010-May/001954.html

Right, and the only way that /usr/bin isn't going to be available if
the network isn't up will be if you have NFS mounted root, but have a
separate /usr partition.  Otherwise, critmount happens before the
network comes up, and that will ensure that you'll have /usr available
at the point in the boot scripts you want to use it.  Even if you have
/ and /usr separate on NFS partitions, you can specify netfs_types=
in the NFS root's rc.conf and all NFS mounts will mount too very
early.

Since you are proposing this for /etc/rc.d/routing, I think you can
actually use it there and there will be no problem, even for whacked
out NFS setups.

Did I miss something?

Warner

P.S.  On my system at least:

rcorder says:

/etc/rc.d/dumpon
/etc/rc.d/ddb
/etc/rc.d/initrandom
/etc/rc.d/geli
/etc/rc.d/gbde
/etc/rc.d/encswap
/etc/rc.d/ccd
/etc/rc.d/swap1
/etc/rc.d/fsck
/etc/rc.d/root
/etc/rc.d/hostid
/etc/rc.d/mdconfig
/etc/rc.d/mountcritlocal

so these would be the only places where you can't use binaries from
/usr, right?
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org


Re: move setfib(1)

2010-05-27 Thread M. Warner Losh
In message: 20100527.001534.807935107107604070@bsdimp.com
M. Warner Losh i...@bsdimp.com writes:
: at the point in the boot scripts you want to use it.  Even if you have
: / and /usr separate on NFS partitions, you can specify netfs_types=
: in the NFS root's rc.conf and all NFS mounts will mount too very
: early.

s/will mount too very early/will mount early as well/g

Sorry.

Warner


Re: move setfib(1)

2010-05-26 Thread M. Warner Losh
In message: 4bfd158d.7020...@freebsd.org
Sergey Matveychuk s...@freebsd.org writes:
: Is it possible to move setfib(1) to /sbin so that it can be used
: smoothly in rc.d scripts?

It is small enough.  I think that's a good idea.

However, it would only be a problem if we are mounting / and /usr off
NFS as separate partitions, right?

Warner


Small patch to ipfilter for arm

2010-03-29 Thread M. Warner Losh
OK.  I'd like to propose the following patch for ipfilter:

Index: sys/contrib/ipfilter/netinet/ip_compat.h
===================================================================
--- sys/contrib/ipfilter/netinet/ip_compat.h (revision 205838)
+++ sys/contrib/ipfilter/netinet/ip_compat.h (working copy)
@@ -975,7 +975,6 @@
 #   define SPL_NET(x)  ;
 #   define SPL_IMP(x)  ;
 #   define SPL_SCHED(x) ;
-extern int in_cksum __P((struct mbuf *, int));
 #  else
 #   define SPL_SCHED(x) x = splhigh()
 #  endif /* __FreeBSD_version >= 500043 */

This declaration is wrong, and it prevents arm from building ipfilter.

Why is it wrong?  Because we have:

#  if (__FreeBSD_version >= 500002)
#   include <netinet/in_systm.h>
#   include <netinet/ip.h>
#   include <machine/in_cksum.h>
#  endif

#  if (__FreeBSD_version >= 500043)
...
the above code
...
#  endif

So, we have in_cksum.h being included *AND* we're defining this
function.  However, in_cksum.h is supposed to do this.

Why don't we see problems today?  No architecture except arm has an
assembler in_cksum in the tree.  All the other architectures have

#define in_cksum(a, b) in_cksum_skip(a, b, 0)

in their headers.  Since the above extern uses __P to hide the args,
in_cksum doesn't expand the macro, so we don't see any problems or
conflicts.  On arm, where we define in_cksum() correctly to return
u_short, there's a conflict.
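
To make the macro point concrete: a function-like macro is only expanded when its name is immediately followed by '('.  The sketch below uses made-up stand-in names (P_DEMO, cksum_demo and cksum_skip_demo are hypothetical, not the kernel identifiers) and is only meant to show the preprocessor behavior, not the real headers.

```c
/* Stand-in for the old __P() prototype wrapper: it just yields its argument. */
#define P_DEMO(args) args

/* Stand-in for in_cksum_skip(): sums the bytes from 'skip' up to 'len'. */
static int
cksum_skip_demo(const unsigned char *buf, int len, int skip)
{
	int i, sum = 0;

	for (i = skip; i < len; i++)
		sum += buf[i];
	return (sum);
}

/* Function-like macro, in the style the non-arm headers use for in_cksum(). */
#define cksum_demo(a, b) cksum_skip_demo(a, b, 0)

/*
 * The token after cksum_demo is P_DEMO, not '(', so the macro is NOT
 * expanded here.  This line declares a separate function named cksum_demo,
 * and nothing complains as long as no conflicting declaration exists.
 */
extern int cksum_demo P_DEMO((const unsigned char *, int));

/* At a call site '(' follows the name directly, so the macro is used. */
static int
cksum_call_demo(const unsigned char *buf, int len)
{
	return (cksum_demo(buf, len));
}
```

If a header instead declares a real function of that name with a different return type, as arm does with u_short, the same extern line becomes a conflicting redeclaration.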

So, it would be best if we just dropped this one line from ip_compat.h,
since it was always wrong anyway.

Comments?

Warner


Re: struct sockaddr * and alignment

2010-02-10 Thread M. Warner Losh
In message: b649e5e1002100148r759f3aacr3d5fcdfb5efd9...@mail.gmail.com
Marius Nünnerich mar...@nuenneri.ch writes:
: On Tue, Feb 9, 2010 at 18:34, M. Warner Losh i...@bsdimp.com wrote:
:  Greetings,
: 
:  I've found a few inconsistencies in the sockaddr stuff in the tree.
:  I'm not sure where to go to get a definitive answer, so I thought I'd
:  start here.
: 
:  I got here looking at the recent wake breakage on mips.  It turns out
:  that the warning was:
: 
:  src/usr.sbin/wake/wake.c: In function 'find_ether':
:  src/usr.sbin/wake/wake.c:123: warning: cast increases required alignment of target type
: 
:  which comes from
:         sdl = (struct sockaddr_dl *)ifa->ifa_addr;
: 
:  The problem is that on MIPS struct sockaddr * is byte aligned and
:  sockaddr_dl * is word aligned, so the compiler is rightly telling us
:  that there might be a problem here.
: 
:  However, further digging shows that there will never be a problem
:  here with alignment.  struct sockaddr_storage has an int64 in it to
:  force it to be __aligned(8).  So I thought to myself why don't I just
:  add __aligned(8) to the struct sockaddr definition?  After all, the
:  kernel goes to great lengths to return data so aligned, and user code
:  also keeps things aligned.
: 
:  Sure enough, that fixes this warning.  Yea.  But, sadly, it causes
:  other problems.  If you look at sbin/atm/atmconfig/natm.c you'll see
:  code like:
: 
:  static void
:  store_route(struct rt_msghdr *rtm)
:  {
:  ...
:         char *cp;
:         struct sockaddr *sa;
:         ...
: 
:         cp = (char *)(rtm + 1);
:  ...
:                         sa = (struct sockaddr *)cp;
:                         cp += roundup(sa->sa_len, sizeof(long));
:  ...
: 
:  which breaks because we're now casting from an __aligned(1) char * to
:  an __aligned(8) sockaddr *.
: 
:  And it is only rounding the size of the structure to long, rather than
:  int64 like sockaddr_storage suggests is the proper alignment.  But I
:  haven't looked in the kernel to see if there's an issue there with
:  routing sockets or not.
: 
:  The other extreme is to put __aligned(1) on all the sockaddr_foo
:  structures.  This would solve the compiler warning, but would have a
:  negative effect on performance in accessing these elements (because
:  the compiler would have to generate calls to bcopy or equivalent to
:  access the scalar members that are larger than a byte).   This cure
:  would be worse than the disease.
: 
:  So the question here is: what is the right solution here?  It has me
:  stumped.  So I dropped WARNS level down from 6 to 3 for wake.c.
: 
: Hi Warner,
: 
: I got into the same kind of trouble when I tried to raise the WARNS
: level above 3 for inetd and others. I guess everything which uses some
: sockaddr casting or (in the case of inetd) some of these macros:
: http://fxr.googlebit.com/source/sys/netinet6/in6.h?v=8-CURRENT#L233
: 
: It's a pity that only this keeps some programs from going to a WARNS level of 6.

Well, if there were some way to tell the compiler "Yes, I know this is
right" then we'd be set.  But I've not found what that way might be.
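
One idiom that is often used for this is to route the cast through void *, which GCC's -Wcast-align does not warn about.  It silences the check rather than proving anything, so whether it is acceptable is a judgment call.  A sketch with made-up stand-in structures (hdr_demo and dl_demo are hypothetical, not the real sockaddr types), plus a memcpy() variant that needs no alignment promise at all:

```c
#include <string.h>

/* Byte-aligned generic header, standing in for struct sockaddr. */
struct hdr_demo {
	unsigned char len;
	unsigned char family;
	char data[14];
};

/* More strictly aligned specific form, standing in for struct sockaddr_dl. */
struct dl_demo {
	unsigned char len;
	unsigned char family;
	unsigned short index;
	char data[12];
};

/* Cast through void *: the caller vouches that 'h' really is aligned. */
static unsigned short
dl_index_cast_demo(const struct hdr_demo *h)
{
	const struct dl_demo *dl = (const struct dl_demo *)(const void *)h;

	return (dl->index);
}

/* Alternative with no alignment assumption: copy into an aligned local. */
static unsigned short
dl_index_copy_demo(const struct hdr_demo *h)
{
	struct dl_demo dl;

	memcpy(&dl, h, sizeof(dl));
	return (dl.index);
}
```

The copy costs a few bytes of stack per access; the cast costs nothing but moves the responsibility for alignment to the caller.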

Warner



struct sockaddr * and alignment

2010-02-09 Thread M. Warner Losh
Greetings,

I've found a few inconsistencies in the sockaddr stuff in the tree.
I'm not sure where to go to get a definitive answer, so I thought I'd
start here.

I got here looking at the recent wake breakage on mips.  It turns out
that the warning was:

src/usr.sbin/wake/wake.c: In function 'find_ether':
src/usr.sbin/wake/wake.c:123: warning: cast increases required alignment of target type

which comes from
        sdl = (struct sockaddr_dl *)ifa->ifa_addr;

The problem is that on MIPS struct sockaddr * is byte aligned and
sockaddr_dl * is word aligned, so the compiler is rightly telling us
that there might be a problem here.

However, further digging shows that there will never be a problem
here with alignment.  struct sockaddr_storage has an int64 in it to
force it to be __aligned(8).  So I thought to myself why don't I just
add __aligned(8) to the struct sockaddr definition?  After all, the
kernel goes to great lengths to return data so aligned, and user code
also keeps things aligned.
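
As a rough illustration of how a member or an attribute changes a structure's required alignment, here is a sketch using made-up structures (sa_demo, ss_demo and sa_aligned_demo are hypothetical and only loosely modeled on the sockaddr family).  It assumes GCC or clang, since __attribute__((aligned(8))) is what FreeBSD's __aligned(8) expands to on those compilers.

```c
#include <stddef.h>
#include <stdint.h>

/* All-byte members: the struct needs only byte alignment on common ABIs. */
struct sa_demo {
	unsigned char len;
	unsigned char family;
	char data[14];
};

/* An int64_t member raises the whole struct to int64_t's alignment. */
struct ss_demo {
	unsigned char len;
	unsigned char family;
	char pad1[6];
	int64_t align;
	char pad2[112];
};

/* Same members as sa_demo, with 8-byte alignment requested explicitly. */
struct sa_aligned_demo {
	unsigned char len;
	unsigned char family;
	char data[14];
} __attribute__((aligned(8)));

static size_t sa_align_demo(void) { return (_Alignof(struct sa_demo)); }
static size_t ss_align_demo(void) { return (_Alignof(struct ss_demo)); }
static size_t sa_aligned_align_demo(void) { return (_Alignof(struct sa_aligned_demo)); }
```

A cast from a pointer to the first structure to a pointer to either of the other two is the kind of cast that raises the required alignment and triggers the warning.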

Sure enough, that fixes this warning.  Yea.  But, sadly, it causes
other problems.  If you look at sbin/atm/atmconfig/natm.c you'll see
code like:

static void
store_route(struct rt_msghdr *rtm)
{
...
        char *cp;
        struct sockaddr *sa;
        ...

        cp = (char *)(rtm + 1);
...
                sa = (struct sockaddr *)cp;
                cp += roundup(sa->sa_len, sizeof(long));
...

which breaks because we're now casting from an __aligned(1) char * to
an __aligned(8) sockaddr *.

And it is only rounding the size of the structure to long, rather than
int64 like sockaddr_storage suggests is the proper alignment.  But I
haven't looked in the kernel to see if there's an issue there with
routing sockets or not.
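
To show why the rounding unit matters, here is a sketch of walking length-prefixed records out of a byte buffer.  Everything in it is hypothetical (rec_demo, ROUNDUP_DEMO and count_records_demo are illustration names, not routing-socket code); the header is copied out with memcpy() so the walk does not depend on how the buffer is aligned.

```c
#include <stddef.h>
#include <string.h>

/* Round x up to the next multiple of y, the same idea as roundup(). */
#define ROUNDUP_DEMO(x, y) ((((x) + ((y) - 1)) / (y)) * (y))

/* Minimal variable-length record: the first byte is the total length. */
struct rec_demo {
	unsigned char len;
	unsigned char family;
};

/*
 * Count the records packed into buf[0..buflen), stepping by each record's
 * length rounded up to 'align'.  Stops at a zero length or a short buffer.
 */
static int
count_records_demo(const char *buf, size_t buflen, size_t align)
{
	const char *cp = buf;
	const char *end = buf + buflen;
	struct rec_demo r;
	size_t step;
	int n = 0;

	while ((size_t)(end - cp) >= sizeof(r)) {
		memcpy(&r, cp, sizeof(r));
		if (r.len == 0)
			break;		/* malformed: avoid looping forever */
		n++;
		step = ROUNDUP_DEMO((size_t)r.len, align);
		if ((size_t)(end - cp) < step)
			break;
		cp += step;
	}
	return (n);
}
```

If the producer pads to 8 bytes and the consumer steps by 4, the consumer lands in the padding and loses track of the records, which is the kind of mismatch the long versus int64 question is about.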

The other extreme is to put __aligned(1) on all the sockaddr_foo
structures.  This would solve the compiler warning, but would have a
negative effect on performance in accessing these elements (because
the compiler would have to generate calls to bcopy or equivalent to
access the scalar members that are larger than a byte).   This cure
would be worse than the disease.

So the question here is: what is the right solution here?  It has me
stumped.  So I dropped WARNS level down from 6 to 3 for wake.c.

Warner


How does rpc.lockd know where to send a request

2010-02-06 Thread M. Warner Losh
I have a problem.  All systems are running freebsd-current from
sometime in the last month, although similar systems running
8.0-RELEASE exhibit exactly the same problem.  rpc.lockd on an NFS
client is doing something that baffles my mind entirely, maybe you can
help.  Please bear with me, this is a little complicated, but I wanted
to include all the details.

I have a host, let's call it dune.  dune is at 10.0.0.5.  dune is also
the master for the carp interface 10.0.0.99.  It is running rpc.lockd
and is an nfs server.  I've told nfs, rpcbind, lockd and statd to only
listen on address 10.0.0.99.

I have a second host.  maud-dib is 10.0.0.8.  I do mount
10.0.0.99:/dune /dune on maud-dib.  Wireshark shows all the traffic
going to 10.0.0.99.  All is happy in the world.  When I start, there's
no ARP entry for 10.0.0.5 on 10.0.0.8, nor is there after the mount.

Until I do the following 'lockf /dune/imp/junk ls' (I have write perms
to /dune/imp).  At this point, rpc.lockd hangs.  I get the message
'10.0.0.99:/dune: lockd not responding', which seems odd.  lockd is
really there.  However, wireshark shows the NLM traffic going to IP
address 10.0.0.5.  maud-dib has no carp interfaces.

That's odd.  So my question is 'how does lockd know where to go to
talk the NLM protocol?'

I did a packet capture from before I did the mount on maud-dib.  I can
see the NFS mount, the NFS traffic, all to 10.0.0.99.  I then see an
ARP for 10.0.0.5, followed by the NLM request from 10.0.0.8 to
10.0.0.5.  This gets an ICMP port unreachable message, since I told
nfs, et al, to bind only to 10.0.0.99.

So, I thought, 'the answer is obvious, I'll just look for the packet
that has the string "dune" in it' (which is the hostname of 10.0.0.5).
No packets have that string in it, other than the mount packet which
has /dune in it.  Nor is there any DNS activity doing a lookup.  Nor
is there any static mapping in /etc/hosts on 10.0.0.8.

Next thought: Oh, somebody like portmapper or the NFS protocol from
10.0.0.99 is telling 10.0.0.8's rpc.lockd (or something else) to do
locking requests to 10.0.0.5.  That's trivial to find, I think to
myself.  I'll look for the octets 0a 00 00 05 (hex).  The only
instances of that are in the ARP packet, the NLM request and the ICMP
unreachable packets.  No other packets include these bytes.  Nor do
any include the reverse.

Right after the mount, there's nothing in the connection table that
points to 10.0.0.5, only 10.0.0.99.

So I'm having a serious WTF moment.  How the heck is this even
possible?  Any ideas on where to look for where this gets set and/or
communicated?

thanks a bunch for any insight that you can give...

Warner


Re: How does rpc.lockd know where to send a request

2010-02-06 Thread M. Warner Losh
In message: 4b6e2b40.1070...@elischer.org
Julian Elischer jul...@elischer.org writes:
: M. Warner Losh wrote:
:  I have a problem.  All systems are running freebsd-current from
:  sometime in the last month, although similar systems running
:  8.0-RELEASE exhibit exactly the same problem.  rpc.lockd on an NFS
:  client is doing something that baffles my mind entirely, maybe you can
:  help.  Please bear with me, this is a little complicated, but I wanted
:  to include all the details.
:  I have a host, let's call it dune.  dune is at 10.0.0.5.  dune is also
:  the master for the carp interface 10.0.0.99.  It is running rpc.lockd
:  and is an nfs server.  I've told nfs, rpcbind, lockd and statd to only
:  listen on address 10.0.0.99.
:  I have a second host.  maud-dib is 10.0.0.8.  I do mount
:  10.0.0.99:/dune /dune on maud-dib.  Wireshark shows all the traffic
:  going to 10.0.0.99.  All is happy in the world.  When I start, there's
:  no ARP entry for 10.0.0.5 on 10.0.0.8, nor is there after the mount.
:  Until I do the following 'lockf /dune/imp/junk ls' (I have write perms
:  to /dune/imp).  At this point, rpc.lockd hangs.  I get the message
:  10.0.0.99:/dune: lockd not responding which seems odd.  lockd is
:  really there.  However, wireshark shows the NLM traffic going to IP
:  address 10.0.0.5.  maud-dib has no carp interfaces.
:  That's odd.  So my question is 'how does lockd know where to go to
:  talk the NLM protocol?'
:  
: 
: my recollection is that maud-dib will send an initial packet to dune
: and dune will respond but that the response may come from 10.0.0.5,
: after which maud-dib will redirect all requests there, which will not
: work because dune is not listening there.

But wouldn't the response from 10.0.0.5 mean I could search for the
hex string and see 0a000005 in the packet header?

: the problem is that dune's daemon is setting a local address of
: INADDR_ANY (0.0.0.0) which tells the packets to use a from
: address that is the address of the interface that they exit from.

No, dune's daemon is sitting on 10.0.0.99.

: Since 10.0.0.5 is the primary address on that interface, that gets
: selected.
: you may try some trickery where you add the .5 address AFTER the .99
: address so that the .99 is the primary address.

Normally, I'd believe you.  But since there's nothing listening on the
* address, and also nothing listening on the 10.0.0.5 address, I'm
less sure.  After looking at the wireshark dump, I don't see any
10.0.0.5 packets until the ARP for it near the end of the trace.

http://people.freebsd.org/~imp/wireshark.dat if you are interested.

This is a good theory, and I'll have to look into it deeper.

Warner


:  I did a packet capture from before I did the mount on maud-dib.  I can
:  see the NFS mount, the NFS traffic, all to 10.0.0.99.  I then see an
:  ARP for 10.0.0.5, followed by the NLM request from 10.0.0.8 to
:  10.0.0.5.  This gets an ICMP port unreachable message, since I told
:  nfs, et al, to bind only to 10.0.0.99.
:  So, I thought, 'the answer is obvious, I'll just look for the packet
:  that has the string 'dune' in it (which is the hostname of 10.0.0.5).
:  No packets have that string in it, other than the mount packet which
:  has /dune in it.  Nor is there any DNS activity doing a lookup.  Nor
:  is there any static mapping in /etc/hosts on 10.0.0.8.
:  Next thought: Oh, somebody like portmapper or the NFS protocol from
:  10.0.0.99 is telling 10.0.0.8's rpc.lockd (or something else) to do
:  locking requests to 10.0.0.5.  That's trivial to find, I think to
:  myself.  I'll look for the octets 0a 00 00 05 (hex).  The only
:  instances of that are in the ARP packet, the NLM request and the ICMP
:  unreachable packets.  No other packets includes these bytes.  Nor do
:  any include the reverse.
:  Right after the mount, there's nothing in the connection table that
:  points to 10.0.0.5, only 10.0.0.99.
:  So I'm having a serious WTF moment.  How the heck is this even
:  possible.  Any ideas on where to look for where this gets set and/or
:  communicated?
:  thanks a bunch for any insight that you can give...
:  Warner
: 
: 


Re: How does rpc.lockd know where to send a request

2010-02-06 Thread M. Warner Losh
In message: 4b6e2b40.1070...@elischer.org
Julian Elischer jul...@elischer.org writes:
: M. Warner Losh wrote:
:  I have a problem.  All systems are running freebsd-current from
:  sometime in the last month, although similar systems running
:  8.0-RELEASE exhibit exactly the same problem.  rpc.lockd on an NFS
:  client is doing something that baffles my mind entirely, maybe you can
:  help.  Please bear with me, this is a little complicated, but I wanted
:  to include all the details.
:  I have a host, let's call it dune.  dune is at 10.0.0.5.  dune is also
:  the master for the carp interface 10.0.0.99.  It is running rpc.lockd
:  and is an nfs server.  I've told nfs, rpcbind, lockd and statd to only
:  listen on address 10.0.0.99.
:  I have a second host.  maud-dib is 10.0.0.8.  I do mount
:  10.0.0.99:/dune /dune on maud-dib.  Wireshark shows all the traffic
:  going to 10.0.0.99.  All is happy in the world.  When I start, there's
:  no ARP entry for 10.0.0.5 on 10.0.0.8, nor is there after the mount.
:  Until I do the following 'lockf /dune/imp/junk ls' (I have write perms
:  to /dune/imp).  At this point, rpc.lockd hangs.  I get the message
:  10.0.0.99:/dune: lockd not responding which seems odd.  lockd is
:  really there.  However, wireshark shows the NLM traffic going to IP
:  address 10.0.0.5.  maud-dib has no carp interfaces.
:  That's odd.  So my question is 'how does lockd know where to go to
:  talk the NLM protocol?'
:  
: 
: my recollection is that maud-dib will send an initial packet to dune
: and dune will respond but that the response may come from 10.0.0.5,
: after which maud-dib will redirect all requests there, which will not
: work because dune is not listening there.
: 
: the problem is that dune's daemon is setting a local address of
: INADDR_ANY (0.0.0.0) which tells the packets to use a from
: address that is the address of the interface that they exit from.
: 
: Since 10.0.0.5 is the primary address on that interface, that gets
: selected.
: you may try some trickery where you add the .5 address AFTER the .99
: address so that the .99 is the primary address.

Actually, it looks like this is getting returned, as an ASCII string
'10.0.0.5' in frame 68 in response to the GETADDR call.  Since I've
told it specifically '-h 10.0.0.99' I'd have thought it would respect
that.  Since it is supposed to be bound to 10.0.0.99, I'd proffer the
argument this is a bug in rpcbind's implementation of GETADDR.

I never would have thought it would have been returned as an ASCII
string, but you live and learn, eh?
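
For anyone else chasing this: rpcbind's GETADDR reply carries a "universal address", which is text, not binary.  For IPv4 the usual form is h1.h2.h3.h4.p1.p2, where the last two numbers are the high and low bytes of the port.  A sketch of decoding one follows; parse_uaddr_demo is a hypothetical helper written for this note, and the port digits in the usage example are made up, since only the '10.0.0.5' part comes from the trace.

```c
#include <stdio.h>

/*
 * Parse an IPv4 universal address of the form "h1.h2.h3.h4.p1.p2".
 * On success store the four address octets and the port and return 0;
 * return -1 if the string does not have that shape.
 */
static int
parse_uaddr_demo(const char *uaddr, unsigned char addr[4], unsigned *port)
{
	unsigned h[4], p[2];
	char extra;
	int i;

	/* The trailing %c only converts if junk follows the sixth number. */
	if (sscanf(uaddr, "%u.%u.%u.%u.%u.%u%c",
	    &h[0], &h[1], &h[2], &h[3], &p[0], &p[1], &extra) != 6)
		return (-1);
	for (i = 0; i < 4; i++) {
		if (h[i] > 255)
			return (-1);
		addr[i] = (unsigned char)h[i];
	}
	if (p[0] > 255 || p[1] > 255)
		return (-1);
	*port = p[0] * 256 + p[1];
	return (0);
}
```

With an input such as "10.0.0.5.3.255" this yields the address 10.0.0.5 and port 1023.  On the wire the address appears as the characters '1', '0', '.', and so on, which is why a search for the octets 0a 00 00 05 finds nothing in that reply.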

Now, on to fixing the bug.

Warner

P.S. http://people.freebsd.org/~imp/wireshark.dat has the trace I'm
referring to (and I've posted it in another message on this thread).

:  I did a packet capture from before I did the mount on maud-dib.  I can
:  see the NFS mount, the NFS traffic, all to 10.0.0.99.  I then see an
:  ARP for 10.0.0.5, followed by the NLM request from 10.0.0.8 to
:  10.0.0.5.  This gets an ICMP port unreachable message, since I told
:  nfs, et al, to bind only to 10.0.0.99.
:  So, I thought, 'the answer is obvious, I'll just look for the packet
:  that has the string 'dune' in it (which is the hostname of 10.0.0.5).
:  No packets have that string in it, other than the mount packet which
:  has /dune in it.  Nor is there any DNS activity doing a lookup.  Nor
:  is there any static mapping in /etc/hosts on 10.0.0.8.
:  Next thought: Oh, somebody like portmapper or the NFS protocol from
:  10.0.0.99 is telling 10.0.0.8's rpc.lockd (or something else) to do
:  locking requests to 10.0.0.5.  That's trivial to find, I think to
:  myself.  I'll look for the octets 0a 00 00 05 (hex).  The only
:  instances of that are in the ARP packet, the NLM request and the ICMP
:  unreachable packets.  No other packets includes these bytes.  Nor do
:  any include the reverse.
:  Right after the mount, there's nothing in the connection table that
:  points to 10.0.0.5, only 10.0.0.99.
:  So I'm having a serious WTF moment.  How the heck is this even
:  possible.  Any ideas on where to look for where this gets set and/or
:  communicated?
:  thanks a bunch for any insight that you can give...
:  Warner
: 
: 


Re: [ed] link state constantly going down and up

2009-04-30 Thread M. Warner Losh
In message: 261c2970090434k4bc02635m2729a0a54c09c...@mail.gmail.com
Miki miki@gmail.com writes:
: 2009/4/29 M. Warner Losh i...@bsdimp.com
: 
:  : I have a problem with a D-Link DFE-670TXD which is handled by if_ed :
:  : the link state is constantly going down and up :
:  : Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN
:  : Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP
:  ...
:  : the problem appears with the following commit :
:  : SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126)
:  : I do not see any link state change if I revert the commit.
: 
:  Doh!
: 
:  I needed to force auto negotiation for other cards to work.  Let me
:  see if I can dig up the DFE-670TXD and go from there...  Are you also
:  seeing really horrible network performance as well?  Do you see this
:  only under load, or just at idle?
: 
:  Warner
: 
: 
: Yes network performance suffers from this. The problem only appears
: under load but not when idle.

Thanks.  I'll try to reproduce it here.  I noticed this on one of the
cards, but had trouble reproducing it, but I'll try harder.

Warner


Re: [ed] link state constantly going down and up

2009-04-30 Thread M. Warner Losh
In message: 261c29700904300029s6757d39ei86fbf69ef816f...@mail.gmail.com
Miki miki@gmail.com writes:
: 2009/4/30 M. Warner Losh i...@bsdimp.com
: 
:  In message: 261c2970090434k4bc02635m2729a0a54c09c...@mail.gmail.com
: Miki miki@gmail.com writes:
:  : 2009/4/29 M. Warner Losh i...@bsdimp.com
:  :
:  :  : I have a problem with a D-Link DFE-670TXD which is handled by if_ed :
:  :  : the link state is constantly going down and up :
:  :  : Apr 28 14:21:33 iut-mir-o kernel: ed0: link state changed to DOWN
:  :  : Apr 28 14:21:35 iut-mir-o kernel: ed0: link state changed to UP
:  :  ...
:  :  : the problem appears with the following commit :
:  :  : SVN rev 190643 on 2009-04-02 16:58:45Z by imp (CVS rev 1.126)
:  :  : I do not see any link state change if I revert the commit.
:  : 
:  :  Doh!
:  : 
:  :  I needed to force auto negotiation for other cards to work.  Let me
:  :  see if I can dig up the DFE-670TXD and go from there...  Are you also
:  :  seeing really horrible network performance as well?  Do you see this
:  :  only under load, or just at idle?
:  : 
:  :  Warner
:  : 
:  :
:  : Yes network performance suffers from this. The problem only appears
:  : under load but not when idle.
: 
:  Thanks.  I'll try to reproduce it here.  I noticed this on one of the
:  cards, but had trouble reproducing it, but I'll try harder.
: 
:  Warner
: 
: 
: I can easily reproduce this by downloading an ISO image via ftp
: and doing a checkout of a subversion repository

OK.  I'll try that.  Do you know if you are able to trigger it with
ttcp too?  That's where I saw the odd symptoms before...

Warner



Small change to ukphy

2009-04-01 Thread M. Warner Losh
I've encountered a number of PHY chips that need auto negotiation
kicked off to come out of ISO state.  This makes sense, because the
ukphy driver never seems to take the PHY out of isolation state
otherwise.

Index: ukphy.c
===================================================================
--- ukphy.c (revision 190463)
+++ ukphy.c (working copy)
@@ -146,6 +146,7 @@
         sc->mii_phy = ma->mii_phyno;
         sc->mii_service = ukphy_service;
         sc->mii_pdata = mii;
+        sc->mii_flags |= MIIF_FORCEANEG;
 
         mii->mii_instance++;
 

This forces auto negotiation.  The reason for this is that it takes it
out of ISO state (Isolate).  Once out of that state, things work
well.  The question I have is: will we properly go back into ISO state
for PHYs that should be isolated?

NetBSD has many of its NIC drivers setting this flag.  Their APIs
allow them to set this directly at mii attach time.  Ours don't, so
none of our drivers set this flag.

The other fix for this might be:
Index: mii_physubr.c
===================================================================
--- mii_physubr.c (revision 190463)
+++ mii_physubr.c (working copy)
@@ -113,7 +113,9 @@
         int bmcr, anar, gtcr;
 
         if (IFM_SUBTYPE(ife->ifm_media) == IFM_AUTO) {
-                if ((PHY_READ(sc, MII_BMCR) & BMCR_AUTOEN) == 0 ||
+                bmcr = PHY_READ(sc, MII_BMCR);
+                if ((bmcr & BMCR_AUTOEN) == 0 ||
+                    (bmcr & BMCR_ISO) ||
                     (sc->mii_flags & MIIF_FORCEANEG))
                         (void) mii_phy_auto(sc);
                 return;

Which says that if auto negotiation is enabled and ISO is set, go
ahead and kick off an auto negotiation.  I'm less sure of this path,
but it is an alternative.  Otherwise, we never write to the BMCR to
take the device out of isolation.  If there's a better place to do
this, then I'm all ears.
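
Pulled out of the driver, the decision in that second patch can be sketched as a plain predicate.  The two BMCR bit values are the standard IEEE 802.3 clause 22 ones; FLAG_FORCEANEG_DEMO is a placeholder bit for this sketch only and is not the real MIIF_FORCEANEG value, and the function name is hypothetical.

```c
/* IEEE 802.3 clause 22 basic mode control register bits. */
#define BMCR_ISO_DEMO       0x0400	/* PHY is isolated from the MII */
#define BMCR_AUTOEN_DEMO    0x1000	/* auto negotiation is enabled */

/* Placeholder flag bit for this sketch; not the real MIIF_FORCEANEG. */
#define FLAG_FORCEANEG_DEMO 0x0001

/*
 * Return nonzero when auto negotiation should be (re)started:
 * it is not enabled yet, or the PHY is still isolated, or the
 * driver asked for it to be forced.
 */
static int
should_start_autoneg_demo(unsigned bmcr, unsigned flags)
{
	return ((bmcr & BMCR_AUTOEN_DEMO) == 0 ||
	    (bmcr & BMCR_ISO_DEMO) != 0 ||
	    (flags & FLAG_FORCEANEG_DEMO) != 0);
}
```

The case the unpatched code misses is the second one: auto negotiation already enabled, ISO still set, and no force flag, so nothing ever writes the BMCR.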

Either one of these hacks makes several PC Cards that I have start to
work...  In fact, I'm starting to approach 100% (up from 50%) of my
ed-based PC Cards working with this simple change (and others to the
ed driver).  I know that these cards are a little behind the leading
edge, but I'd like to get them working since I've put a few hours into
investigating things here.

Comments?

Warner



Re: Small change to ukphy

2009-04-01 Thread M. Warner Losh
In message: 20090401100939.gb12...@michelle.cdnetworks.co.kr
Pyun YongHyeon pyu...@gmail.com writes:
: On Wed, Apr 01, 2009 at 01:32:46AM -0600, M. Warner Losh wrote:
:  I've encountered a number of PHY chips that need auto negotiation
:  kicked off to come out of ISO state.  This makes sense, because the
:  ukphy driver never seems to take the PHY out of isolation state
:  otherwise.
:  
:  Index: ukphy.c
:  ===================================================================
:  --- ukphy.c (revision 190463)
:  +++ ukphy.c (working copy)
:  @@ -146,6 +146,7 @@
:          sc->mii_phy = ma->mii_phyno;
:          sc->mii_service = ukphy_service;
:          sc->mii_pdata = mii;
:  +       sc->mii_flags |= MIIF_FORCEANEG;
:   
:          mii->mii_instance++;
:   
:  
:  This forces auto negotiation.  The reason for this is that it takes it
:  out of ISO state (Isolate).  Once out of that state, things work
: 
: If the purpose is to take PHY out of isolated state couldn't this
: be handled in ifm_change_cb_t handler of parent interface? I guess
: the callback can reset the PHY and subsequent mii_mediachg() call
: may start auto-negotiation.

This callback isn't called.  The problem is that the PHY is in ISO
state.  Since it is in ISO state with auto negotiation enabled, we
never kick off an explicit auto negotiation, so the state never
changes so we never get this callback...

:  well.  The question I have is will we properly go back into ISO state
:  for PHYs that should be isolated.
:  
: 
: If the PHY requires special handling for ISO state in reset it may
: need separated PHY driver as ukphy(4) does not set MIIF_NOISOLATE. 
: As you said it would be really great if we have a generic way to
: pass various MII flags or driver specific information to mii(4).

This seems to be a common quirk.  I'd hate to have a driver that's
just ukphy but with the one line added above and play whack-a-mole with
all the odd-balls that are out there.  Doesn't seem like a strategy
that will win the day.

I think we have a way to do this...  I could do the following in my
attach routine:

        mii = device_get_softc(sc->miibus);
        LIST_FOREACH(miisc, &mii->mii_phys, mii_list) {
                miisc->mii_flags |= MIIF_FORCEANEG;
                mii_phy_reset(miisc);
        }
        mii_mediachg(mii);

which is similar to what fxp does in its change routine (it is what I
put in my status change routine).  Also MIIF_NOISOLATE works as well.

Is the above too insane?

Warner


:  NetBSD has many of its NIC drivers setting this flag.  Their APIs
:  allow them to set this directly at mii attach time.  Ours don't, so
:  none of our drivers set this flag.
:  
:  The other fix for this might be:
:  Index: mii_physubr.c
:  ===================================================================
:  --- mii_physubr.c (revision 190463)
:  +++ mii_physubr.c (working copy)
:  @@ -113,7 +113,9 @@
:          int bmcr, anar, gtcr;
:   
:          if (IFM_SUBTYPE(ife->ifm_media) == IFM_AUTO) {
:  -               if ((PHY_READ(sc, MII_BMCR) & BMCR_AUTOEN) == 0 ||
:  +               bmcr = PHY_READ(sc, MII_BMCR);
:  +               if ((bmcr & BMCR_AUTOEN) == 0 ||
:  +                   (bmcr & BMCR_ISO) ||
:                      (sc->mii_flags & MIIF_FORCEANEG))
:                          (void) mii_phy_auto(sc);
:                  return;
:  
:  Which says that if auto negotiation is enabled, and ISO is set to go
:  ahead and kick off an auto negotiation.  I'm less sure of this path,
:  but it is an alternative.  Otherwise, we never write to the BMCR to
:  take the device out of isolation.  If there's a better place to do
:  this, then I'm all ears.
:  
:  Either one of these hacks make several PC Cards that I have start to
:  work...  In fact, I'm starting to approach 100% (up from 50%) of my
:  ed-based PC Cards working with this simple change (and others to the
:  ed driver).  I know that these cards are a little behind the leading
:  edge, but I'd like to get them working since I've put a few hours into
:  investigating things here.
:  
:  Comments?
:  
:  Warner
: 


Re: Code review

2008-08-26 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
John Baldwin [EMAIL PROTECTED] writes:
: On Monday 25 August 2008 02:23:16 am M. Warner Losh wrote:
:  I did this a few years ago when trying to track down a problem with
:  some realtek network chips that I was having problems with at Timing
:  Solutions.  I'd like to get this into the tree, since it was helpful
:  then.
:  
:  Comments?
: 
: When you are running a faster tick I think you want to only call the mii and 
: watchdog stuff once a second still.  I know this will break the tx watchdog 
: for example.  Since it's kind of tricky to manage that I think you should 
: just use a separate timer for the twister stuff.

Is this in general, or do you have a specific problem in mind with the
rl change?  In general, we're not transmitting during this exercise
and it happens only once...  Is it worth the extra hair?

Warner


Code review

2008-08-25 Thread M. Warner Losh
I did this a few years ago when trying to track down a problem with
some realtek network chips that I was having problems with at Timing
Solutions.  I'd like to get this into the tree, since it was helpful
then.

Comments?

Warner
diff -ur src/sys/pci/if_rl.c newcard/src/sys/pci/if_rl.c
--- src/sys/pci/if_rl.c 2008-08-23 22:21:15.0 -0600
+++ newcard/src/sys/pci/if_rl.c 2008-08-23 22:26:09.0 -0600
@@ -1253,18 +1253,120 @@
 }
 
 static void
+rl_twister_update(struct rl_softc *sc)
+{
+   uint16_t linktest;
+   static const uint32_t param[4][4] = {
+   {0xcb39de43, 0xcb39ce43, 0xfb38de03, 0xcb38de43},
+   {0xcb39de43, 0xcb39ce43, 0xcb39ce83, 0xcb39ce83},
+   {0xcb39de43, 0xcb39ce43, 0xcb39ce83, 0xcb39ce83},
+   {0xbb39de43, 0xbb39ce43, 0xbb39ce83, 0xbb39ce83}
+   };
+
+   /*
+    * Tune the so-called twister registers of the RTL8139.  These
+    * are used to compensate for impedance mismatches.  The
+    * method for tuning these registers is undocumented and the
+    * following procedure is collected from public sources.
+    */
+   switch (sc->rl_twister)
+   {
+   case CHK_LINK:
+       /*
+        * If we have a sufficient link, then we can proceed in
+        * the state machine to the next stage.  If not, then
+        * disable further tuning after writing sane defaults.
+        */
+       if (CSR_READ_2(sc, RL_CSCFG) & RL_CSCFG_LINK_OK) {
+           CSR_WRITE_2(sc, RL_CSCFG, RL_CSCFG_LINK_DOWN_OFF_CMD);
+           sc->rl_twister = FIND_ROW;
+       } else {
+           CSR_WRITE_2(sc, RL_CSCFG, RL_CSCFG_LINK_DOWN_CMD);
+           CSR_WRITE_4(sc, RL_NWAYTST, RL_NWAYTST_CBL_TEST);
+           CSR_WRITE_4(sc, RL_PARA78, RL_PARA78_DEF);
+           CSR_WRITE_4(sc, RL_PARA7C, RL_PARA7C_DEF);
+           sc->rl_twister = DONE;
+       }
+       break;
+   case FIND_ROW:
+       /*
+        * Read how long it took to see the echo to find the tuning
+        * row to use.
+        */
+       linktest = CSR_READ_2(sc, RL_CSCFG) & RL_CSCFG_STATUS;
+       if (linktest == RL_CSCFG_ROW3)
+           sc->rl_twist_row = 3;
+       else if (linktest == RL_CSCFG_ROW2)
+           sc->rl_twist_row = 2;
+       else if (linktest == RL_CSCFG_ROW1)
+           sc->rl_twist_row = 1;
+       else
+           sc->rl_twist_row = 0;
+       sc->rl_twist_col = 0;
+       sc->rl_twister = SET_PARAM;
+       break;
+   case SET_PARAM:
+       if (sc->rl_twist_col == 0)
+           CSR_WRITE_4(sc, RL_NWAYTST, RL_NWAYTST_RESET);
+       CSR_WRITE_4(sc, RL_PARA7C,
+           param[sc->rl_twist_row][sc->rl_twist_col]);
+       if (++sc->rl_twist_col == 4) {
+           if (sc->rl_twist_row == 3)
+               sc->rl_twister = RECHK_LONG;
+           else
+               sc->rl_twister = DONE;
+       }
+       break;
+   case RECHK_LONG:
+       /*
+        * For long cables, we have to double check to make sure we
+        * don't mistune.
+        */
+       linktest = CSR_READ_2(sc, RL_CSCFG) & RL_CSCFG_STATUS;
+       if (linktest == RL_CSCFG_ROW3)
+           sc->rl_twister = DONE;
+       else {
+           CSR_WRITE_4(sc, RL_PARA7C, RL_PARA7C_RETUNE);
+           sc->rl_twister = RETUNE;
+       }
+       break;
+   case RETUNE:
+       /* Retune for a shorter cable (try column 2) */
+       CSR_WRITE_4(sc, RL_NWAYTST, RL_NWAYTST_CBL_TEST);
+       CSR_WRITE_4(sc, RL_PARA78, RL_PARA78_DEF);
+       CSR_WRITE_4(sc, RL_PARA7C, RL_PARA7C_DEF);
+       CSR_WRITE_4(sc, RL_NWAYTST, RL_NWAYTST_RESET);
+       sc->rl_twist_row--;
+       sc->rl_twist_col = 0;
+       sc->rl_twister = SET_PARAM;
+       break;
+
+   case DONE:
+       break;
+   }
+   
+}
+
+static void
 rl_tick(void *xsc)
 {
struct rl_softc *sc = xsc;
struct mii_data *mii;
+   int ticks;
 
RL_LOCK_ASSERT(sc);
mii = device_get_softc(sc->rl_miibus);
mii_tick(mii);
+   if (sc->rl_twister != DONE)
+       rl_twister_update(sc);
+   if (sc->rl_twister != DONE)
+       ticks = hz / 10;
+   else
+       ticks = hz;
 
rl_watchdog(sc);
 
-   callout_reset(&sc->rl_stat_callout, hz, rl_tick, sc);
+   callout_reset(&sc->rl_stat_callout, ticks, rl_tick, sc);
 }
 
 #ifdef DEVICE_POLLING
@@ -1490,6 +1592,13 @@
rl_stop(sc);
 
/*
+* Reset 

Code review request

2008-08-24 Thread M. Warner Losh
I've been shepherding this patch in my p4 tree for a long time.  It
removes the obsolete support for other systems in if_spppsubr.c.  Is
there a reason I shouldn't commit this?

Warner
Index: if_spppsubr.c
===
--- if_spppsubr.c   (revision 182085)
+++ if_spppsubr.c   (working copy)
@@ -23,38 +23,22 @@
 
 #include <sys/param.h>
 
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
 #include "opt_inet.h"
 #include "opt_inet6.h"
 #include "opt_ipx.h"
-#endif
 
-#ifdef NetBSD1_3
-#  if NetBSD1_3 > 6
-#  include "opt_inet.h"
-#  include "opt_inet6.h"
-#  include "opt_iso.h"
-#  endif
-#endif
-
 #include <sys/systm.h>
 #include <sys/kernel.h>
 #include <sys/module.h>
 #include <sys/sockio.h>
 #include <sys/socket.h>
 #include <sys/syslog.h>
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
 #include <sys/random.h>
-#endif
 #include <sys/malloc.h>
 #include <sys/mbuf.h>
 #include <sys/vimage.h>
 
-#if defined (__OpenBSD__)
-#include <sys/md5k.h>
-#else
 #include <sys/md5.h>
-#endif
 
 #include <net/if.h>
 #include <net/netisr.h>
@@ -65,10 +49,6 @@
 #include <netinet/ip.h>
 #include <net/slcompress.h>
 
-#if defined (__NetBSD__) || defined (__OpenBSD__)
-#include <machine/cpu.h> /* XXX for softnet */
-#endif
-
 #include <machine/stdarg.h>
 
 #include <netinet/in_var.h>
@@ -82,11 +62,7 @@
 #include <netinet6/scope6_var.h>
 #endif
 
-#if defined (__FreeBSD__) || defined (__OpenBSD__)
-# include <netinet/if_ether.h>
-#else
-# include <net/ethertypes.h>
-#endif
+#include <netinet/if_ether.h>
 
 #ifdef IPX
 #include <netipx/ipx.h>
@@ -95,12 +71,7 @@
 
 #include <net/if_sppp.h>
 
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
-# define IOCTL_CMD_T   u_long
-#else
-# define IOCTL_CMD_T   int
-#endif
-
+#define IOCTL_CMD_T   u_long
 #define MAXALIVECNT 3   /* max. alive packets */
 
 /*
@@ -261,13 +232,8 @@
void    (*scr)(struct sppp *sp);
 };
 
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3 && __FreeBSD_version < 501113
-#define SPP_FMT         "%s%d: "
-#define SPP_ARGS(ifp)   (ifp)->if_name, (ifp)->if_unit
-#else
 #define SPP_FMT         "%s: "
 #define SPP_ARGS(ifp)   (ifp)->if_xname
-#endif
 
 #define SPPP_LOCK(sp) \
do { \
@@ -1422,11 +1388,7 @@
++sp->pp_loopcnt;
 
/* Generate new local sequence number */
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
sp->pp_seq[IDX_LCP] = random();
-#else
-   sp->pp_seq[IDX_LCP] ^= time.tv_sec ^ time.tv_usec;
-#endif
break;
}
sp-pp_loopcnt = 0;
@@ -2671,11 +2633,7 @@
if (magic == ~sp->lcp.magic) {
if (debug)
log(-1, "magic glitch ");
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
sp->lcp.magic = random();
-#else
-   sp->lcp.magic = time.tv_sec + time.tv_usec;
-#endif
} else {
sp->lcp.magic = magic;
if (debug)
@@ -2856,11 +2814,7 @@
 
if (sp->lcp.opts & (1 << LCP_OPT_MAGIC)) {
if (! sp->lcp.magic)
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
sp->lcp.magic = random();
-#else
-   sp->lcp.magic = time.tv_sec + time.tv_usec;
-#endif
opt[i++] = LCP_OPT_MAGIC;
opt[i++] = 6;
opt[i++] = sp->lcp.magic >> 24;
@@ -4383,15 +4337,7 @@
 
/* Compute random challenge. */
ch = (u_long *)sp->myauth.challenge;
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
read_random(&seed, sizeof seed);
-#else
-   {
-   struct timeval tv;
-   microtime(&tv);
-   seed = tv.tv_sec ^ tv.tv_usec;
-   }
-#endif
ch[0] = seed ^ random();
ch[1] = seed ^ random();
ch[2] = seed ^ random();
@@ -4900,17 +4846,7 @@
 * aliases don't make any sense on a p2p link anyway.
 */
si = 0;
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link)
-#elif defined(__NetBSD__) || defined (__OpenBSD__)
-   for (ifa = TAILQ_FIRST(&ifp->if_addrlist);
-ifa;
-ifa = TAILQ_NEXT(ifa, ifa_list))
-#else
-   for (ifa = ifp->if_addrlist;
-ifa;
-ifa = ifa->ifa_next)
-#endif
if (ifa->ifa_addr->sa_family == AF_INET) {
si = (struct sockaddr_in *)ifa->ifa_addr;
sm = (struct sockaddr_in *)ifa->ifa_netmask;
@@ -4949,17 +4885,7 @@
 * aliases don't make any sense on a p2p link anyway.
 */
si = 0;
-#if defined(__FreeBSD__) && __FreeBSD__ >= 3
TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link)
-#elif defined(__NetBSD__) || defined (__OpenBSD__)
-   for (ifa = TAILQ_FIRST(&ifp->if_addrlist);
-ifa;
-

Re: Code review request

2008-08-24 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Roman Kurakin [EMAIL PROTECTED] writes:
: M. Warner Losh wrote:
:  I've been shepherding this patch in my p4 tree for a long time.  It
:  removes the obsolete support for other systems in if_spppsubr.c.  Is
:  there a reason I shouldn't commit this?
:
: It was there to ease the keeping code in sync with other systems.
: Please ask Joerg Wunsch before removal.

Yea, I knew that history.  But there's been a lot of hacking on this
file, and the ifdef's are for other systems that were contemporaneous
with FreeBSD 3.0, but nothing newer.  Plus the FreeBSDisms that are
newer haven't been ifdef'd.

Warner


Re: HEAD UP: non-MPSAFE network drivers to be disabled

2008-06-01 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Robert Watson [EMAIL PROTECTED] writes:
: 
: On Mon, 26 May 2008, Bruce M. Simpson wrote:
: 
:  Given that this is (a) 2008 and (b) 8.x we're talking about, are there 
:  really that many consumers of SLIP to warrant it being carried forward at 
:  all?
: 
:  It's kind of a basic. [C]SLIP has been historically handy to have around for 
:  situations which warrant it. Mind you, given that we have had tun(4) in the 
:  tree for years now, a userland implementation of SLIP is possible.
: 
:  As with all of these things it's down to someone sitting down and doing it.
: 
:  I'm not volunteering to support any of this as I don't use it myself (got 
:  enough on my plate), merely pointing out that support for SLIP in a system 
:  is something many people have taken for granted over the years, and for 
:  prototyping something or providing IP over a simple serial link without the 
:  configuration overhead of PPP, SLIP is something someone might be using.
: 
:  P.S. ahc(4) is commodity hardware, I think it can stay right where it is 
:  thank-you.
: 
: My suspicion is that getting SLIP basically working in userspace is fairly 
: straight forward,

SLiRP and friends have been doing this for years.  I made my living
for about a year working on TIA, which was a portable, userland
implementation of PPP and SLIP/CSLIP.  This was in about 1995 or so.
It isn't that hard...

: SLIP has its subtleties, but the current implementation is relatively 
: straight-forward, well-documented, etc.

Yes, especially CSLIP.  But frankly, they are a whole lot easier than
PPP to get up and going...

Warner


Re: HEAD UP: non-MPSAFE network drivers to be disabled

2008-06-01 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Bruce M. Simpson [EMAIL PROTECTED] writes:
: Other than that, line disciplines can go away.

In the past I've used line disciplines to implement a keyboard driver
for the Apple Newton Keyboard (serial protocol) so I could use it at
any point after the loader (the system didn't run X11, so I couldn't
use the X11 driver I wrote there).  This system has been retired, and
I don't think I ever forward ported them past about 3.mumble, if even
that far.

This code is badly decayed, and I have no requirement that it
continues to work.  But I know similar techniques are used in some
embedded systems.  Expect some delayed complaining if they go away
entirely.  But that may be OK given we're ridding tty of Giant.

Now, if we could only sort out the syscons/keyboard/mouse mess...

Warner


Issue with huge numbers of connections

2007-06-17 Thread M. Warner Losh
Greetings,

I have a friend who is having problems with a service he's running.
He gets billions and billions of connections to this service a day.
Somewhere between 10^8 and 10^9 connections, he notices that his
servers lose the ability to accept new connections.  These are TCP
connections.

This is with FreeBSD 6.1R.  My first question is: does anybody know if
the fixes to -current/7.0 have fixed this?  Is there a fix that can be
back ported?  He's currently working around the problem by having a
number of different machines that reboot in a round robin fashion, but
would like a better solution.

Warner


Re: Issue with huge numbers of connections

2007-06-17 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Joe Holden [EMAIL PROTECTED] writes:
: M. Warner Losh wrote:
:  Greetings,
:  
:  I have a friend who is having problems with a service he's running.
:  He gets billions and billions of connections to this service a day.
:  Somewhere between 10^8 and 10^9 connections, he notices that his
:  servers lose the ability to accept new connections.  These are TCP
:  connections.
:  
:  This is with FreeBSD 6.1R.  My first question is: does anybody know if
:  the fixes to -current/7.0 have fixed this?  Is there a fix that can be
:  back ported?  He's currently working around the problem by having a
:  number of different machines that reboot in a round robin fashion, but
:  would like a better solution.
:  
:  Warner
:  ___
: Warner, if he hasn't done so already, have you suggested tweaking the
: sysctl variables, such as:
: kern.maxfilesperproc
: kern.ipc.nmbclusters
: kern.maxprocperuid
: kern.maxfiles
: kern.ipc.somaxconn
: kern.maxvnodes
: 
: Tweaking those may help, or he may just be exhausting available
: resources, IIRC its limited to 65k connections per interface, someone
: correct me if I am wrong.

Here's the bug report I got:

There is still a vague problem with the FreeBSD network interface --
especially the part that handles TCP. Something strange happens after
about a week or so (after handling about 10^8 or 10^9
connections). The system becomes unreachable for TCP connections. I
have fixed this problem by having all of the FreeBSD systems reboot
automatically once a week using a cron job. I have not been able to
isolate this issue, but I suspect that there is some kind of problem
with the error handling and some resource gets depleted slowly. I
realize that this is pretty vague, but I have not been able to find
out what actually happens in this case.

I believe that each connection lasts on the order of tens or hundreds of
milliseconds, given what I know about the systems in place.  My earlier
rephrase omitted a few key points.  I suggested that he try to use a newer
version of FreeBSD, but since these are production systems, he's hesitant to
mess with them...

Doing the math on 10^9 connections in a week translates to ~1650/s, so we'd
expect there are on the order of 100-200 connections steady state at any
time.  I suspect that the peak load may be up to 100 times that, which is
still only about 20,000 connections.  The hangs don't seem to happen at a
peak, but randomly.
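
The estimate above is an application of Little's law: mean concurrent connections equal the arrival rate times the mean connection lifetime.  A minimal sketch of that arithmetic follows.  The function names are hypothetical, and the 0.1 s lifetime is only an assumed midpoint of "tens or hundreds of milliseconds".

```c
#include <assert.h>

/* Average arrival rate, in connections per second, over a period. */
double arrival_rate(double connections, double period_seconds)
{
    assert(period_seconds > 0.0);
    return connections / period_seconds;
}

/* Little's law: mean concurrency = arrival rate * mean lifetime. */
double steady_state(double rate_per_second, double lifetime_seconds)
{
    assert(lifetime_seconds >= 0.0);
    return rate_per_second * lifetime_seconds;
}
```

For 10^9 connections over one week (604800 s) this gives about 1653 connections per second, and about 165 concurrent connections at a 0.1 s lifetime.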

Given all that, I'm not sure which of the above to try.

Warner




Re: FreeBSD NFS Client, Windows 2003 NFS server

2006-12-08 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Harti Brandt [EMAIL PROTECTED] writes:
: On Thu, 7 Dec 2006, Harti Brandt wrote:
: 
: HBOn Thu, 7 Dec 2006, M. Warner Losh wrote:
: HB
: HBMWLIn message: [EMAIL PROTECTED]
: HBMWLHarti Brandt [EMAIL PROTECTED] writes:
: HBMWL: MWLDoes anybody have experience with using FreeBSD 4.x or 6.x NFS clients
: HBMWL: MWLagainst a Windows 2003 NFS server?  What is the performance relative
: HBMWL: MWLto using a FreeBSD NFS server?  What is the stability?  Does locking
: HBMWL: MWLwork?  Does the Windows 2003 server have extensions that grok file
: HBMWL: MWLsystem flags?
: HBMWL: 
: HBMWL: I use this regularly (well, -CURRENT). I have no numbers, but performance 
: HBMWL: is ok. I have the home directories on a W2003k server and it 'feels' fast 
: HBMWL: enough.
: HBMWL
: HBMWLWe see FreeBSD to FreeBSD NFS feeling fast enough for most things, but
: HBMWLwhen we do a full build of our system from scratch it takes 10 hours
: HBMWLover NFS vs 1 hour on a local disk.  We're worried that if we were to
: HBMWLtry to do heavy NFS traffic to a Win2003 server with SFU this would be
: HBMWLeven slower.
: HB
: HBOk. I did a very short test (no time to do much more). Read performance 
: HBwith dd if=/nfs/bigfile of=/dev/null bs=4k is around 9MByte/sec. Write 
: HBperformance with dd if=/dev/zero of=/nfs/bigfile bs=4k is 4MByte/sec.
: HB
: HBClient is something around 1GHz with a 100Mbps link. Fileserver is a 
: HBdouble proc Xeon with a 1Gbps link. The Server has a load of around 30% 
: HB(from the antivirus scanner).
: HB
: HB72Mbps on a 100Mbps link looks actually ok for me. I've no FreeBSD on a 
: HBGigabit link to test with.
: HB
: HBIf you want I could try to do a buildworld.
: 
: Ok. To answer my own mail. A buildworld with a local /usr/src takes 2:50h
: on that machine, with /usr/src on the W2003 server 3:50h. Looks not that bad.

So we're talking 33% slower here rather than 90% slower that I see for
my entire product build.  So the speed is similar to what I've seen
over NFS here...
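
The figures in this thread can be checked with a little arithmetic.  The sketch below uses hypothetical helper names.  It converts the dd throughput to link bit rate and expresses a slowdown two ways: relative to the local build time, and as the share of the NFS build time that is overhead.  The two measures give different percentages for the same pair of timings, so numbers quoted under one are not directly comparable with numbers quoted under the other.

```c
#include <assert.h>

/* Convert MByte/s to Mbit/s (8 bits per byte). */
double mbyte_to_mbit(double mbytes_per_second)
{
    return mbytes_per_second * 8.0;
}

/* Extra time over NFS, as a percentage of the local build time. */
double percent_over_local(double local_minutes, double nfs_minutes)
{
    assert(local_minutes > 0.0);
    return (nfs_minutes - local_minutes) / local_minutes * 100.0;
}

/* Extra time over NFS, as a percentage of the NFS build time. */
double percent_of_nfs_time(double local_minutes, double nfs_minutes)
{
    assert(nfs_minutes > 0.0);
    return (nfs_minutes - local_minutes) / nfs_minutes * 100.0;
}
```

With 2:50 h local against 3:50 h on the W2003 share, this gives roughly 35% over the local time, or 26% of the NFS time.  With 1 hour local against 10 hours over NFS, the same two measures give 900% and 90%.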

Warner


Re: FreeBSD NFS Client, Windows 2003 NFS server

2006-12-07 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Harti Brandt [EMAIL PROTECTED] writes:
: 
: Hi Warner,
: 
: On Wed, 6 Dec 2006, M. Warner Losh wrote:
: 
: MWLDoes anybody have experience with using FreeBSD 4.x or 6.x NFS clients
: MWLagainst a Windows 2003 NFS server?  What is the performance relative
: MWLto using a FreeBSD NFS server?  What is the stability?  Does locking
: MWLwork?  Does the Windows 2003 server have extensions that grok file
: MWLsystem flags?
: 
: I use this regularly (well, -CURRENT). I have no numbers, but performance 
: is ok. I have the home directories on a W2003k server and it 'feels' fast 
: enough.
: 
: The only problem I see is a lot of 'file server not responding' and 'file 
: server up again' (with 2-3 seconds in between). This is usually when 
: saving a large mail from pine. Linux clients see the same problem, so I 
: suppose it is a problem on the SFU side. Locking seems to work. Problems 
: are with filenames that are illegal for NTFS - hosting a 2.11BSD source 
: tree on a W2003 NFS share does not work because of filenames containing 
: ':' :-). I've not tested what other characters are illegal.

This is excellent information.  So building a ports tree would be,
ummm, problematic.

: Another problem is that on the NTFS side there is no good way to backup, 
: copy, whatever the trees, because while NTFS handles Makefile and 
: makefile, no Windows tool can access both of them. Even worse, things like 
: ADSM backup sometimes die with internal errors.

That's good information.

: Mapping of UIDs and GIDs is rather magic. The FreeBSD side, the SFU tools 
: and cygwin all see different numbers which is rather annoying. The same is 
: with symbolic links.

Also good information.

: The file flags are not supported by the server. There are no extensions 
: that I know of.

This is the one I knew about!  The others are far more important :-)

Warner


Re: FreeBSD NFS Client, Windows 2003 NFS server

2006-12-07 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Harti Brandt [EMAIL PROTECTED] writes:
: MWLDoes anybody have experience with using FreeBSD 4.x or 6.x NFS clients
: MWLagainst a Windows 2003 NFS server?  What is the performance relative
: MWLto using a FreeBSD NFS server?  What is the stability?  Does locking
: MWLwork?  Does the Windows 2003 server have extensions that grok file
: MWLsystem flags?
: 
: I use this regularly (well, -CURRENT). I have no numbers, but performance 
: is ok. I have the home directories on a W2003k server and it 'feels' fast 
: enough.

We see FreeBSD to FreeBSD NFS feeling fast enough for most things, but
when we do a full build of our system from scratch it takes 10 hours
over NFS vs 1 hour on a local disk.  We're worried that if we were to
try to do heavy NFS traffic to a Win2003 server with SFU this would be
even slower.

: The only problem I see is a lot of 'file server not responding' and 'file 
: server up again' (with 2-3 seconds in between). This is usually when 
: saving a large mail from pine. Linux clients see the same problem, so I 
: suppose it is a problem on the SFU side.

So building large binaries might be a problem?

: Locking seems to work.

Does seems to work mean it really does work, or does SFU just do the
old trick of saying 'ok, your lock worked'?

: Problems 
: are with filenames that are illegal for NTFS - hosting a 2.11BSD source 
: tree on a W2003 NFS share does not work because of filenames containing 
: ':' :-). I've not tested what other characters are illegal.

That would be a problem for hosting a ports tree on the NTFS nfs
partition, no?

: Another problem is that on the NTFS side there is no good way to backup, 
: copy, whatever the trees, because while NTFS handles Makefile and 
: makefile, no Windows tool can access both of them. Even worse, things like 
: ADSM backup sometimes die with internal errors.

That's ugly.

: Mapping of UIDs and GIDs is rather magic. The FreeBSD side, the SFU tools 
: and cygwin all see different numbers which is rather annoying. The same is 
: with symbolic links.

Symbolic links point elsewhere?  Or have different destinations?  Does
it matter whether they are absolute or relative?

: The file flags are not supported by the server. There are no extensions 
: that I know of.

Same problem with FreeBSD to FreeBSD NFS.

Warner



FreeBSD NFS Client, Windows 2003 NFS server

2006-12-06 Thread M. Warner Losh
Does anybody have experience with using FreeBSD 4.x or 6.x NFS clients
against a Windows 2003 NFS server?  What is the performance relative
to using a FreeBSD NFS server?  What is the stability?  Does locking
work?  Does the Windows 2003 server have extensions that grok file
system flags?

Thanks much

Warner

P.S.  If this is the wrong place to ask, please suggest a better one.



Re: cvs commit: src/sys/net if_vlan.c

2006-07-04 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Brooks Davis [EMAIL PROTECTED] writes:
: and act as though the interface is not there.  We could then consider
: either holding the interface for a configurable or computed length
: of time or adding some sort of refcounting (probably impractical).

Refcounting would be good for the 'macro' things (coming and going)
that are infrequent, but we might have multiple people doing.  You are
right it likely is too inefficient to do with mbufs.  One other option
might be to have a configurable time after the last time that it was
accessed via the 'safe' routines that were setup.  This way we'd tie
the removal of the interface to a period of time after it was last
used, rather than after it was removed.  I don't know if such a
difference would matter much in practice.

The only other 'issue' that I see with this approach is if I remove a
card, and then insert it again before the timeout happens.  Does that
card get a new interface name?  And would people care or not...

Warner


Re: USB CDC ACM

2005-06-04 Thread M. Warner Losh
I believe that umodem implements the USB CDC ACM interface.

Warner


Re: mem leak in mii ? (fwd)

2005-01-22 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Bjoern A. Zeeb [EMAIL PROTECTED] writes:
: * all PHY drivers currently seem to use mii_phy_detach for
:   device_detach. If any implements his own function it will be
:   responsible for freeing the ivars allocated in miibus_probe. This
:   should perhaps be documented somewhere ?

I think that the current patches are incorrect from a newbus point of
view.  They may solve the problem, but just smell wrong...

: 
: patch can also be found at
: http://sources.zabbadoz.net/freebsd/patchset/mii-memleaks.diff
: 
: 
: Index: mii.c
: ===
: RCS file: /local/mirror/FreeBSD/r/ncvs/src/sys/dev/mii/mii.c,v
: retrieving revision 1.20
: diff -u -p -r1.20 mii.c
: --- mii.c 15 Aug 2004 06:24:40 -  1.20
: +++ mii.c 23 Nov 2004 17:08:58 -
: @@ -111,7 +111,7 @@ miibus_probe(dev)
:   struct mii_attach_args  ma, *args;
:   struct mii_data *mii;
:   device_t child = NULL, parent;
: - int bmsr, capmask = 0x;
: + int count = 0, bmsr, capmask = 0x;
: 
:   mii = device_get_softc(dev);
:   parent = device_get_parent(dev);
: @@ -145,12 +145,26 @@ miibus_probe(dev)
: 
:   args = malloc(sizeof(struct mii_attach_args),
:   M_DEVBUF, M_NOWAIT);
: + if (args == NULL) {
: + device_printf(dev, "%s: memory allocation failure, "
: + "phyno %d", __func__, ma.mii_phyno);
: + continue;
: + }
:   bcopy((char *)&ma, (char *)args, sizeof(ma));
:   child = device_add_child(dev, NULL, -1);
: + if (child == NULL) {
: + free(args, M_DEVBUF);
: + device_printf(dev, "%s: device_add_child failed",
: + __func__);
: + continue;
: + }
:   device_set_ivars(child, args);
: + count++;
: + /* XXX should we break here or is it really possible
: +  * to find more than one PHY ? */
:   }
: 
: - if (child == NULL)
: + if (count == 0)
:   return(ENXIO);
: 
:   device_set_desc(dev, "MII bus");
: @@ -173,12 +187,15 @@ miibus_attach(dev)
:    */
:   mii->mii_ifp = device_get_softc(device_get_parent(dev));
:   v = device_get_ivars(dev);
: + if (v == NULL)
: + return (ENXIO);
:   ifmedia_upd = v[0];
:   ifmedia_sts = v[1];
: + device_set_ivars(dev, NULL);
: + free(v, M_DEVBUF);
:   ifmedia_init(&mii->mii_media, IFM_IMASK, ifmedia_upd, ifmedia_sts);
: - bus_generic_attach(dev);
: 
: - return(0);
: + return (bus_generic_attach(dev));
:  }

newbusly, this is bogus.  device foo should never set its own ivars.
Nor should it ever get its own ivars to do anything with.  parent
accessor functions are needed here.

:  int
: @@ -186,8 +203,14 @@ miibus_detach(dev)
:   device_t dev;
:  {
:   struct mii_data *mii;
: + void *v;
: 
:   bus_generic_detach(dev);
: + v = device_get_ivars(dev);
: + if (v != NULL) {
: + device_set_ivars(dev, NULL);
: + free(v, M_DEVBUF);
: + }
:   mii = device_get_softc(dev);
:   ifmedia_removeall(&mii->mii_media);
:   mii->mii_ifp = NULL;

Newbusly, this is incorrect.  The parent bus should be freeing the
ivars, since it is the one that should have put the ivars there in the
first place.

: @@ -305,12 +328,15 @@ mii_phy_probe(dev, child, ifmedia_upd, i
:   int bmsr, i;
: 
:   v = malloc(sizeof(vm_offset_t) * 2, M_DEVBUF, M_NOWAIT);
: - if (v == 0) {
: + if (v == NULL)
:   return (ENOMEM);
: - }
:   v[0] = ifmedia_upd;
:   v[1] = ifmedia_sts;
:   *child = device_add_child(dev, "miibus", -1);
: + if (*child == NULL) {
: + free(v, M_DEVBUF);
: + return (ENXIO);
: + }
:   device_set_ivars(*child, v);
: 
:   for (i = 0; i < MII_NPHY; i++) {

This appears to be correct, because the ivars are set on the child
that's added.

: @@ -324,14 +350,22 @@ mii_phy_probe(dev, child, ifmedia_upd, i
:   }
: 
:   if (i == MII_NPHY) {
: + device_set_ivars(dev, NULL);
: + free(v, M_DEVBUF);
:   device_delete_child(dev, *child);
:   *child = NULL;
:   return(ENXIO);
:   }
: 
: - bus_generic_attach(dev);
: + i = bus_generic_attach(dev);
: 
: - return(0);
: + v = device_get_ivars(*child);
: + if (v != NULL) {
: + device_set_ivars(*child, NULL);
: + free(v, M_DEVBUF);
: + }
: +
: + return (i);
:  }

This appears to be correct, since it is the bus managing the child's
ivars.

:  /*
: Index: mii_physubr.c
: 

Re: cuaa0ttyd0's bug?

2004-05-19 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Bernd Walter [EMAIL PROTECTED] writes:
: On Tue, May 18, 2004 at 09:05:52PM +0800, wsk wrote:
:  Hi, folks:
:  It seems that ttyd0 is no longer the dial-in line for logins, and that
:  cuaa0 acts as both the dial-in and dial-out device, under 4.9 and above
:  and 5.x.  But ttyd0 works well under 4.8.
:  Here is my test:
:  I want to directly connect two BSD boxes via a null-modem cable.  On
:  one box, I configured the ttys as follows:
:  # The 'dialup' keyword identifies dialin lines to login, fingerd etc.
:  ttyd0 /usr/libexec/getty std.9600 dialup on secure
:  ttyd1 /usr/libexec/getty std.9600 dialup off secure
:  and ran kill -HUP 1.
:  On the other box, I used cu to connect to the first box:
:  cu -l /dev/cuaa0
:  connected
:  and then I received nothing more!
:  But if I configure the ttys as below:
:  # The 'dialup' keyword identifies dialin lines to login, fingerd etc.
:  cuaa0 /usr/libexec/getty std.9600 dialup on secure
:  ttyd1 /usr/libexec/getty std.9600 dialup off secure
:  and do the same as before, it works!
:  Why?  Is it a bug?
: 
: The behavior is intentional.
: Your cable doesn't supply the DCD signal.
: Either use a cable with DCD support or stay with cuaa.
: It's OK to use cuaa if you don't need dialout support.

or turn off modem flow control in the lock device...

Warner


Re: prism 2.5 timeout in wi_cmd 0x010b

2004-01-11 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Lorenzo Vicisano [EMAIL PROTECTED] writes:
: However, after upgrading the Prism firmware (1.1.0 - 1.1.1)
: the lockup stopped.

You really really really want to be using 1.4.5 firmware.  Older
firmware is known to be buggy.

Warner


Re: prism 2.5 timeout in wi_cmd 0x010b

2004-01-11 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Lorenzo Vicisano [EMAIL PROTECTED] writes:
: Warner,
: 
: On Sun, Jan 11, 2004 at 02:23:57PM -0700, M. Warner Losh wrote:
:  In message: [EMAIL PROTECTED]
:  Lorenzo Vicisano [EMAIL PROTECTED] writes:
:  : However, after upgrading the Prism firmware (1.1.0 - 1.1.1)
:  : the lockup stopped.
:  
:  You really really really want to be using 1.4.5 firmware.  Older
:  firmware is known to be buggy.
: 
: I was referring to the primary, as for the station
: I went from 1.4.2 to 1.7.4 .. but does the station
: firmware affect the functionality as a client anyways ?

Yes.  It does.  I've had reports of problems with newer versions of
the firmware...  I've been too swamped to try it out.  The newer
versions implement AP to AP junk, as well as WPA enhancements...

Warner


Re: finishing the if.h/if_var.h split

2003-09-29 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Brooks Davis [EMAIL PROTECTED] writes:
: Six years and eight months ago, net/if.h was split into if.h and
: if_var.h.  At the time, if_var.h was included at the end of if.h as
: follows (this is the current code, but it's equivalent):
...
: Does this sound reasonable?

I'd go ahead and finish the split.

Warner


Re: SMC 2602W PCI Wireless

2003-09-22 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Daniel Dias Goncalves [EMAIL PROTECTED] writes:
: Does the SMC 2602W PCI device work in FreeBSD?
: vendor   = 'Admtek Inc'

unlikely.  The adm wireless driver is still being ported.

Warner


Re: Netgear MA401 stopped working

2003-09-02 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Nathan Clark [EMAIL PROTECTED] writes:
: I have an IBM Thinkpad 600X which dual boots 5.1-Current and XP-Pro.  I
: purchased a Netgear MA401 wireless 802.11b card which worked fine under
: both OS's for about a week.  This past Saturday, however, I was unable
: to connect to websites, my mail server etc.  The symptoms are the same
: under both OS's: After an extended time of waiting, I eventually resolve
: the host, seem to be sending packets, but never receive anything back. 

Both OSes doing the same thing indicates a problem with the card, or a
problem with the AP.

Warner


Re: ifconfig question

2003-03-30 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Wes Peters [EMAIL PROTECTED] writes:
: Only if the spec says those are the only valid ranges.  Then we have to 
: keep up to date with changes in the spec, too.  Either some simple sanity 
: checks or checking for truly valid lengths -- 0, 40 bits, 128 bits -- 
: makes even better sense.

128 bits isn't a valid length either.  Only 40 and 104 bits, which we
currently semi-bogusly report as 64-bit and 128-bit.
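
A minimal sketch of the arithmetic behind that remark, assuming the
usual WEP layout where a 24-bit IV is added to the secret key (which is
where the advertised 64 and 128 come from).  The names below are
hypothetical and are not taken from ifconfig; the check is also
stricter than the one in ifconfig, which accepts any length from 1 to
13 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Secret key lengths in bytes for 40-bit and 104-bit WEP. */
#define WEP40_KEY_BYTES		5
#define WEP104_KEY_BYTES	13
/* The 24-bit IV turns 40/104 into the advertised 64/128. */
#define WEP_IV_BITS		24

/* True only for the two key lengths named in the message above. */
static bool
wep_key_len_valid(size_t len)
{
	return (len == WEP40_KEY_BYTES || len == WEP104_KEY_BYTES);
}

/* Secret bits actually supplied by the user. */
static unsigned
wep_secret_bits(size_t len)
{
	assert(wep_key_len_valid(len));
	return ((unsigned)len * 8);
}

/* Advertised size: secret bits plus the IV. */
static unsigned
wep_advertised_bits(size_t len)
{
	return (wep_secret_bits(len) + WEP_IV_BITS);
}
```

With these helpers a 5-byte key reports 40 secret bits (advertised as
64) and a 13-byte key reports 104 (advertised as 128); every other
length is rejected.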

Warner


ifconfig question

2003-03-29 Thread M. Warner Losh
The code that prints out the keys for the 802.11 wireless stuff has
the following in it:

void
ieee80211_status (int s, struct rt_addrinfo *info __unused)
{
...
	if (ireq.i_len == 0 || ireq.i_len > 13)
		continue;
...
}

Should that check really be there?  Newer WEP does 256 bits.  Not
that the rest of the code supports that, but I was just curious.

Second, should ifconfig report the wep key if run as root?  wicontrol
does if it is run as root, for example.  Any objections for fixing
this?

Warner


Re: Proper -current if_attach locking?

2003-01-07 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Terry Lambert [EMAIL PROTECTED] writes:
: M. Warner Losh wrote:
:  In message: [EMAIL PROTECTED]
:  Nate Lawson [EMAIL PROTECTED] writes:
:  : I was looking into some could sleep messages and found some bogus
:  : locking in the attach routine of many drivers.  Several init a mtx in
:  : their softc and then lock/unlock it in their attach routine.  This, of
:  : course, does nothing to provide exclusive access to a device.  I assume
:  : there is going to be a global IF_LOCK or something to be used in attach
:  : routines.  Can someone fill me in on the intended design?
:  
:  The locking in the attach routines is generally bogus.  Locking is
:  only needed when you have more than one thread of execution.  You
:  don't have more than one thread of execution until after you establish
:  the ISR and turn on interrupts.  We should likely not be enabling
:  interrupts until very late in the attach routine so that we don't need
:  any locking in them.
: 
: I looked at this.  It seems to me that it's not quite that
: simple (sorry).  I think that there are issues with locking
: because you don't know if this is a driver that's getting
: loaded well after the system has booted, or if this is a
: PCCARD or other hot plug device that has just arrived in
: the system.

That doesn't matter at all.  If it is a new device that's just
arrived, the attach still won't be interrupted *by other code in the
driver* until after it has setup its ISR and told the hardware to
start generating interrupts.  No device locking is needed in the
attach routine until after interrupts are enabled in the hardware.
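
A minimal userland sketch of that ordering, assuming a toy softc with
three setup steps.  Every name here (toy_softc, toy_attach, and so on)
is hypothetical; this is a model of the argument, not code from any
real driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver softc. */
struct toy_softc {
	bool resources_ready;	/* registers mapped, buffers allocated */
	bool isr_installed;	/* interrupt handler hooked up */
	bool hw_intr_enabled;	/* device allowed to raise interrupts */
	int  unlocked_writes;	/* softc writes done with no lock held */
};

/* Other driver code can only run once both conditions hold. */
static bool
toy_concurrency_possible(const struct toy_softc *sc)
{
	return (sc->isr_installed && sc->hw_intr_enabled);
}

static void
toy_attach(struct toy_softc *sc)
{
	/* Step 1: single-threaded setup, no lock needed. */
	assert(!toy_concurrency_possible(sc));
	sc->resources_ready = true;
	sc->unlocked_writes++;

	/* Step 2: install the ISR; the hardware is still silent. */
	sc->isr_installed = true;
	assert(!toy_concurrency_possible(sc));
	sc->unlocked_writes++;

	/*
	 * Step 3: enable interrupts last.  From here on, any access
	 * to the softc would need the driver lock.
	 */
	sc->hw_intr_enabled = true;
}
```

Both asserts pass because nothing else in the driver can run before
step 3, which is the reason the lock in attach buys nothing.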

: It also seems to me that the reversal problems that would
: result by simply inserting locking have to do with the fact
: that the code is relatively schizophrenic on deciding whether
: it's locking data, or it's locking a critical path.

The reversal is because of the bogus locking.  The first time through
it locks the device then the interface.  However, after that it locks
the interface and then the device, which can be bad.  It does point to
a problem, however, in that sometimes we'll take out the locks in one
order and other times other orders depending on the code path if we
aren't careful.  I should go look at the new code more closely.

I worry that in the non interrupt case we get things in the IF, DEV
order (because the IF locks, then calls the callback routines, which
then call the DEV lock).  But in the interrupt case we get the DEV
lock first, then try to queue data and that somehow causes the IF
locks to be grabbed.

But you are right, I do need to go look at the code to see what,
exactly, is happening and how the new interface locking code is
interacting with the semi-bogus locking that many of the wpaul drivers
have in them now.

: I can't be the only one who finds FreeBSD 5.x to be in such a state
: of flux that it's almost impossible to know what a correct
: implementation is supposed to look like, for a given subsystem
: and/or device driver, list, etc..

There we agree.  It takes a lot to keep up, and even then I fall
behind when something new happens behind my back.

Warner

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-net in the body of the message



Re: Proper -current if_attach locking?

2003-01-07 Thread M. Warner Losh
I was right (and I think you are too).  We do have lock issues.

dc_attach does approximately:

DC_LOCK
ether_attach()  (which does a IFNET_WLOCK/UNLOCK pair)
DC_UNLOCK

(this sets the lock order to be DC_LOCK, IFNET_WLOCK).

However in if_slowtimo we have:

if_slowtimo(arg)
{
... IFNET_RLOCK();
... if (ifp->if_watchdog)
	(*ifp->if_watchdog)(ifp);
... IFNET_RUNLOCK();
}

(and dc_watchdog does a DC_LOCK/UNLOCK pair).  This is a Lock Order
Reversal, and not a LotR :-)

What's worse is that dc_intr does:

DC_LOCK
...dc_start (which calls IF_PREPEND which does the IFNET_LOCK/UNLOCK thing)
DC_UNLOCK

So even if we remove the one from attach, it looks like we have others
lurking in the code.

Either that, or it is too late for me to be looking at code like this
:-(

Warner




Re: Proper -current if_attach locking?

2003-01-07 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Andrew Gallatin [EMAIL PROTECTED] writes:
: 
: M. Warner Losh writes:
:   In message: [EMAIL PROTECTED]
:   Andrew Gallatin [EMAIL PROTECTED] writes:
:   : The IFNET_RLOCK() called in if_slowtimo() is a global lock for the
:   : list of ifnet structs to ensure that no devices are removed or added
:   : while something may be using it.  There is one ifnet list in the system.
:   
:   So this means that only the locking in attach is bogus, and similar
:   locking in detach is also bogus because they produce lock order
:   reversals as the global lock is held to insert/remove if interfaces.
: 
: Yes.  Though I haven't looked at if_dc myself, there may be other
: locking problems.  I've only been commenting on the ones that you
: brought up.
: 
: But back to an earlier point.  Somebody (you?) validly pointed out
: that the driver should not be callable and should not generate
: interrupts until its finished attaching.  The lock in its attach was
: probably a somewhat misguided attempt at that.  

Yes.  That was me.  There are some drivers that have separated
front/back ends, which makes this harder, but most of them don't.

: The first point can be accomplished by doing the ether_ifattach()
: last, but the second may be harder.  I do that by poking a bit on my
: card which prevents it from generating interrupts while the device is
: being setup.  Not sure if a similar bit exists on tulip cards.

All PCI cards have to be able to turn off their interrupt sources,
otherwise interrupt storms result.  At least that's my understanding.

Warner




Re: Proper -current if_attach locking?

2003-01-06 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Nate Lawson [EMAIL PROTECTED] writes:
: I was looking into some could sleep messages and found some bogus
: locking in the attach routine of many drivers.  Several init a mtx in
: their softc and then lock/unlock it in their attach routine.  This, of
: course, does nothing to provide exclusive access to a device.  I assume
: there is going to be a global IF_LOCK or something to be used in attach
: routines.  Can someone fill me in on the intended design?

The locking in the attach routines is generally bogus.  Locking is
only needed when you have more than one thread of execution.  You
don't have more than one thread of execution until after you establish
the ISR and turn on interrupts.  We should likely not be enabling
interrupts until very late in the attach routine so that we don't need
any locking in them.

Warner




Re: ep(4) does not support mediaopt full-duplex

2002-12-30 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Eugene Grosbein [EMAIL PROTECTED] writes:
: Is there a good reason for
: not supporting full-duplex? 

I don't think 3c589 supports full duplex at the hardware level.

Warner




Re: ep(4) does not support mediaopt full-duplex

2002-12-30 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Mike Silbersack [EMAIL PROTECTED] writes:
:  So I conclude 3C589D does support full duplex at the hardware level.
: 
: You say you're using a 3C589D, which is detected by if_ep_pccard.c as
: such:
: 
: case 0x9058: /* 3C589 */
: return ("3Com Etherlink III 3C589");
: 
: However, in if_xlreg.h, I see us define the following:
: 
: #define TC_DEVICEID_CYCLONE_10_100_COMBO 0x9058
: 
: Are these actually both the same card, or are PCI and PCCARD IDs
: unrelated?  If they are the same card, then clearly the solution is to
: make if_xl work with the card instead of if_ep. :)

These aren't the same.

The vx driver should really be if_ep_pci according to Matt Dodd.
if_xl might also be an if_ep_pci driver, but I'm not 100% sure about
that.  Matt knows for sure.

Warner




Re: ep(4) does not support mediaopt full-duplex

2002-12-30 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Mike Silbersack [EMAIL PROTECTED] writes:
: From what I've read, the vortex cards do resemble the older Etherlink
: IIIs, so yes those two could probably be merged.

They are merged in NetBSD.  I think Matt Dodd has some patches that do
this.

: However, the 3c900+
: cards have a true bus mastering PCI architecture, so it makes sense to fit
: them into a separate driver, such as if_xl.

Just because it is true bus mastering doesn't mean that it makes sense
to have a different driver...  I haven't looked at the structure of
these two drivers to know if this makes sense or not.  I was mostly
speculating out loud...

: Here's another thing I'm confused about:
: 
: from if_xlreg.h:
: 
: #define TC_DEVICEID_HURRICANE_556   0x6055
: 
: from if_ep_pccard.c:
: 
: case 0x6055: /* 3C556 */
: 
: Is it possible that 3Com used the same chip in mini-pci and pccard
: designs?  This does seem possible, as 3c905 (pre-B, I don't know about the
: mini-PCI version) cards support the 3c509 interface.

Likely just a compat thing.  The 3com interfaces seem to have lots of
backward compat stuff (did you know you can get the EISA id out of
pcmcia devices, at least the 3c1?)

Warner




Re: Netgraph and KQUEUE(2)

2002-11-06 Thread M. Warner Losh
: 1) Device driver in Netgraph node. When hardware is
:activated new Netgraph node is created and new
:kevent sent. devd (or something like devd) listens
:for these events and does something (loads firmware,
:activates device, etc.)

Device drivers are not netgraph nodes.  They will have a device_t
associated with them, which already sends a message via /dev/devctl to
devd.  You can do anything you want with the results.  There's no need
to reinvent the wheel that I'm almost done inventing.  There's
absolutely no need to bring netgraph into it all, and doing so makes
it a less generic implementation.

Warner




Re: Netgraph and KQUEUE(2)

2002-11-06 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Julian Elischer [EMAIL PROTECTED] writes:
: 
: 
: On Wed, 6 Nov 2002, M. Warner Losh wrote:
: 
:  : 1) Device driver in Netgraph node. When hardware is
:  :activated new Netgraph node is created and new
:  :kevent sent. devd (or something like devd) listens
:  :for these events and does something (loads firmware,
:  :activates device, etc.)
:  
:  Device drivers are not netgraph nodes.  They will have a device_t
:  associated with them, which already sends a message via /dev/devctl to
:  devd.  You can do anything you want with the results.  There's no need
:  to reinvent the wheel that I'm almost done inventing.  There's
:  absolutely no need to bring netgraph into it all, and doing so makes
:  it a less generic implementation.
: 
: devices that are netgraph nodes may not have any entry in /dev
: and might only appear in  the netgraph namespace..
: e.g. if_ar.c if_sr.c

It doesn't matter.  *ALL* devices have device_t entries.  Recall that
device_t is not dev_t.  dev_t appears in /dev/.  Hardware devices have
to attach to some bus.  That's why devd is done in newbus land rather
than in dev_t land.

Warner




Re: Comments Please

2002-10-12 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Luigi Rizzo [EMAIL PROTECTED] writes:
: On Sat, Oct 12, 2002 at 05:18:09PM -0600, M. Warner Losh wrote:
:  OK.  I'm not a network wonk, so I thought I'd run this by people
:  here.  What do people think.
: 
: sounds ok -- removing explicit constants is always good.
: On passing:
: 
:   * While you are at it,
:   grep etherbroadcastaddr sys/net*/*
: reveals the use of an explicit constant (6) in net/if_arp.h and
: netinet/if_ether.c; there is more of the same in net/bridge.c
: (my fault), net/if_atmsubr.c, netinet/if_ether.c, netncp/ncp_subr.c

atmsubr?  Doesn't ATM have its own constants?

:   * there is no real reason to have etherbroadcastaddr as a
: variable. net/bridge.c has a macro, IS_ETHER_BROADCAST,
: which is much faster to evaluate on i386, and
: could be moved e.g. in net/ethernet.h and be used
: to check for ethernet broadcast addresses in
:   net/if_ethersubr.c
:   net/if_iso88025subr.c
:   netatalk/aarp.c
:   net/if_fddisubr.c
: This only leaves some usages of etherbroadcastaddr is in
: netinet/if_ether.c to set the address for outgoing broadcast
: packets.

I'll let others deal with that.
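
For reference, a minimal userland sketch of the kind of check being
suggested, with the bare 6 replaced by a named constant.  This is not
the IS_ETHER_BROADCAST macro from net/bridge.c; the names below are
hypothetical and the implementation is only one plausible way to do it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Named constant instead of a bare 6 scattered through the code. */
#define TOY_ETHER_ADDR_LEN	6

static_assert(TOY_ETHER_ADDR_LEN == 6,
    "Ethernet MAC addresses are 48 bits");

/*
 * True if every byte of the address is 0xff.  AND-ing the bytes
 * together avoids a memory compare against a broadcast variable.
 */
static bool
toy_is_ether_broadcast(const uint8_t addr[TOY_ETHER_ADDR_LEN])
{
	uint8_t acc = 0xff;

	for (int i = 0; i < TOY_ETHER_ADDR_LEN; i++)
		acc &= addr[i];
	return (acc == 0xff);
}
```

An address of ff:ff:ff:ff:ff:ff passes the check; changing any single
bit in any byte makes it fail.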

Warner




Re: Comments Please

2002-10-12 Thread M. Warner Losh
In message: [EMAIL PROTECTED]
Luigi Rizzo [EMAIL PROTECTED] writes:
: On Sat, Oct 12, 2002 at 08:07:47PM -0600, M. Warner Losh wrote:
: ...
:  : reveals the use of an explicit constant (6) in net/if_arp.h and
:  : netinet/if_ether.c; there is more of the same in net/bridge.c
:  : (my fault), net/if_atmsubr.c, netinet/if_ether.c, netncp/ncp_subr.c
:  
:  atmsubr?  Doesn't ATM have its own constants?
: 
: eh, that's the problem with explicit constants, you can never tell
: whether 6 is english, german or italian... in any case the
: relevant piece of code is:
: 
:   net/if_atmsubr.c:   if (bcmp(alc, ATMLLC_HDR, 6)) {
: 
: I have no idea if it has any relation with ethernet header sizes.

Looks like that one should be something else, since it is an atm llc
header.  Of course the atm code should be using if_llc.h to get this
stuff, but that's another story...

Warner




Re: Multicast problem with wi driver in promiscuous mode - any resolution?

2002-05-21 Thread M. Warner Losh

I don't think anybody has applied fixes to the wi driver in that time
frame for this purpose.  Have fun :-(.

Warner




Re: 802.11: WaveLAN/Orinoco Cards

2002-05-06 Thread M. Warner Losh

In message: 005b01c1f4db$e3563f20$020a@bender
Martin Minkus [EMAIL PROTECTED] writes:
: But it's a standard WaveLAN/Orinico card, which is what the wi driver is
: intended for?
: 
: I never had to worry about any of this when I had the old white/bronze
: 2mbit wavelan cards, but with silver and gold cards, its been nothing
: but fun and games

Yea.  Terry is wrong here. Ignore what he says, for he knowest not
what he talkest about.

The wi driver might be getting some of them wrong, but it is
impossible to say because you didn't include the version you were
using (there was a bug related to this fixed in the not too distant
past).

Warner




Re: 802.11: WaveLAN/Orinoco Cards

2002-05-06 Thread M. Warner Losh

In message: [EMAIL PROTECTED]
Terry Lambert [EMAIL PROTECTED] writes:
: Martin Minkus wrote:
:  Perhaps when I have some spare time I can go look into the wi driver.
:  And perhaps you're right, firmware changes on the orinoco cards are the
:  cause of this; I have flashed mine to 8.1 (or whatever the latest
:  firmware is, 8.something). My white wavelan cards were originally
:  firmware 1.0 when I got them :)
: 
: 
: Actually, it appears I'm wrong, and you just haven't read the
: message yet.  Apparently there have been some commits which
: fix your problem for you (though they may be limited to -current).

Nope.  They have been MFC'd as of April 30th or so.  The entire wi
driver was back merged at that date.

Warner




MFC: IF_HANDOFF and IF_HANDOFF_ADJ

2002-04-24 Thread M. Warner Losh

Any objections to MFCing IF_HANDOFF and IF_HANDOFF_ADJ?  I'd like
there to be a common API between -stable and -current.

I'm thinking that the following macros would be sufficient for -stable:

#define IF_HANDOFF_ADJ(q, m, ifp, adj) \
	if (IF_QFULL((q))) { \
		IF_DROP((q)); \
		m_freem((m)); \
	} else { \
		(ifp)->if_obytes += (m)->m_pkthdr.len + (adj); \
		if ((m)->m_flags & M_MCAST) \
			(ifp)->if_omcasts++; \
		IF_ENQUEUE((q), (m)); \
		if (((ifp)->if_flags & IFF_OACTIVE) == 0) \
			(*(ifp)->if_start)((ifp)); \
	}
#define IF_HANDOFF(q, m, ifp) IF_HANDOFF_ADJ(q, m, ifp, 0)
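
To make the intended behaviour easy to check outside the kernel, here
is a minimal userland model of the same logic written as a function.
The toy_* types and fields are hypothetical stand-ins for the ifqueue,
mbuf and ifnet structures; the model only counts and does not own or
free any packet memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_M_MCAST	0x1	/* stand-in for M_MCAST */

struct toy_pkt   { size_t len; int flags; };
struct toy_queue { int len, maxlen, drops; };
struct toy_if {
	unsigned long obytes, omcasts;
	bool oactive;		/* stand-in for IFF_OACTIVE */
	int  starts;		/* times if_start would be called */
};

/* Returns true if the packet was queued, false if it was dropped. */
static bool
toy_handoff(struct toy_queue *q, const struct toy_pkt *m,
    struct toy_if *ifp, int adj)
{
	assert(q->len >= 0 && q->len <= q->maxlen);
	if (q->len >= q->maxlen) {	/* IF_QFULL: count the drop */
		q->drops++;
		return (false);
	}
	/* Account for the bytes, including the caller's adjustment. */
	ifp->obytes += (unsigned long)((long)m->len + adj);
	if (m->flags & TOY_M_MCAST)
		ifp->omcasts++;
	q->len++;			/* IF_ENQUEUE */
	if (!ifp->oactive)		/* kick the transmitter if idle */
		ifp->starts++;
	return (true);
}
```

One design note: a function (or a macro body wrapped in do/while) also
avoids the dangling-else hazard that a bare if/else macro has when it
is used inside another if statement.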

Comments?

Warner




Re: network buffer problem

2002-02-18 Thread M. Warner Losh

Yes.  I was going to commit this fix to -stable at bsdcon, but the
number of problem laptops that I wanted to look at closely didn't
allow it.

Warner
