Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior

2019-01-20 Thread Martin Birgmeier
Hi Bruce,

Thank you for your support.

The machine A with the em0 issue is running at 1 Gbps and acts as NFS
server. The NFS client B has a 100 Mbps interface. B gets a throughput
of only 1 Mbyte/s when talking to A but the full 10 Mbyte/s when talking
to a third machine C. In addition, while B is talking to A, if at
the same time A runs an iperf to C, the situation for B improves (up to
5..7 Mbyte/s).

All machines are connected by a DGS-1210-24 1 Gbps switch.

In the mailing list and FreeBSD bugs I have seen that there are a
multitude of issues with the em driver in FreeBSD 12. It seems that the
switch to iflib has introduced them.

I have also discovered that there is net/intel-em-kmod. What is the
relationship between the driver in the base sources and this one? How
advisable is it to use the driver from ports?

-- Martin

On 20.01.19 13:56, Bruce Evans wrote:
> On Sun, 20 Jan 2019, Martin Birgmeier wrote:
>
>> Regarding duplex, ifconfig shows the following:
>>
>> [0]# ifconfig em0
>> em0: flags=8843 metric 0 mtu 1500
>>     options=81249b
>>     ether f0:de:f1:98:86:a9
>>     inet 192.168.1.19 netmask 0xff00 broadcast 192.168.1.255
>>     inet6 fe80::f2de:f1ff:fe98:86a9%em0 prefixlen 64 scopeid 0x1
>>     inet6 fec0:0:0:4d42::13 prefixlen 64
>>     inet6 fec0::4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf
>>     inet6 2002:bc17:f381:4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf
>>     media: Ethernet autoselect (1000baseT )
>>     status: active
>>     nd6 options=23
>> [0]#
>>
>> This seems to be o.k.
>
> The media setting can't be trusted to have reached the hardware -- see my
> previous reply.
>
> But I thought that you said that you were using 100 Mbps (presumably with
> autoselect).  The above shows autoselect giving 1 Gbps.
>
> I checked that iflib_media_change() is not called for autoselect to 1 Gbps
> here.  Also that it fails to stop the NIC if called.  Also that it breaks
> the NIC's state after a few calls in the loop:
>
> while :; do
>     ./ifconfig em0 media 1000baseT mediaopt full-duplex
>     ./ifconfig em0 media autoselect
> done
>
> provided ./ifconfig is on nfs.  This gives null changes disguised as
> non-null changes so that iflib_media_change() is called.
>
> Console output for this:
>
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX em0: TX(0) desc avail = 21, pidx = 34
>
> Sometimes the queue indexes are corrupted and this message is printed.
> Sometimes, but never in this output, this message is repeated many times
> before the interface comes back up.  Actually, this doesn't always
> occur between down and up, and when it is repeated the queue state is
> avail = 1024, pidx = 0, and this state seems to be sticky unless ifconfig
> somehow runs to generate another reinitialization.
>
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX em0: TX(0) desc avail = 1, pidx = 30
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX Link state changed to up
> XX link state changed to down
> XX em0: TX(0) desc avail = 14, pidx = 33
> XX Link state changed to up
>
> ipv4 ping is broken most of the time while this loop is running.  Of course
> ping should stop responding while the interface is down.  It rarely starts
> when the interface comes back up.  Sometimes it starts with low latency,
> but usually it starts with DUPs.  For about 50 iterations, the only ping
> output was:
>
> XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=0.158 ms
> XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=3523.305 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=6696.247 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=619 ttl=64 time=9857.912 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=0.094 ms
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=4154.124 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=7253.986 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=10367.938 ms (DUP!)
> XX 64 bytes from 192.168.2.8: icmp_seq=728 ttl=64 time=13540.805 ms (DUP!)
>
> Bruce
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior

2019-01-20 Thread Martin Birgmeier
Regarding duplex, ifconfig shows the following:

[0]# ifconfig em0
em0: flags=8843 metric 0 mtu 1500
    options=81249b
    ether f0:de:f1:98:86:a9
    inet 192.168.1.19 netmask 0xff00 broadcast 192.168.1.255
    inet6 fe80::f2de:f1ff:fe98:86a9%em0 prefixlen 64 scopeid 0x1
    inet6 fec0:0:0:4d42::13 prefixlen 64
    inet6 fec0::4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf
    inet6 2002:bc17:f381:4d42:f2de:f1ff:fe98:86a9 prefixlen 64 autoconf
    media: Ethernet autoselect (1000baseT )
    status: active
    nd6 options=23
[0]#

This seems to be o.k.

-- Martin

On 20.01.19 06:28, Bruce Evans wrote:
> On Sat, 19 Jan 2019, Martin Birgmeier wrote:
>
>> I just tried the patch by Bruce (from the mail sent 10 hours ago), but
>> it makes no difference.
>>
>> Also, it does not seem like bad frames or too high an interrupt rate are
>> the problem (the machine should easily handle what is coming from its
>> NFS client which only has a 100 Mbps interface).
>>
>> I believe that the simplifications introduced to sys/dev/e1000 between
>> 11.2 and 12.0 have broken something.
>
> They aren't exactly simplifications :-).
>
> Did you check for the common problem of a duplex mismatch?  I seem to
> recall that some versions of iflib'ed em didn't negotiate right for your
> speed of 100 Mbps.
>
> Here I can break nfs using "ifconfig em0 media 100baseTX mediaopt
> full-duplex" and forgetting the mediaopt part.  This gives half-duplex.
> ipv4 ping still works, but its latency increases from ~125 usec to ~76
> msec.  The latter latency destroys nfs performance.  After the media
> change, there are a lot of DUP packets with an initial latency of ~43
> second and the latency decreasing by the ping interval of 1 second for
> the next 42 or 43 DUPs until the backlog is cleared; the latency is
> then between 71 and 80 msec.  Changing the media and mediaopt back
> to 1000baseT[X] full-duplex restores low latency but causes 1 DUP with
> delay ~19 seconds.
>
> Suspend/resume used to give much the same misbehaviour, by not stopping
> the NIC when reinitializing it in resume.  This was fixed in r342855.
> This might be the bug!  iflib_media_change() calls iflib_init_locked()
> like resume used to, so seems to be missing stopping.  Changing this
> should fix at least the DUPs.
>
> The function names or layering are confusing.  iflib_init_locked()
> doesn't initialize the if.  iflib_if_init_locked() does that.  All
> iflib_if_init_locked() does is call iflib_stop(), then iflib_init_locked().
> Grep shows the following related iflib*init*() calls:
> - iflib_netmap_register manually inlines iflib_if_init_locked().  This
>   is a style bug
> - iflib_media_change() only calls iflib_init_locked().  This seems to be
>   a bug
> - _task_fn_admin() calls iflib_if_init_locked() for resetting.  This seems
>   to be correctly obfuscated
> - iflib_if_init_locked() calls iflib_init_locked().  This is part of
>   implementing the obfuscation
> - iflib_if_init() calls iflib_if_init_locked().  This is correct
> - iflib_if_ioctl(): SIOCSIFMTU calls iflib_stop(), then does some locking,
>   then sets the mtu in software, then calls iflib_init_locked().  This
>   seems to be correct, and shows that the iflib_if_init_locked() is not
>   even generally useful.  This gives down/up for non-null changes.  This
>   works correctly (some ping packets are lost, but there are no DUPs).
> - iflib_if_ioctl(): SIOCSIFCAP is like SIOCSIFMTU, except I didn't test
>   it and its splitting of stopping and init'ing is a bit messier because
>   both operations are under a more complicated conditional.
> - iflib_if_ioctl(): calls iflib_if_init().  This
>   is correct.
> - iflib_vlan_[un]register() call iflib_if_init_locked().  This seems to be
>   correctly obfuscated
> - iflib_device_resume() calls iflib_if_init_locked().  This is correctly
>   obfuscated
> - if_setinitfn() is called to set iflib_if_init as the init function.  This
>   is correct.
>
> Summary: only media change seems to be broken, but there are some
> style bugs.
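
The ordering problem summarized above can be sketched as a toy model. This is not iflib code: every `sk_` name, the struct fields, and the rule that resetting the rings of a running device marks it as corrupted are assumptions made for illustration. Only the call order (stop first, then init) is taken from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_NTXD 1024	/* ring size; mirrors the "avail = 1024" state in the thread */

/* Toy device state; hypothetical, not the real iflib context. */
struct sk_nic {
	bool running;		/* device is actively using its rings */
	bool corrupted;		/* rings were reset underneath a live device */
	unsigned tx_pidx;	/* TX producer index */
	unsigned tx_avail;	/* free TX descriptors */
};

/* Stands in for iflib_stop(): quiesce the device. */
void sk_stop(struct sk_nic *nic)
{
	assert(nic != NULL);
	nic->running = false;
}

/* Stands in for iflib_init_locked(): reset the rings and start the device. */
void sk_init_locked(struct sk_nic *nic)
{
	assert(nic != NULL);
	if (nic->running)
		nic->corrupted = true;	/* assumption: reset while active is unsafe */
	nic->tx_pidx = 0;
	nic->tx_avail = SK_NTXD;
	nic->running = true;
}

/* Stands in for iflib_if_init_locked(): the stop-then-init wrapper. */
void sk_if_init_locked(struct sk_nic *nic)
{
	sk_stop(nic);
	sk_init_locked(nic);
}

/* Media change as described in the message: init without a prior stop. */
void sk_media_change_unstopped(struct sk_nic *nic)
{
	sk_init_locked(nic);
}

/* Media change routed through the wrapper, so the device is stopped first. */
void sk_media_change_stopped(struct sk_nic *nic)
{
	sk_if_init_locked(nic);
}
```

In this model, calling sk_media_change_unstopped() on a running device sets the corrupted flag, while sk_media_change_stopped() leaves it clear and ends with pidx = 0 and avail = 1024.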
>
> The bug apparently broke resume by reinitializing an active state
> (even locking doesn't help much, but I now remember that resume
> succeeded every 10-100 tries in the buggy versions -- there were always
> a lot of DUPs, but sometimes the low latency came back).  My tests
> usually used zzz and my zzz and other utilities are on nfs, so nfs was
> fairly active just before suspend.
>
> I don't know if iflib_media_change() is called at boot time, especially
> if the media is autoselect.  At boot time, the state might be less
> active or closer to the reset state, so that even a manual media change
> that surely calls 

Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior

2019-01-20 Thread Martin Birgmeier
I am not using resume at all... just normal startup/shutdown.

-- Martin

On 20.01.19 07:19, Bruce Evans wrote:
> On Sun, 20 Jan 2019, Bruce Evans wrote:
>
>> [iflib_media_change() is missing iflib_stop(), like iflib_resume() was]
>>
>> I don't know what the media was after the broken resume.  Its reported
>> result can't be trusted anyway.  To recover from the broken resume, it
>> usually worked to repeat down/up a few times.  This is consistent with
>> the bug -- eventually, previous down/up's change the state to close enough
>> to stopped.  But using the interface in any way (including pinging it
>> to see if it is still broken) makes it not so close to being stopped.
>
> Further debugging after restoring the bug in resume:
> - I use mainly zzz to suspend
> - the bug usually doesn't break the interface if I copy zzz from nfs to
>   non-nfs and use the copy.  This explains why almost no one except me
>   noticed the bug -- zzz is usually not on nfs, and other nfs activity
>   is usually lighter than mine too.  (Suspend apparently doesn't do enough
>   stopping or syncing generally.  It should fsync() all files ...)
> - the bug usually does break the interface if zzz is on nfs
> - when the bug breaks the interface:
>   - the media is reported as unchanged
>   - after DUPs starting with a delay of many seconds and reducing by the
>     ping interval of 1 second for each until the delay is less than 1
>     second, the ping latency stabilizes at quite different values after
>     each suspend/resume.  These values tend to be higher than for media
>     change (several hundred ms instead of 76 ms).
>   - my ifconfig executable is one of several under /sbin which is not on
>     nfs, but my ifconfig is actually a shell script in $HOME/bin; the script
>     selects the correct version of ifconfig for the current kernel; it is
>     on nfs, and uses utilities on nfs.  I sometimes forget this, and then
>     running plain ifconfig to attempt to recover takes too long, and if I
>     wait then the nfs activity for finding ifconfig not on nfs tends to
>     propagate the broken interface (like zzz on nfs breaks it).
>     Manually selecting the correct version of ifconfig under /sbin and using
>     it tends to work right (like zzz not on nfs).
>   - even an mtu change is enough to recover.  This is not surprising, since
>     it does slightly more than down/up as an implementation detail.  This
>     shows that the reported media value is at least used by the reinit for
>     the mtu change.
>   - pinging the interface didn't make it active enough for the recovery to
>     not usually work.
>
> Bruce


Re: [Bug 235031] [em] em0: poor NFS performance, strange behavior

2019-01-19 Thread Martin Birgmeier
I just tried the patch by Bruce (from the mail sent 10 hours ago), but
it makes no difference.

Also, it does not seem like bad frames or too high an interrupt rate are
the problem (the machine should easily handle what is coming from its
NFS client which only has a 100 Mbps interface).

I believe that the simplifications introduced to sys/dev/e1000 between
11.2 and 12.0 have broken something.

-- Martin

On 19.01.19 21:06, Bruce Evans wrote:
> On Sun, 20 Jan 2019, Eugene Grosbein wrote:
>
>> 19.01.2019 17:21, Bruce Evans wrote:
>>
>>> Your problem looks more like lost interrupts.  All em NICs should interrupt
>>> at the default interrupt moderation rate of 8 kHz under load.  Once there
>>> are that many interrupts, there is not much else that can go wrong (nfs
>>> would have to be working to generate that many interrupts).
>>
>> I have a patch (in production since 8.x) that makes em(4) support
>> hw.em.max_interrupt_rate just like igb(4) supports hw.igb.max_interrupt_rate:
>>
>> http://www.grosbein.net/freebsd/patches/em_sysctl-11.0.diff.gz
>>
>> It also brings in sysctls dev.em.X.max_interrupt_rate and
>> hw.em.max_interrupt_rate sets defaults for them.
>
> This is inverted and spelled dev.em.X.itr for em.
>
> Hmm, em already has this, but it is only a read-only tunable.
>
> igb seems to have gone away.  In FreeBSD-11, its dev.em.X.max_interrupt_rate
> is also only a tunable.
>
> I use the variants of the following fix for itr in FreeBSD-[7-13]:
>
> XX Index: if_em.c
> XX ===
> XX --- if_em.c    (revision 332488)
> XX +++ if_em.c    (working copy)
> XX @@ -908,10 +910,10 @@
> XX  E1000_REGISTER(hw, E1000_TADV),
> XX  em_tx_abs_int_delay_dflt);
> XX  em_add_int_delay_sysctl(adapter, "itr",
> XX -    "interrupt delay limit in usecs/4",
> XX +    "interrupt delay limit in usecs",
> XX  &adapter->tx_itr,
> XX  E1000_REGISTER(hw, E1000_ITR),
> XX -    DEFAULT_ITR);
> XX +    1000000 / MAX_INTS_PER_SEC);
> XX  
> XX  hw->mac.autoneg = DO_AUTO_NEG;
> XX  hw->phy.autoneg_wait_to_complete = FALSE;
>
> This fixes the description and the initial value for the sysctl to match
> the code.  The description almost matches the buggy initial value.  The
> hardware has power of 2 units, but the code scales to microseconds.  Except
> the initial value was in hardware units scaled by another power of 2
> which made them nearly microseconds/4.  The code sets the initial value to
> a representation of 125 usec (8 kHz), but the sysctl says that the initial
> value is 488 and the description says that this is a representation of
> 488/4 = 122 usec.  However, writing back this value using sysctl gives
> 488 usec (~2 kHz).  The magic number 122 is 125 mis-scaled by 1000/1024.
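
The arithmetic in the paragraph above can be checked with a short sketch. The 256 ns register unit and the rounding to 1.024 usec ticks followed by a multiply by 4 are assumptions about the driver; the message itself only says "power of 2 units". All `sk_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_MAX_INTS_PER_SEC	8000u	/* 8 kHz default named in the thread */
#define SK_ITR_UNIT_NS		256u	/* assumed ITR register unit in ns */

/* Default register value in hardware units: 10^9 ns / (8000 * 256 ns). */
uint32_t sk_default_itr_reg(void)
{
	return 1000000000u / (SK_MAX_INTS_PER_SEC * SK_ITR_UNIT_NS);
}

/*
 * Assumed sysctl write path: convert microseconds to 1.024 usec ticks
 * with rounding, then multiply by 4 to get 256 ns units.
 */
uint32_t sk_usecs_to_itr_reg(uint32_t usecs)
{
	uint32_t ticks = (1000u * usecs + 512u) / 1024u;

	return ticks * 4u;
}

/* Interrupt rate in Hz implied by a register value in 256 ns units. */
uint32_t sk_itr_reg_to_hz(uint32_t reg)
{
	assert(reg != 0);
	return 1000000000u / (reg * SK_ITR_UNIT_NS);
}
```

Under these assumptions sk_default_itr_reg() and sk_usecs_to_itr_reg(125) both give 488 (about 8 kHz), while sk_usecs_to_itr_reg(488) gives 1908 units (about 2 kHz). That is consistent with the message: the displayed 488 is a hardware-unit value, and writing it back as microseconds quarters the interrupt rate.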
>
> FreeBSD[7-10] have lem in a separate file with the bug duplicated, so
> need the patch duplicated.  FreeBSD[7-8] don't have a sysctl for this.
> They default to 125 usec and there is no way to see or change the value.
> I usually want the smaller value of 0, and hard-code this when there is
> no sysctl.
>
> DEFAULT_ITR is used mainly to obfuscate this.  IGB_DEFAULT_ITR and
> IGB_LINK_ITR are also defined, but are not used even in versions of FreeBSD
> that have igb.
>
>> I use hw.em.max_interrupt_rate=32000 for 1GB link passing average sized
>> packets (about 600 bytes per packet at average) but driver's default 8000
>> should be nearly fine for full size packets (1500 or above) and this 8000
>> limit cannot be reason for such low throughput.
>
> 0 for itr maxes out at about 100 kHz here.  This is good for low latency with
> small packets.
>
> My version of bge dynamically modifies the rate to match the rx load (no
> moderation for light loads).  tx is handled specially and only needs 1
> interrupt every few seconds for freeing resources.
>
> Bruce


Re: Tying down network interfaces

2014-12-31 Thread Martin Birgmeier
The devices are PCI cards, no USB is involved.

I assume that I'd have to add lines similar to
hint.sis.0.at=pci0:9:0
to /boot/device.hints, but I am unsure of the correct syntax.

See also these old articles (which ultimately seem to have gone
unanswered):
http://lists.freebsd.org/pipermail/freebsd-questions/2009-January/190453.html
http://lists.freebsd.org/pipermail/freebsd-questions/2009-January/190624.html

-- Martin

On 12/30/14 21:13, Freddie Cash wrote:

 On Dec 30, 2014 10:02 AM, Martin Birgmeier la5lb...@aon.at wrote:
 
  Hi,
 
  I have two network interfaces as follows:
 
  sis0: NatSemi DP8381[56] 10/100BaseTX port 0xa400-0xa4ff mem
  0xd580-0xd5800fff irq 9 at device 9.0 on pci0
  sis1: NatSemi DP8381[56] 10/100BaseTX port 0x9400-0x94ff mem
  0xd480-0xd4800fff irq 11 at device 12.0 on pci0
 
  When sis0 breaks down, sis1 gets renumbered as sis0, wreaking havoc
  (mostly on my brains until I figure out which card is actually
 affected).
 
  How do I tie down these two interfaces so that they always stay as sis0
  and sis1, respectively, regardless of which ones are present in the
  system? - I expect to insert something into /boot/device.hints.

 There was a recent thread on one of the lists about using devd to name
 USB Ethernet devices based on their MAC or serial number. Something
 like that should be useful for naming NICs something constant.

 There's also a bug report for it with a working solution.

 Cheers,
 Freddie




Tying down network interfaces

2014-12-30 Thread Martin Birgmeier
Hi,

I have two network interfaces as follows:

sis0: NatSemi DP8381[56] 10/100BaseTX port 0xa400-0xa4ff mem
0xd580-0xd5800fff irq 9 at device 9.0 on pci0
sis1: NatSemi DP8381[56] 10/100BaseTX port 0x9400-0x94ff mem
0xd480-0xd4800fff irq 11 at device 12.0 on pci0

When sis0 breaks down, sis1 gets renumbered as sis0, wreaking havoc
(mostly on my brains until I figure out which card is actually affected).

How do I tie down these two interfaces so that they always stay as sis0
and sis1, respectively, regardless of which ones are present in the
system? - I expect to insert something into /boot/device.hints.

-- Martin



Re: amd + NFS reconnect = ICMP storm + unkillable process.

2011-08-27 Thread Martin Birgmeier

Thank you for these patches.

One interesting thing: I was trying to backport them to 7.4.0 and 
RELENG_7, too, but there the portion of the code dealing with the 
RPC_CANTSEND case does not exist. On the other hand, the problem 
surfaced (for me) when upgrading from 7.4 to 8.2. So could one probably 
conclude that it is more the write case which leads to the erroneous 
behavior?


Regards,

Martin

On 08/26/11 21:19, Artem Belevich wrote:
> On Fri, Aug 26, 2011 at 12:04 PM, Rick Macklem rmack...@uoguelph.ca wrote:
>> The patch looks good to me. The only thing is that *maybe* it should
>> also do the same for the other msleep() higher up in clnt_dg_call()?
>> (It seems to me that if this msleep() were to return ERESTART, the same
>> kernel loop would occur.)
>>
>> Here's this variant of the patch (I'll let you decide which to commit).
>>
>> Good work tracking this down, rick
>>
>> --- rpc/clnt_dg.c.sav   2011-08-26 14:44:27.0 -0400
>> +++ rpc/clnt_dg.c   2011-08-26 14:48:07.0 -0400
>> @@ -467,7 +467,10 @@ send_again:
>>                      cu->cu_waitflag, "rpccwnd", 0);
>>                  if (error) {
>>                          errp->re_errno = error;
>> -                        errp->re_status = stat = RPC_CANTSEND;
>> +                        if (error == EINTR || error == ERESTART)
>> +                                errp->re_status = stat = RPC_INTR;
>> +                        else
>> +                                errp->re_status = stat = RPC_CANTSEND;
>>                          goto out;
>>                  }
>>          }
>
> You're right. I'll add the change to the commit.
>
> --Artem
>
>> @@ -636,7 +639,7 @@ get_reply:
>>                   */
>>                  if (error != EWOULDBLOCK) {
>>                          errp->re_errno = error;
>> -                        if (error == EINTR)
>> +                        if (error == EINTR || error == ERESTART)
>>                                  errp->re_status = stat = RPC_INTR;
>>                          else
>>                                  errp->re_status = stat = RPC_CANTRECV;
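
The rule that both hunks of the patch above apply can be written as one small helper. This is a userland sketch, not the kernel code: the `sk_` names and the enum are hypothetical stand-ins for the RPC status values, and the fallback ERESTART definition is only an assumption for systems whose userland headers do not provide it.

```c
#include <assert.h>
#include <errno.h>

#ifndef ERESTART
#define ERESTART (-1)	/* assumption: stand-in where userland headers lack it */
#endif

/* Hypothetical stand-ins for the RPC status codes named in the patch. */
enum sk_rpc_stat {
	SK_RPC_SUCCESS,
	SK_RPC_CANTSEND,
	SK_RPC_CANTRECV,
	SK_RPC_INTR
};

/*
 * Map a non-zero error from an interruptible sleep to an RPC status.
 * Both EINTR and ERESTART mean a signal arrived, so both are reported
 * as "interrupted"; any other error keeps the caller's fallback
 * (CANTSEND on the send path, CANTRECV on the receive path).
 */
enum sk_rpc_stat sk_sleep_error_to_stat(int error, enum sk_rpc_stat fallback)
{
	assert(error != 0);
	if (error == EINTR || error == ERESTART)
		return SK_RPC_INTR;
	return fallback;
}
```

The point of the design is that an interrupted sleep must not look like a transport failure: a "can't send" or "can't receive" status invites another reconnect attempt, which is the loop discussed in this thread.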





Re: amd + NFS reconnect = ICMP storm + unkillable process.

2011-07-06 Thread Martin Birgmeier

Hi Artem,

I have exactly the same problem as you are describing below, also with quite
a number of amd mounts.

In addition to the scenario you describe, another way this happens here
is when downloading a file via firefox to a directory currently open in
dolphin (KDE file manager). This will almost surely trigger the symptoms
you describe.

I've had 7.4 running on the box before, now with 8.2 this has started to 
happen.


Alas, I don't have a solution.

We should probably file a PR, but I don't even know where to assign it to.
Amd does not seem much maintained, it's probably using some old-style
mounts (it never mounts anything via IPv6, for example).

Regards,

Martin

 Hi,

 I wonder if someone else ran into this issue before and, maybe, has
 a solution.


 I've been running into a problem where access to filesystems mounted
 with amd wedges processes in an unkillable state and produces an ICMP
 storm on the loopback interface. I've managed to narrow it down to NFS
 reconnect, but that's when I ran out of ideas.

 Usually the problem happens when I abort a parallel build job in an
 i386 jail on FreeBSD-8/amd64 (r223055). When the build job is killed
 now and then I end up with one process consuming 100% of CPU time on
 one of the cores. At the same time I get a lot of messages on the
 console saying "Limiting icmp unreach response from 49837 to 200
 packets/sec" and the loopback traffic goes way up.

 As far as I can tell here's what's happening:

 * My setup uses a lot of filesystems mounted by amd.
 * amd itself pretends to be an NFS server running on the localhost and
 serving requests for amd mounts.
 * Now and then amd seems to change the ports it uses. Beats me why.
 * the problem seems to happen when some process is about to access amd
 mountpoint when amd instance disappears from the port it used to
 listen on. In my case it does correlate with interrupted builds, but I
 have no clue why.
 * NFS client detects disconnect and tries to reconnect using the same
 destination port.
 * That generates an ICMP response as the port is unreachable, and the
 reconnect call returns almost immediately.
 * We try to reconnect again, and again, and again
 * the process in this state is unkillable

 Here's what the stack of the 'stuck' process looks like in those rare
 moments when it gets to sleep:
 18779 100511 collect2 - mi_switch+0x176
 turnstile_wait+0x1cb _mtx_lock_sleep+0xe1 sleepq_catch_signals+0x386
 sleepq_timedwait_sig+0x19 _sleep+0x1b1 clnt_dg_call+0x7e6
 clnt_reconnect_call+0x12e nfs_request+0x212 nfs_getattr+0x2e4
 VOP_GETATTR_APV+0x44 nfs_bioread+0x42a VOP_READLINK_APV+0x4a
 namei+0x4f9 kern_statat_vnhook+0x92 kern_statat+0x15
 freebsd32_stat+0x2e syscallenter+0x23d

 * Usually some timeout expires in few minutes, the process dies, ICMP
 storm stops and the system is usable again.
 * On occasion the process is stuck forever and I have to reboot the box.

 I'm not sure who's to blame here.

 Is the automounter at fault for disappearing from the port it was
 supposed to listen to?
 Is NFS guilty of trying blindly to reconnect on the same port and not
 giving up sooner?
 Should I flog the operator (AKA myself) for misconfiguring something
 (what?) in amd or NFS?

 More importantly -- how do I fix it?
 Any suggestions on fixing/debugging this issue?

 --Artem


Vote in favor of keeping ATM (was: NATM still scheduled for removal - please follow up to keep it in-tree)

2011-02-19 Thread Martin Birgmeier

Hi,

I would like to vote in favor of keeping NATM in the kernel. The reason 
is that I am currently working on writing a device driver for the 
SpeedTouch USB modem, to replace ports/net/pppoa which is not supported 
on FreeBSD8+ any more. The SpeedTouch USB in fact terminates as an ATM 
connection, on top of which PPPoA needs to be layered.


From my experiments with compiling a kernel with device atm and 
options NATM, I know that currently only the former works, this being 
due to unmaintained and broken code dealing with routing entries.


I am currently not much of a kernel code expert, but have already 
managed to write enough of the USB side of the device driver to load the 
modem's firmware. The next step would be to connect it to the ATM stack, 
using this route:


1. Terminate as ATM interface (ATM cells arriving);
2. The ATM stack implements AAL5 (I hope);
3. Capture the interface via ng_atm (which, as far as I understand,
   would more aptly be named ng_natm);
4. Extend the functionality of ng_atmllc (which basically does a
   small subset of RFC2684) to also do LLC/ISO (cf. RFC2364) (then better
   named ng_llc);
5. Couple the resultant PPP stream to ng_ppp;
6. Use something to configure the VPI/VCI (what?);
7. Run ports/net/mpd5 on that netgraph node.

5. and 7. could be replaced by ng_tty and ppp(8), but that would be the 
poorer choice as all traffic would have to go through userland again as 
it is doing with ports/net/pppoa.


For this I'd need a) a working ATM stack and b) the help of some kind 
souls in hooking everything up. Hans-Petter Selasky has already been 
very helpful with the USB part in private mail, and I actually wanted to 
solicit more help on the networking side of things privately in order 
not to trumpet out something which I'll probably finish only after 
considerable time, but reading the removal message I felt that I needed 
to make my needs public.


Regards,

Martin

p.s. A few :-) of the questions I have are

- why the original (as I understand HARP) ATM stack was removed (in the 
CVS logs the reason cited is the usual giant lock issue of that time),


- what the differences between the atm and natm stacks are (as I 
understand the latter only supports a subset of the functionality of the 
former - only AAL5?),


- why AF_NATM is different from AF_ATM (hinting that NATM is not a 
replacement of ATM),


- whether and how it is even possible to inject raw ATM cells,

- whether I even need options NATM (currently I can happily 
instantiate a (of course non-functional) ATM interface using just 
device atm),


- what do I need to do on the USB side to start receive and transmit 
machines (do I need to start separate kernel threads or just issue two 
usbd_transfer_setup() calls as for loading the firmware),


- etc. etc.

I do of course read the source, but with the scarce documentation 
available that's a steep learning curve.


p.p.s. Message re-sent from freebsd-atm because up till now I was not 
subscribed to freebsd-net.


--
Martin Birgmeier
Vienna
Austria


Locking in ng_tty.c

2011-02-19 Thread Martin Birgmeier

In ng_tty.c, function ngt_newhook(), there is the following code:

	if (sc->hook)
		return (EISCONN);

	NGTLOCK(sc);
	sc->hook = hook;
	NGTUNLOCK(sc);

I do not think this is proper - should not the test be within the lock?
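
A minimal userland sketch of what moving the test inside the lock would look like. It uses a pthread mutex in place of NGTLOCK()/NGTUNLOCK(), and the `sk_` names and struct layout are hypothetical; it illustrates the check-then-set pattern only and is not a patch against ng_tty.c.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the ng_tty softc: one lock, one hook slot. */
struct sk_softc {
	pthread_mutex_t lock;
	void *hook;
};

/*
 * Attach a hook if none is attached yet.  The test and the assignment
 * happen under the same lock, so two concurrent callers cannot both
 * observe an empty slot and then both store their hook.
 */
int sk_newhook(struct sk_softc *sc, void *hook)
{
	int error = 0;

	assert(sc != NULL);
	pthread_mutex_lock(&sc->lock);
	if (sc->hook != NULL)
		error = EISCONN;
	else
		sc->hook = hook;
	pthread_mutex_unlock(&sc->lock);
	return (error);
}
```

With the test outside the lock, as in the quoted code, a second caller could pass the check before the first caller's assignment becomes visible, and the later store would silently replace the earlier hook.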

Regards,

Martin

--
Martin Birgmeier
Vienna
Austria