Re: [OpenWrt-Devel] ath9k (ad-hoc?) problems in compat-wireless trunk as of sept 6/2

2013-09-15 Thread Felix Fietkau
On 2013-09-09 1:29 PM, Nicolás Echániz wrote:
> I have kept testing this and was able to reproduce the same problem
> every time.
> The performance perception when the network is at rush hour is very
> unstable so I tried to reproduce traffic conditions during late night.
> I sent multiple netperfs through different routes and while the netperfs
> were running turned off one node. Latency climbed to the thousands and it
> took a while for it to stabilize again.
>
> Then I turned off another node, latency went sky high again for some
> seconds until I lost connectivity and could monitor no more.
>
> I waited more than 15 minutes for the net to be reachable again from my
> local node. (Nodes are automatically reset when they cannot reach the
> network gateway for more than 15 minutes)
>
> All ping tests are done on fe80 local IPv6 addresses so no routing
> protocol is involved in case someone's wondering. Also, from what I see
> (iw station dump), routers are associated but can send no data.

I've committed some fixes that might be related to this issue. Please
test current OpenWrt trunk.

Thanks,

- Felix
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] ath9k (ad-hoc?) problems in compat-wireless trunk as of sept 6/2

2013-09-09 Thread Nicolás Echániz
I have kept testing this and was able to reproduce the same problem
every time.
Perceived performance is very unstable when the network is at rush
hour, so I tried to reproduce those traffic conditions late at night.
I started multiple netperf runs over different routes and, while they
were running, turned off one node. Latency climbed into the thousands
of ms and took a while to stabilize again.

Then I turned off another node. Latency went sky high again for some
seconds, until I lost connectivity and could no longer monitor anything.

I waited more than 15 minutes for the net to be reachable again from my
local node. (Nodes are automatically reset when they cannot reach the
network gateway for more than 15 minutes)

All ping tests are done on fe80:: link-local IPv6 addresses, so no
routing protocol is involved, in case someone's wondering. Also, from
what I see (iw station dump), the routers are associated but cannot
send any data.

After 15 minutes of waiting, this is what station dump was showing from
my local node:
root@nogal:~# iw wlan0 station dump
Station 64:70:02:3d:85:0a (on wlan0)
        inactive time:  40 ms
        rx bytes:       175693
        rx packets:     2642
        tx bytes:       0
        tx packets:     0
        tx retries:     0
        tx failed:      0
        signal:         -69 dBm
        signal avg:     -65 dBm
        tx bitrate:     1.0 MBit/s
        rx bitrate:     1.0 MBit/s
        authorized:     yes
        authenticated:  yes
        preamble:       long
        WMM/WME:        yes
        MFP:            no


station dump shows no tx data at all...

and of course no ping

root@nogal:~# ping6 fe80::6670:2ff:fe3d:850a%wlan0
PING fe80::6670:2ff:fe3d:850a%wlan0(fe80::6670:2ff:fe3d:850a) 56 data bytes
From fe80::92f6:52ff:febb:ec58 icmp_seq=1 Destination unreachable: Address unreachable
From fe80::92f6:52ff:febb:ec58 icmp_seq=2 Destination unreachable: Address unreachable
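
For anyone who wants to check their own nodes for the same state (frames
received from a neighbor, but none transmitted to it), here is a minimal
Python sketch that parses `iw wlan0 station dump` output. The function names
are made up for illustration, the "rx packets > 0 and tx packets == 0" test
is only my reading of the symptom above, and SAMPLE is a shortened copy of
the dump posted earlier in this message.

```python
import re

def parse_station_dump(text):
    """Parse `iw <dev> station dump` output into {mac: {field: value}}."""
    stations = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"Station ([0-9a-fA-F:]{17}) \(on (\S+)\)", line)
        if m:
            # A new "Station <mac> (on <iface>)" header starts a new record.
            current = stations.setdefault(m.group(1), {"iface": m.group(2)})
        elif current is not None and ":" in line:
            # Field lines look like "rx packets: 2642".
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return stations

def silent_tx_stations(stations):
    """Return MACs we have received packets from but sent no packets to."""
    stuck = []
    for mac, fields in stations.items():
        rx = int(fields.get("rx packets", "0"))
        tx = int(fields.get("tx packets", "0"))
        if rx > 0 and tx == 0:
            stuck.append(mac)
    return stuck

# Shortened copy of the station dump shown above.
SAMPLE = """\
Station 64:70:02:3d:85:0a (on wlan0)
inactive time:  40 ms
rx bytes:   175693
rx packets: 2642
tx bytes:   0
tx packets: 0
signal: -69 dBm
"""

if __name__ == "__main__":
    print(silent_tx_stations(parse_station_dump(SAMPLE)))
```

On a node, the real dump could be piped into the script instead of using
SAMPLE; a station listed by silent_tx_stations() is only a hint that the
link may be in this state, not proof of it.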



I'll try to keep these nodes on this version to help debug, but the
network is in production, so I can't do this for long without getting
complaints :)

Cheers.

Nico.




[OpenWrt-Devel] ath9k (ad-hoc?) problems in compat-wireless trunk as of sept 6/2

2013-09-07 Thread Nicolás Echániz

We are testing current wireless in QuintanaLibre and we are finding some 
strange behaviors.

The first symptom we have been able to reproduce is this:

Given nodes A and B, which are neighbors with good link quality (-68 dBm signal,
15 Mbps throughput), and a third node C, which has a comparable link to A and a
low-signal link to B: when we turn off C's wireless, the latency between nodes A
and B becomes extremely high (in the thousands of ms) for almost one minute.

In this example, nodes A and B are marisa and nogal, and node C is gerylu
(which we turn off 200 sec after the ping starts).

http://pastebin.com/MzWSCDRb
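
To measure how long such a spike lasts in a ping log, something like the
following Python sketch could be used. It is only an illustration: the
function names and the 1000 ms threshold are my own choices, and the SAMPLE
lines are invented for the example (they are NOT taken from the pastebin
above). Lost packets produce no reply line, so they are simply not counted.

```python
import re

# Matches reply lines such as "... icmp_seq=12 ttl=64 time=3.1 ms".
PING_RE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")

def parse_rtts(ping_output):
    """Return [(icmp_seq, rtt_ms), ...] for every reply line in ping output."""
    samples = []
    for line in ping_output.splitlines():
        m = PING_RE.search(line)
        if m:
            samples.append((int(m.group(1)), float(m.group(2))))
    return samples

def high_latency_windows(samples, threshold_ms=1000.0):
    """Return (first_seq, last_seq) for each run of replies at or above the threshold."""
    windows = []
    start = prev = None
    for seq, rtt in samples:
        if rtt >= threshold_ms:
            if start is None:
                start = seq
            prev = seq
        elif start is not None:
            # The run ended with the previous high-latency reply.
            windows.append((start, prev))
            start = None
    if start is not None:
        windows.append((start, prev))
    return windows

# Invented sample data, for illustration only.
SAMPLE = """\
64 bytes from fe80::1: icmp_seq=199 ttl=64 time=3.1 ms
64 bytes from fe80::1: icmp_seq=200 ttl=64 time=2400 ms
64 bytes from fe80::1: icmp_seq=201 ttl=64 time=3100 ms
64 bytes from fe80::1: icmp_seq=202 ttl=64 time=2.9 ms
"""

if __name__ == "__main__":
    print(high_latency_windows(parse_rtts(SAMPLE)))
```

Assuming the ping was sent once per second, the difference between the two
sequence numbers of a window is roughly the length of the spike in seconds.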


The moment when the ping times return to normal coincides with this iw
event (as seen from marisa, node A):

wlan0 (phy #0): connection quality monitor event: peer 92:f6:52:c6:00:ed didn't ACK 50 packets
wlan0: del station 92:f6:52:c6:00:ed

92:f6:52:c6:00:ed   -- this is gerylu (node C)
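
To spot this sequence in a longer capture of iw events, a small Python
sketch like the one below could pair each "didn't ACK" report with the
"del station" line that follows it. The function name is hypothetical, the
regular expressions are written against the two lines quoted above only
(other event formats are not handled), and SAMPLE is a copy of those lines.

```python
import re

ACK_RE = re.compile(r"peer ([0-9a-f:]{17}) didn't ACK (\d+) packets")
DEL_RE = re.compile(r"del station ([0-9a-f:]{17})")

def dropped_peers(event_log):
    """Return [(mac, unacked_packets)] for peers reported as not ACKing and then deleted."""
    unacked = {}
    dropped = []
    for line in event_log.splitlines():
        m = ACK_RE.search(line)
        if m:
            # Remember the most recent unACKed packet count for this peer.
            unacked[m.group(1)] = int(m.group(2))
            continue
        m = DEL_RE.search(line)
        if m and m.group(1) in unacked:
            dropped.append((m.group(1), unacked.pop(m.group(1))))
    return dropped

# Copy of the two event lines quoted above.
SAMPLE = """\
wlan0 (phy #0): connection quality monitor event: peer 92:f6:52:c6:00:ed didn't ACK 50 packets
wlan0: del station 92:f6:52:c6:00:ed
"""

if __name__ == "__main__":
    print(dropped_peers(SAMPLE))
```

Matching the time of each reported pair against the ping log would show
whether latency recovers at the point where the station is deleted, as we
saw on marisa.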


We are observing overall instability in the network, which consists of 40
nodes (100% ad-hoc), some of which hear up to 16 neighbors in the iw station
dump, so this scenario may be occurring frequently.

Lastly, we also observe that some nodes are not associating on one of their 
interfaces (they are dual band), although they did associate correctly before 
the upgrade.



We will keep some nodes with this version for a couple of days in case someone
(Felix?) wants to log in and do some live testing.

Guido will add some more info from the tcpdumps we collected if it's relevant.


Cheers,
Nico