Posts on the 'WiFi-connection-unstable'
(https://forums.slimdevices.com/showthread.php?109953-WiFi-connection-unstable-lost-on-three-Radios&p=1011730&viewfull=1#post1011730)
thread are being addressed in a revision to the -wlanpoke- software,
currently 0.6.4 (= 0.6.3 with an unpublished '-x' bugfix). The new 0.7.0
software (in testing) has some improvements besides the '-x' bugfix:

- An attempt to quickly restore a connection after fewer failures,
  before resorting to a full reset of the wireless.
- A report added to the logs consisting of a vector of ping counts,
  indexed by the number of failed pings prior to recovery.
- New command line switches to assign values to variables to replace
  the constants 'Quick Failure Limit,' 'Reset Failure Limit,' and
  'Seconds between Tests'.
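
The two-stage escalation in the list above could be sketched roughly as
follows. This is only an illustration in Python, not the wlanpoke script
itself: the function name `recovery_actions` and its parameter names are
made up here, and the limits of 4 and 6 are simply the values used in
the test described below.

```python
def recovery_actions(ping_results, quick_limit=4, reset_limit=6):
    """List the recovery steps triggered by a sequence of ping results.

    ping_results is an iterable of booleans (True = ping succeeded).
    Hypothetical sketch: names and structure are assumptions, not wlanpoke's.
    """
    failures = 0
    actions = []
    for ok in ping_results:
        if ok:
            failures = 0          # any success clears the failure count
            continue
        failures += 1
        if failures == quick_limit:
            actions.append("quick")   # e.g. re-association in this test
        elif failures == reset_limit:
            actions.append("full")    # full wireless reset
    return actions


# Four failures in a row trigger only the quick method.
print(recovery_actions([True, False, False, False, False, True]))
# Six failures in a row trigger the quick method, then the full reset.
print(recovery_actions([False] * 6 + [True]))
```

With pings 2 seconds apart, these limits would mean a quick attempt
about 8 seconds into an outage and a full reset at about 12 seconds.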
  
The 0.7.0 software has been running on the three worst radios here for
half a day using the original default values for the above constants,
plus a 'Quick' limit of 4, which is 2 less than the 'Reset' limit of 6.
One radio had no failures. The other two had several. Some were
recovered by the quick method (re-association in this test), most by the
current full reset method. Here are the log file excerpts, annotated and
edited for brevity:

Code:
--------------------
  =18 quick
  2021-03-13T06:07:05-0500 MBR.23_070 failed 2021-03-13T06:06:47-0500 quick 2021-03-13T06:06:57-0500 reset up 2021-03-13T06:06:57-0500 ... Bit Rate=54 Mb/s ... Link Quality:39/94 Signal level:-56 dBm ... time=4.651 ms
  11009 119 2 0 40 1 27 12

  =44 full
  2021-03-13T08:18:45-0500 MBR.23_070 failed 2021-03-13T08:18:01-0500 quick 2021-03-13T08:18:11-0500 reset 2021-03-13T08:18:21-0500 up 2021-03-13T08:18:42-0500 ... Bit Rate=54 Mb/s ... Link Quality:42/94 Signal level:-53 dBm ... time=25.699 ms
  13790 192 5 0 48 1 30 17

  =19 quick
  2021-03-13T08:49:02-0500 LRUE.25_070 failed 2021-03-13T08:48:43-0500 quick 2021-03-13T08:48:53-0500 reset up 2021-03-13T08:48:53-0500 ... Bit Rate=54 Mb/s ... Link Quality:39/94 Signal level:-56 dBm ... time=10.708 ms
  15590 23 0 0 6 0 3 3

  =40 full
  2021-03-13T08:49:55-0500 LRUE.25_070 failed 2021-03-13T08:49:15-0500 quick 2021-03-13T08:49:25-0500 reset 2021-03-13T08:49:34-0500 up 2021-03-13T08:49:52-0500 ... Bit Rate=48 Mb/s ... Link Quality:39/94 Signal level:-56 dBm ... time=5.119 ms
  15593 24 0 0 7 0 3 4

--------------------

(Take note of the atrocious ping times: 4.651 ms, 25.699 ms, 10.708 ms,
and 5.119 ms. This is the heart of the problem: the radio does not
reliably receive packets from the access point, so numerous
re-transmissions are required.)

The first line of each of these four incidents shows the seconds taken
between the first ping failure and the first subsequent ping success.
The second 'line' is an abbreviation of the failure summary report, with
excess information (...) removed. The third line is a vector of the
number of successful pings indexed by the count of prior failed pings.
Again, in this test, the time between pings is 2 seconds, and the quick
and full failure limits are 4 and 6 respectively. None of these values
is currently reported in the log.
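
As a check on the '=18' style figures, the seconds can be recomputed
from the timestamps in a report line. The Python sketch below assumes
that the leading timestamp of the report marks the first successful
ping and that the timestamp after 'failed' marks the first failure;
that reading is inferred from the excerpts above, where it reproduces
all four figures, and is not taken from the script. The function name
`seconds_to_recover` is made up for this example.

```python
from datetime import datetime

STAMP = "%Y-%m-%dT%H:%M:%S%z"   # e.g. 2021-03-13T06:07:05-0500


def seconds_to_recover(report_line):
    """Seconds from the 'failed' timestamp to the report's leading timestamp.

    Assumption: field 0 is the time of the first good ping, and the field
    following the word 'failed' is the time of the first failed ping.
    """
    fields = report_line.split()
    recovered = datetime.strptime(fields[0], STAMP)
    first_failure = datetime.strptime(fields[fields.index("failed") + 1], STAMP)
    return int((recovered - first_failure).total_seconds())


line = ("2021-03-13T06:07:05-0500 MBR.23_070 failed "
        "2021-03-13T06:06:47-0500 quick 2021-03-13T06:06:57-0500 "
        "reset up 2021-03-13T06:06:57-0500")
print(seconds_to_recover(line))   # 18
```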

By far most pings succeed with zero intervening failures, so the 'zero'
numbers are large. One radio recovered on its own 24 times after a
single failed ping and never after two. The other recovered on its own
192 times after one failed ping and 5 times after two. Apparently,
radios suffering 3 ping failures did not recover on their own. When the
quick limit [4] or full limit [6] was reached, the software recorded
this, even though there may not have been a recovery. This was done to
see whether the quick method worked or not: if the quick method worked,
there would not be an increment to the full counter. The [7] slot
indicates a failed ping after the full reset and prior to recovery.
There are likely more at higher counts, but they were not recorded by
the test version. Interestingly, there was one recovery in the [5] slot,
indicating a delayed recovery after the quick reset method, plus other
so far unexplained anomalies (bugs?) left for later. The time between
ping tests was not reduced from 2 to 1 second; that is left for another
test. More frequent pings might have a secondary salutary effect of
improving the wireless connection somehow.

The score to date is: "13790 192 5 0 48 1 30 17" and "15593 24 0 0 7 0
3 4". The quick method worked 18 times out of 48 on one radio, and 4
times out of 7 on the other. There were, so far, no radios that
recovered on their own after 3 failed pings. This suggests that the
quick method might be applied after a much earlier failed ping, and
perhaps for each failed ping until a full reset.
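
The "worked N times out of M" figures follow from the vectors by
subtracting the full-reset slot from the quick slot, since a quick
attempt that worked leaves the full counter untouched. A small Python
sketch of that arithmetic, with a made-up function name `quick_score`
and the slot positions 4 and 6 taken from the limits used in this test:

```python
def quick_score(vector, quick_limit=4, reset_limit=6):
    """Return (quick successes, quick attempts) from a logged vector.

    Assumes slot [quick_limit] counts quick attempts and slot
    [reset_limit] counts full resets, as described for this test version.
    """
    slots = [int(n) for n in vector.split()]
    quick_attempts = slots[quick_limit]
    full_resets = slots[reset_limit]
    return quick_attempts - full_resets, quick_attempts


print(quick_score("13790 192 5 0 48 1 30 17"))   # (18, 48)
print(quick_score("15593 24 0 0 7 0 3 4"))       # (4, 7)
```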

The new script requires some more refinement and testing. "Too bad" my
environment here has improved in recent days, so there are fewer
failures to mitigate, which will make testing slower. If you are
interested in testing this, leave a message.

BTW, fiddling with the device driver and supporting software has been
resumed, and might yield far better results. Stay tuned.


------------------------------------------------------------------------
POMdev's Profile: http://forums.slimdevices.com/member.php?userid=70558
View this thread: http://forums.slimdevices.com/showthread.php?t=111663
