Re: [wsjt-devel] WSPR decoder: Jelinek vs Fano

Joe Taylor Mon, 03 Aug 2015 06:25:31 -0700

Hi Steve, Edson, and all,

I'm happy with setting bias = 0.42, given that several quite different 
tests show that it results in a rate of false decodes smaller than 1 in 
1000.


As for reporting to WSPRnet: my preference is to report all decodes to 
the database site. It's not so very unusual for a new callsign to turn 
up -- perhaps to be decoded only once, especially if band-hopping is in 
use.  The occasional long distance paths with unusual propagation can be 
especially interesting, even (or especially?) when they are one-off 
events.  It's relatively easy to and apply further sanitizing of data at 
the database site, or elsewhere when any analysis of the data is done.
        -- Joe, K1JT

On 8/1/2015 7:33 PM, Steven Franke wrote:
> Hi Joe,
>
>> On Aug 1, 2015, at 1:02 PM, Joe Taylor<j...@princeton.edu>  wrote:
>>
>> Hi Steve,
>>
>> Visiting family are arriving later than expected, so I found some
>> unexpected time to play with wsprd and wsprd_exp.  To make the KA9Q fano
>> decoder work with your change, I had to increase the size of the node[]
>> structure by one bit:
>
>>
>
> Oops - you are right. Why didn’t I get into trouble? Presumably, there was 
> some slack space after the malloc that allowed me to write outside of the 
> allocation without clobbering anything?
>
>> ##############################################################################
>>
>> pulsar:~/jtsdk/src/wsjtx/lib/wsprd$ svn diff
>> Index: fano.c
>> ===================================================================
>> --- fano.c      (revision 5745)
>> +++ fano.c      (working copy)
>> @@ -105,7 +105,7 @@
>>     unsigned int lsym;
>>     unsigned int i;
>>
>> -  if((nodes = (struct node *)malloc(nbits*sizeof(struct node))) == NULL) {
>> +  if((nodes = (struct node *)malloc((nbits+1)*sizeof(struct node))) ==
>> NULL) {
>>       printf("malloc failed\n");
>>       return 0;
>>     }
>> @@ -172,7 +172,7 @@
>>         }
>>         np[1].gamma = ngamma;                     // Move forward
>>         np[1].encstate = np->encstate<<  1;
>> -      if(++np == lastnode) {
>> +      if(++np == lastnode+1) {
>>          break;                                  // Done!
>>         }
>> ##############################################################################
>>
>>
>> Here are some new results, which essentially confirm your findings.
>> Both programs (wsprd and wsprd_exp) were compiled with the above changes
>> made to fano.c.  Just for fun, I timed each run separately after
>> compilation with clang-3.5 and gcc.
>>
>>     Command                         T   B   G  Time  Time
>>                                               clang   gcc
>> ---------------------------------------------------------------
>> 1. wsprd                          124  0  124  25.1  24.2
>> 2. wsprd_exp                      130  7  123  33.7  32.2
>> 3. wsprd_exp -J -C 5000           125  3  122  33.1  32.3
>> 4. wsprd_exp -z 0.46              122  0  122  34.8  33.3
>> 5. wsprd_exp -z 0.46 -J -C 5000   122  0  122  33.8  33.5
>> 6. wsprd_exp -z 0.5               120  0  120  34.6  33.3
>> 7. wsprd_exp -z 0.5 -J -C 5000    119  0  119  34.0  33.1
>>
>> T = total decodes
>> B = bad decodes
>> G = good decodes
>>
>> Some history concerning fano.c may be informative.  Phil Karn's code had
>>
>>          tail =&nodes[nbits-33];
>>
>> where we now have
>>
>>          tail =&nodes[nbits-31];
>>
>> I made this change a long time ago, for reasons that seemed right to me
>> at the time.  Probably I should also have changed the "Done" condition
>> as you have now done.
>>
>> Why did I need to increase the memory allocation for nodes, while you
>> did not?  Possibly the whole thing is not quite right, yet, though your
>> test on the final metric seems to indicate that it's OK.
>
> I think that it’s working correctly now. At least, it is visiting the same 
> nodes and coming up with the same path metrics as Jelinek. And the Jelinek 
> algorithm is so simple that I’m pretty sure that it’s doing the right thing.
>
> I find the fano.c book-keeping approach confusing (though it makes sense from 
> an efficiency/execution time point of view). Specifically, in fano.c each 
> node entry stores an encoder state that includes up to the i’th bit, but it 
> stores the metric only up through the previous (i-1)th bit.
>
> This quirk is why we need to malloc space for one extra node, because the 
> final metric (for the 81-bit long frame) is actually stored in the 82nd entry 
> of the node[] struct array...
>
> Jelinek works differently - because it never backtracks, it can efficiently 
> store the i’th bit and the metric up through the i’th bit in the same node.
>
>>
>>
>> One more thing escapes me at the moment.  Why does case 2 give different
>> results from case 1?  They both use the corrected fano decoder with
>> maxcycles=10000 and bias=0.42.
>
> wsprd_exp uses a different sync algorithm - basically a nested-grid approach 
> in an attempt to improve the efficiency of the sync search. It also tries to 
> refine the drift estimate. This means that the candidate symbol vectors that 
> are submitted to the decoder are somewhat different in wsprd and wsprd_exp. 
> Up until I saw your files I was convinced that wsprd_exp was the superior 
> sync algorithm. Now, not so much.
>
> I’m happy with how fano.c is behaving now. I intend to make the 
> malloc/lastnode changes unless I hear otherwise.
>
> Perhaps we should turn attention to the philosophical question of where to 
> set the bias. I ran some simulations to study how bias affects the 
> probability of decode and probability of bad decode under controlled 
> conditions.
>
> First I ran tests using wsprsim-generated files with -31 dB snr to test the 
> effect of the change in the fano decoder. For each simulation run I generated 
> and decoded 10000 .c2 files. Since all simulated files have the message “K9AN 
> EN50 33”, it was easy to find and count all frames with errors, not just the 
> ones with a forward slash. Here are the results:
>
> wsprd (original fano.c, bias=0.42)
> 5599 decodes, 25 bad decodes (13 bad with “/“), in 1750s (total of 5574 good 
> decodes, 55.7%)
>
> wsprd (modified fano.c, bias=0.42)
> 5578 decodes, 17 bad decodes (4 bad with “/“), in 2225s (total of 5571 good 
> decodes, 55.7%)
>
> This shows that decoding the 81st zero decreased the probability of a bad 
> frame by 33% while having a negligible affect on the probability of 
> successfully decoding a frame.
>
> If, in addition, we raise the bias to 0.46:
>
> wsprd (modified fano.c, bias=0.46)
> 4690 decodes, 8 bad decodes (1 bad with “/“), in 2510s
>
> And with bias=0.50 wsprd (modified fano.c)
> 4174 decodes, 3 bad decodes (1 bad with “/“), in 2669s
>
> In raising the bias from 0.42 to 0.46 we lose more than 900 good decodes just 
> to suppress 9 bad decodes, a 100:1 ratio. From 0.42 to 0.50 we lose more than 
> 1400 to suppress 14 bad ones - still 100:1. Personally, I don’t like to give 
> away so many good decodes.
>
> This brings me back to Edson’s suggestion… Why not run with low bias and just 
> use Edson’s mechanism to filter what gets sent to wsprnet.org?
>
> Steve k9an
>
>
>>
>>      -- Joe
>>
>> On 7/31/2015 11:45 PM, Steven Franke wrote:
>>> Joe,
>>>
>>>> On these files I ran test cases like yours, plus two more, with the
>>>> following results:
>>>>
>>>>     Command                         T   B   G  time
>>>> -----------------------------------------------------------
>>>> 1. wsprd                          125  2  123  25 s
>>>> 2. wsprd_exp                      126 13  113  29
>>>> 3. wsprd_exp -J -C 5000           115  3  112  30
>>>> 4. wsprd_exp -J -C 5000 -z 0.5    112  0  112  30
>>>> 5. wsprd_exp -z 0.5               113  1  112  30
>>>> 6. wsprd5000                      120  0  120  23
>>>> 7. wsprd -z 0.5                   119  0  119  26
>>>>
>>>> T = total decodes
>>>> B = bad decodes
>>>> G = good decodes
>>>>
>>>> Case #6 used a re-compiled wsprd with maxcycles=5000.
>>>>
>>>> On these files, at least, it seems that the best performer is the
>>>> deafult wsprd, Case #1: 123 good decodes, 2 false decodes.  With
>>>> maxcycles=5000 (Case #6) it yields 120 good decodes and 0 false decodes.
>>>>   With bias=0.5 (Case #7) it gives 119 good decodes and 0 false decodes.
>>>>
>>>> Seems to me we should probably go back to bias=0.5, and possibly reduce
>>>> maxcycles a bit.
>>>
>>> I’m on the same page regarding the need to decrease the number of 
>>> bit-errors in the decoded frames. But the thing that has been bothering me 
>>> the most is the vastly different results from Fano and Jelinek when 
>>> presented with the same candidates. All of my background reading suggests 
>>> that these two algorithms should visit the same nodes, given enough cycles. 
>>> And yet, I was unable to get Jelinek to produce the same bad decodes, even 
>>> when I allowed a huge number of cycles and a huge stack size. So I’ve been 
>>> suspicious about a programming error or bug.
>>>
>>> I think that I’ve found at least a partial explanation for the bad behavior 
>>> of Fano. I’ve concluded that fano.c, as configured, is not decoding all 31 
>>> tail bits. It is stopping at 30 bits. I discovered this after printing out 
>>> the final metric for decoded frames. I noticed that on a clean high-snr 
>>> test file, as generated by wsprsim, the final metric for a Fano decode was 
>>> 800, whereas it is 810 for Jelinek. Similarly all of the final metrics 
>>> produced by Fano were smaller than those produced by Jelinek.
>>>
>>> The following change makes the high-snr final metric 810 for Fano, and in 
>>> fact makes all final metrics from Fano and Jelinek identical:
>>>
>>> $ svn diff
>>> Index: wsprd/fano.c
>>> ===================================================================
>>> --- wsprd/fano.c    (revision 5740)
>>> +++ wsprd/fano.c    (working copy)
>>> @@ -172,7 +172,7 @@
>>>         }
>>>         np[1].gamma = ngamma;                     // Move forward
>>>         np[1].encstate = np->encstate<<   1;
>>> -      if(++np == lastnode) {
>>> +      if(++np == lastnode+1) {
>>>     break;                                  // Done!
>>>         }
>>>
>>>
>>> With this change, the results for cases 1 and 2 improve to:
>>> 1.wsprd 124 0 124
>>> 2. wsprd_exp 130 7 123 (compare to your case 2 — it is counter to my 
>>> intuition that we get more good decodes by decoding further into the tail)
>>>
>>> Also, I got different results for your case 3. This shouldn’t be affected 
>>> by the fano.c change, so I wonder if you just mis-copied the numbers on 
>>> this one:
>>>
>>> 3. wsprd_exp -JC 5000 125 3 122
>>>
>>> Finally, note that a metric bias of 0.46 eliminates bad decodes with either 
>>> decoder in wsprd_exp (using the modified fano.c):
>>> wsprd_exp -z 0.46 122 (0) in 19s
>>> wsprd_exp -Jz 0.46 -C 5000 122 (0) in 19s
>>>
>>> So - at this point I am inclined to make the change to fano.c so that we 
>>> decode all 31 tail zeros (provided that you agree with my interpretation of 
>>> what’s going on). I’m also inclined to raise the default bias in wsprd to 
>>> at least 0.46. Or even 0.5 if you prefer, though this may be less necessary 
>>> with the fano fix.
>>>
>>> I also like Edson’s idea. I think that his suggestion could be a win-win by 
>>> essentially eliminating bad decodes in the database and allowing us to run 
>>> a little closer to the ragged edge in terms of what is displayed locally. I 
>>> imagine that spots that are not in ALL_WSPR and therefore are not 
>>> uploadable could be displayed in a different color to make it clear that 
>>> they are only for local consumption.
>>>
>>> Steve k9an
>>>
>>>>
>>>>    -- Joe
>>>>
>>>> ------------------------------------------------------------------------------
>>>> _______________________________________________
>>>> wsjt-devel mailing list
>>>> wsjt-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/wsjt-devel
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> _______________________________________________
>>> wsjt-devel mailing list
>>> wsjt-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/wsjt-devel
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> wsjt-devel mailing list
>> wsjt-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/wsjt-devel
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> wsjt-devel mailing list
> wsjt-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/wsjt-devel

------------------------------------------------------------------------------
_______________________________________________
wsjt-devel mailing list
wsjt-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wsjt-devel

Re: [wsjt-devel] WSPR decoder: Jelinek vs Fano

Reply via email to