Hi Mike,

Yes, I can make such a plot, and in fact I have. As you noted, there is a
strong correlation between the rank of p1 and p2/p1 for each case (error and
no-error). There is also correlation between p1 and p2. But the extent of the
correlation between p1 and p2 (or other quantities derived from them) is
beside the point.

The winner of this game is the one who can find a way to sort the symbols
into “probably good” and “probably bad” buckets with the highest accuracy.
With my two plots, I’ve attempted to show that I have a way to sort the error
and no-error cases with minimal overlap. That’s why one plot concentrates in
the upper right and the other concentrates in the lower left. We want to sort
the symbols this way so that we know which ones we should tell the
Berlekamp-Massey decoder to “ignore” when it tries to decode the symbol
vector. We’re calling these “erasure” candidates. In the algorithm that we’re
trying to tune, the “probably bad” symbols will be marked as erasures in most
of our attempts to decode.
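Schematically, the bucketing step looks something like the C sketch below.
The threshold values and the function names are illustrative placeholders,
not the actual sfrsd code:

/* Minimal sketch of the "probably good" / "probably bad" bucketing,
 * using the two statistics from the plots: the symbol's p1-rank and
 * the ratio p2/p1. Thresholds here are hypothetical placeholders. */

#define NN 63                  /* symbols per JT65 RS(63,12) codeword     */
#define RANK_THRESH  40        /* placeholder: high rank = weak p1        */
#define RATIO_THRESH 0.8f      /* placeholder: p2/p1 near 1 = ambiguous   */

/* rank[i]  : index of symbol i's p1 in a descending sort (0 = largest p1)
   ratio[i] : p2/p1 for symbol i
   erase[i] : set to 1 if symbol i is an erasure candidate to be passed
              to the Berlekamp-Massey decoder */
int mark_erasures(const int rank[NN], const float ratio[NN], int erase[NN])
{
    int nera = 0;
    for (int i = 0; i < NN; i++) {
        erase[i] = (rank[i] > RANK_THRESH) && (ratio[i] > RATIO_THRESH);
        nera += erase[i];
    }
    return nera;               /* number of erasure candidates */
}

In practice the decision is probabilistic rather than a hard yes/no; the
quoted message below describes assigning erasure probabilities by region.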
Steve k9an

> On Sep 26, 2015, at 4:16 PM, Michael Black <mdblac...@gmail.com> wrote:
>
> Can you do a plot of p1 vs p2 and show one plot with errors and one with
> correct? Since you have p1 in both axes, you automatically get a
> correlation between them.
>
> And could you post a link to download the data you have with
> p1/p2/error/correct?
>
> Thanks
> Mike W9MDB
>
> On Sat, Sep 26, 2015 at 3:49 PM, Steven Franke <s.j.fra...@icloud.com> wrote:
> Joe -
>
> Just FYI - two plots attached. These are 2D histograms of the symbol rank
> and p2/p1 ratio for all symbols associated with successfully decoded files
> from my batch of 1000 JTSim files with SNR=-24dB. Note that each of these
> quantities (rank, p2/p1) is invariant under any multiplicative scaling
> factor applied to both p1 and p2.
>
> x-axis is the symbol’s p1-rank. Lowest rank corresponds to highest p1.
>
> y-axis is p2/p1.
>
> The first plot, with yellow concentrated in the upper right, is for
> symbols that were in error. The second plot, with yellow in the lower
> left, is for symbols that are correct.
>
> Using the insight provided by these plots, I now have 709/1000 decodes
> for the batch of 1000 JTSim files at -24dB. This is using WSJT10, sfrsd,
> SFM, and ntrials=10000. The erasure probabilities are assigned by carving
> up the plane into 4 regions and assigning the highest erasure probability
> to the upper right. That’s still without using the mr2sym’s.
>
> Getting closer! As time permits, I’ll see if I can reproduce these
> results using your sfrsd2 and rsdtest.
>
> Steve k9an
> <error_symbols.png><correct_symbols.png>
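The “carving up the plane into 4 regions” step quoted above might look
schematically like this; the quadrant boundaries and probabilities below are
placeholders for illustration (only the “highest probability in the upper
right” ordering comes from the message), not the tuned sfrsd values:

#include <stdlib.h>

/* Given a symbol's p1-rank (0 = largest p1) and its p2/p1 ratio, return
 * a probability with which the symbol is marked as an erasure in one
 * stochastic decoding attempt. Split points and probabilities are
 * hypothetical. */
float erasure_prob(int rank, float ratio)
{
    int high_rank  = (rank  > 31);    /* right half: weak p1          */
    int high_ratio = (ratio > 0.5f);  /* upper half: ambiguous symbol */

    if ( high_rank &&  high_ratio) return 0.90f;  /* upper right */
    if ( high_rank && !high_ratio) return 0.40f;  /* lower right */
    if (!high_rank &&  high_ratio) return 0.20f;  /* upper left  */
    return 0.05f;                                 /* lower left  */
}

/* One trial out of ntrials: erase the symbol with that probability. */
int erase_symbol(int rank, float ratio)
{
    return ((float)rand() / RAND_MAX) < erasure_prob(rank, ratio);
}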
>> On Sep 26, 2015, at 10:27 AM, Joe Taylor <j...@princeton.edu> wrote:
>>
>> Hi Steve,
>>
>> On 9/26/2015 10:40 AM, Steven Franke wrote:
>>> Don’t worry about it, Joe - I tried scaling up my metrics by a factor
>>> of 4, and then kvasd gave me 644 decodes. So it’s clear that kvasd
>>> wants bigger numbers. For now, I’ll just focus on trying to optimize
>>> sfrsd…
>>>
>>> Your rsdtest looks like a good way to do quick runs to test sfrsd.
>>> At your leisure, can you point me to the s3_1000.bin file?
>>
>> With SVN revision 5929 I have remade the s3_1000.bin file into "stream"
>> format, which will make it easier to read in C. I posted a copy of the
>> file at
>> http://www.physics.princeton.edu/pulsar/K1JT/s3_1000.bin
>>
>>> I’m playing with “R” to do statistics on metrics. So far, it looks
>>> like the rank of p1 (index of p1 in a sorted list) and the ratio
>>> p2/p1 are the most powerful statistics for identifying symbols that
>>> are likely to be in error - at least for the SFM.
>>
>> Great! I am looking at similar things, but in a different (largely
>> qualitative) way. Will be interesting to see where this takes us.
>>
>>> Do you remember how you came up with the JTM? It looks like you use
>>> exp(x) to expand the distribution of the power associated with each
>>> symbol, and then normalize to the total exponentially-expanded power.
>>> Was this chosen empirically? Or based on a certain model for a fading
>>> channel?
>>
>> I think Ralf Koetter suggested the exp(x) to me; I think I also found it
>> in a Viterbi book. The detailed parameter choices were purely empirical.
>> We're not working with averaged transmissions yet, but the dependence on
>> "nadd" is intended to allow for different statistics of elements of
>> s3(i,j) when synchronized transmissions have been averaged.
>>
>> -- Joe
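As described in the exchange above, the JTM amounts to exponentiating the
per-symbol tone-bin powers and normalizing to the total, i.e. a softmax over
the 64 bins. A minimal sketch, with a placeholder scale factor standing in
for the empirically chosen parameters (and ignoring the "nadd" dependence):

#include <math.h>

#define NTONES 64   /* 64-FSK tone bins per JT65 data symbol */

/* s3[] : powers in the 64 tone bins for one symbol (one row of s3(i,j))
   p[]  : output metric, normalized so the entries sum to 1 */
void jtm_metric(const float s3[NTONES], float p[NTONES])
{
    const float scale = 1.0f;        /* placeholder; chosen empirically */
    float total = 0.0f;

    for (int j = 0; j < NTONES; j++) {
        p[j] = expf(scale * s3[j]);  /* expand the power distribution   */
        total += p[j];
    }
    for (int j = 0; j < NTONES; j++)
        p[j] /= total;               /* normalize to the expanded total */
}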