Thanks Ted, I got it working now. I was casting to floats just to make the output easier on the eyes for emailing. I was missing Math.exp(...), and once I had that in, it all started working. It then became obvious that the final output was not a rank but a score that determines the ordering.
I even put that in a separate class and called it TedsJitter (implements Jitter, since I am trying some other jittering approaches).

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Ted Dunning <[email protected]>
> To: [email protected]
> Sent: Tuesday, June 30, 2009 5:35:36 PM
> Subject: Re: Inconsistent recommendations
>
> Otis,
>
> There are several substantive problems with your code, mostly due, I am
> sure, to my posting R code which is unfamiliar. The most important that I
> see off-hand is that the exponential random variable must be defined as:
>
> - Math.log(1 - Math.random())
>
> The idea is that the argument to log must be in the range (0, 1] so that the
> result will be in the range [0, inf). The 1-Math.random() is that way
> because the range of Math.random() is [0, 1) instead of (0, 1].
>
> I have a few style beefs that I hope you will take in good humor as well.
>
> I will make comments about both of these in-line.
>
> On Tue, Jun 30, 2009 at 10:27 AM, Otis Gospodnetic <
> [email protected]> wrote:
>
> >
> > // exp(-n/5) + rexp() * 0.1
> > for (int i=1; i < 20; i++) {
> >
> You should use double for all of this code, otherwise your code may be
> considerably slower than desired due to float/double conversions and also
> since we are doing exp of some potentially good sized numbers, it is very
> easy to run out of dynamic range for floats leading to very surprising
> results. This is essentially a style question, but I find it to be a very
> bad idea to do this kind of premature optimization of floating point
> arithmetic.
>
> > float exp = (float) i / 5; // not sure why you used -i /n
> >
> -i/n was used because that will lead to doing exp(negative number). For
> large negative numbers, the slope of exp() becomes very flat which makes
> large rearrangements possible. Without the negation, the randomization will
> have a very different effect.
>
> > float rexp = (float) Math.log(i-Math.random()); // tried with 1 instead of i like you said, too
> >
> The 1 is critical as mentioned above.
>
> > float rank = exp + rexp * 0.1f;
> >
> I don't see a call to Math.exp anywhere. Perhaps it got lost? That would
> probably explain a large part of the problems. Also, this is not the rank,
> but rather the synthetic score. Thus this is a misleading name.
>
> > float round = Math.round(rank);
> >
> I don't think that you want to round like this. Instead, what you should be
> doing is accumulating the scores in an array and then sorting the scores.
> What I displayed was a permutation that resulted from sorting.
>
> >
> > System.out.println("EXP: " + exp + "\tREXP: " + rexp + "\tRANK: " +
> > rank + "\tROUND: " + round);
> > }
> >
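
For reference, the corrections discussed in this thread can be put together roughly as follows. This is only a minimal sketch, not the TedsJitter class mentioned above: the names JitterSketch, rexp, score and jitteredOrder are hypothetical, and sorting by descending score (highest synthetic score first) is an assumption about the intended direction. It uses double throughout, draws the exponential variable as -Math.log(1 - u), applies Math.exp to -i/5, and sorts the accumulated scores instead of rounding them.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

/** Hypothetical helper (not from the thread): jittered scores turned into an ordering. */
final class JitterSketch {

    /** Exponential(1) sample. 1 - nextDouble() lies in (0, 1], so the result is in [0, inf). */
    static double rexp(Random rng) {
        return -Math.log(1.0 - rng.nextDouble());
    }

    /** Synthetic score for the item at 1-based position i: exp(-i/5) + rexp() * 0.1. */
    static double score(int i, Random rng) {
        return Math.exp(-i / 5.0) + rexp(rng) * 0.1;
    }

    /** Returns the original positions 1..n, reordered by descending synthetic score (assumed direction). */
    static int[] jitteredOrder(int n, Random rng) {
        double[] scores = new double[n];
        Integer[] positions = new Integer[n];
        for (int i = 0; i < n; i++) {
            scores[i] = score(i + 1, rng);   // accumulate scores; no rounding
            positions[i] = i + 1;
        }
        // Sort positions by their score, highest first.
        Arrays.sort(positions, Comparator.comparingDouble((Integer p) -> -scores[p - 1]));
        int[] order = new int[n];
        for (int k = 0; k < n; k++) {
            order[k] = positions[k];
        }
        return order;
    }

    public static void main(String[] args) {
        // Prints one random permutation of 1..19; the output differs from run to run.
        System.out.println(Arrays.toString(jitteredOrder(19, new Random())));
    }
}
```

Because exp(-i/5) flattens out as i grows, the 0.1-scaled noise tends to shuffle items far down the list more than items near the top, which is the rearrangement effect described in the quoted message.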
