Hi,
I was reading this function in the source code of Taste, and I have some
questions about how it calculates similarity:
public double itemSimilarity(long itemID1, long itemID2) throws TasteException {
  int preferring1and2 = dataModel.getNumUsersWithPreferenceFor(itemID1, itemID2);
  if (preferring1and2 == 0) {
    return Double.NaN;
  }
  int preferring1 = dataModel.getNumUsersWithPreferenceFor(itemID1);
  int preferring2 = dataModel.getNumUsersWithPreferenceFor(itemID2);
  int numUsers = dataModel.getNumUsers();
  double logLikelihood =
      twoLogLambda(preferring1and2, preferring1 - preferring1and2,
                   preferring2, numUsers - preferring2);
  return 1.0 - 1.0 / (1.0 + logLikelihood);
}

static double twoLogLambda(double k1, double k2, double n1, double n2) {
  double p = (k1 + k2) / (n1 + n2);
  return 2.0 * (logL(k1 / n1, k1, n1) + logL(k2 / n2, k2, n2)
      - logL(p, k1, n1) - logL(p, k2, n2));
}
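
The code above calls logL, which is not shown. To make the formula easier to follow, here is a minimal standalone sketch. It assumes logL is the binomial log-likelihood k*log(p) + (n-k)*log(1-p), with a guard so that log(0) terms contribute 0. The class name LogLikelihoodSketch and the helpers logL, safeLog and toSimilarity are my own guesses for illustration, not copied from the Taste source.

```java
// Hypothetical sketch: logL, safeLog and toSimilarity are assumptions for
// illustration; only twoLogLambda mirrors the code quoted above.
final class LogLikelihoodSketch {

    // Assumed guard: treat the log of a non-positive value as 0,
    // so that terms of the form 0 * log(0) vanish instead of producing NaN.
    static double safeLog(double d) {
        return d <= 0.0 ? 0.0 : Math.log(d);
    }

    // Assumed binomial log-likelihood of k successes in n trials with
    // success probability p (the constant binomial coefficient is omitted).
    static double logL(double p, double k, double n) {
        return k * safeLog(p) + (n - k) * safeLog(1.0 - p);
    }

    // Same expression as the quoted twoLogLambda: twice the log of the ratio
    // between "two separate rates k1/n1 and k2/n2" and "one pooled rate p".
    static double twoLogLambda(double k1, double k2, double n1, double n2) {
        double p = (k1 + k2) / (n1 + n2);
        return 2.0 * (logL(k1 / n1, k1, n1) + logL(k2 / n2, k2, n2)
            - logL(p, k1, n1) - logL(p, k2, n2));
    }

    // Same mapping as the last line of itemSimilarity: 0 maps to 0,
    // and larger scores approach (but never reach) 1.
    static double toSimilarity(double logLikelihood) {
        return 1.0 - 1.0 / (1.0 + logLikelihood);
    }

    public static void main(String[] args) {
        // Identical rates in both samples: the score is 0.
        double same = twoLogLambda(10, 10, 20, 20);
        // Very different rates (10/10 versus 0/10): the score is large.
        double different = twoLogLambda(10, 0, 10, 10);
        System.out.println("same rates:      score=" + same
            + " similarity=" + toSimilarity(same));
        System.out.println("different rates: score=" + different
            + " similarity=" + toSimilarity(different));
    }
}
```

Under these assumptions, the score is 0 when both samples have the same rate and grows as the rates diverge, and the final 1 - 1/(1 + x) step squashes it into the range [0, 1).
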
Is there an academic paper that describes this function? Why should the
similarity be calculated with the formula above?
Thanks!