On Fri, Dec 2, 2011 at 6:07 PM, Daniel Zohar <[email protected]> wrote:
> I definitely agree that the correctness should not be broken. My solution
> is not meant to decrease the number of possible items like you stated in
> your example. It was meant to reduce the amount of item-user associations
> (while preserving user-item associations), which will result in much less
> effort in intersectionSize(). Even in the case that we have two popular

My point is that intersectionSize() is called as part of a similarity computation. Yes, that's the bottleneck. But that happens after the stage where candidate items are identified, and you are talking about changing the candidate identification stage, which is not the bottleneck. I think your change *happens* to also reduce the number of similarity computations, because it assumes some are 0 when they are not! Sure, that saves time, in the same way that you'll finish an exam faster if you don't answer half the questions.

I am instead suggesting that we optimize intersectionSize() itself, so that for all of these 1-item cases the answer is computed extremely fast -- something like the sketch at the end of this mail. That also addresses the bottleneck, of course.

I suppose this could be proven or disproven quickly -- do you get the same speedup with the change I committed, without your change? If you do, great, we have a solution. If not, then I am wrong and you have some example that pinpoints where the new bottleneck is.
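To make the fast-path idea concrete, here is a minimal sketch -- not the committed patch, and the plain java.util.Set here is just a stand-in for whatever set type actually backs the preference data:

  import java.util.Set;

  final class IntersectionSketch {
    // Count |a ∩ b| by iterating the smaller set and probing the larger one.
    // When a user has only one associated item, the loop runs once, so the
    // 1-item cases cost a single hash lookup instead of a full scan.
    static int intersectionSize(Set<Long> a, Set<Long> b) {
      Set<Long> smaller = a.size() <= b.size() ? a : b;
      Set<Long> larger = smaller == a ? b : a;
      int count = 0;
      for (Long id : smaller) {
        if (larger.contains(id)) {
          count++;
        }
      }
      return count;
    }
  }

The point is that nothing has to be thrown away to get the speedup: the degenerate cases just become cheap, rather than being skipped.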
