It is a problem, but collisions should be rare. IDs are hashed to 31-bit integers, so the probability that any particular pair of IDs collides is small. However, you don't need very many items before it becomes probable that some two of them have collided. (IIRC, that happens at around 2^(31/2) items, i.e. roughly 46,000.)
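To put rough numbers on that, here is a small sketch of the usual birthday-problem approximation. It assumes the hash spreads IDs uniformly over 2^31 values, which is an idealization; the function names are made up for illustration and are not part of any library.

```python
import math


def collision_probability(n_items: int, bits: int = 31) -> float:
    """Approximate chance that at least two of n_items share a hash value.

    Assumes hashes are uniform over 2**bits values (birthday approximation).
    """
    space = 2 ** bits
    return 1.0 - math.exp(-n_items * (n_items - 1) / (2.0 * space))


def items_for_probability(p: float, bits: int = 31) -> int:
    """Approximate number of items at which the collision chance reaches p."""
    space = 2 ** bits
    return math.ceil(math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p))))


def expected_colliding_pairs(n_items: int, bits: int = 31) -> float:
    """Expected number of pairs of items that hash to the same value."""
    space = 2 ** bits
    return n_items * (n_items - 1) / (2.0 * space)


if __name__ == "__main__":
    n = round(2 ** (31 / 2))  # the rule-of-thumb item count mentioned above
    print(n, collision_probability(n))
    print(items_for_probability(0.5))
    print(expected_colliding_pairs(1_000_000))
```

Under that assumption, the chance of at least one collision is already close to 40% at 2^(31/2) items and passes 50% a little above 54,000 items. Even so, with a million items the expected number of colliding pairs is only in the low hundreds, so very few items are actually affected.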
In practice it doesn't hurt much. It just means that data from two different items gets mixed up and treated as if it all came from one item. That's not ideal, but it has a tiny overall effect on recommendations.

Another practical tip: if your item IDs already fit into an unsigned int (more precisely, into the 31-bit non-negative range the hash maps to), then the hash function won't mix them up at all, as each of them will hash to itself.

2011/9/20 张玉东 <zhangyud...@vancl.cn>:
> I am having trouble with this problem: if two item IDs are mapped to the same index,
> then how do you compute the similarity between them?