Improving quality of item similarities?

2013-02-14 Thread Julian Ortega
Hi everyone. I have a data set that looks like this:

Number of users: 198651
Number of items: 9972

Statistics of purchases from users:
mean number of purchases: 3.3
stdDev number of purchases: 3.5
min number of purchases: 1
max number of purchases: 176
median number

Re: Improving quality of item similarities?

2013-02-14 Thread Sean Owen
Yes, I don't know if removing that data would improve results. It might mean you can compute things faster, at little or no observable loss in quality of the results. I'm not sure, but you probably have repeat purchases of the same item, and items of different value. Working in that data may help
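
One possible reading of the suggestion above, as a minimal Python sketch rather than anything stated in the thread: fold repeat purchases and item value into a single preference weight per (user, item) pair before computing item similarities. The function names, the log damping, and the `prices` mapping are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def preference_strength(count, price=None):
    """Hypothetical weighting: damp repeat purchases with a log so that
    176 purchases do not count 176 times as much as one purchase."""
    strength = 1.0 + math.log(count)  # count >= 1, so strength >= 1.0
    if price is not None:
        # Assumed scaling by item value; log1p keeps expensive items
        # from dominating the weights.
        strength *= math.log1p(price)
    return strength

def weighted_preferences(purchases, prices=None):
    """purchases: iterable of (user, item) pairs, repeats allowed.
    prices: optional dict mapping item -> price (hypothetical input).
    Returns a dict mapping (user, item) -> preference weight."""
    prices = prices or {}
    counts = defaultdict(int)
    for user, item in purchases:
        counts[(user, item)] += 1
    return {
        (user, item): preference_strength(n, prices.get(item))
        for (user, item), n in counts.items()
    }

purchases = [("u1", "a"), ("u1", "a"), ("u2", "a"), ("u2", "b")]
prices = {"a": 10.0, "b": 100.0}
prefs = weighted_preferences(purchases, prices)
```

Here `u1`, who bought item `a` twice, ends up with a stronger preference for it than `u2`, who bought it once. Whether weights like these beat plain boolean purchase data is an empirical question for the data set at hand.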