This problem is much more commonly referred to as the cold-start problem, and it is far smaller than many authors assume. Typically, a dozen good interactions are plenty to get good recommendation performance, and half a dozen suffice to do pretty well.
Obviously, if you are using ratings, then most of your audience will never give you that much data. If you use implicit data, then you are likely to get that much data in the first few minutes of use, and you can accelerate even that with good UI design. There is still a small cold-start problem, even if it is much smaller than some assume. Typically, this can be dealt with using an anonymous model, a semi-anonymous model, or a combination of the two. Both are supported in Mahout.

Sent from my iPhone

On Apr 2, 2012, at 4:49 PM, ziad kamel <ziad.kame...@gmail.com> wrote:

> CF suffers from the data sparsity problem, where users only rate a
> small set of items. That makes the computation of similarity between
> users imprecise and consequently reduces the accuracy of CF
> algorithms.
> http://www.jucs.org/jucs_17_4/a_clustering_approach_for
>
> On Sun, Apr 1, 2012 at 1:20 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> Could you say a bit more about what you mean? Which data sparsity problem?
>>
>> Sent from my iPhone
>>
>> On Apr 1, 2012, at 6:35 AM, ziad kamel <ziad.kame...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Is there any ways that mahout CF can overcome the data sparsity problem?
>>>
>>> Thanks
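
To make the anonymous-model idea from the reply above a bit more concrete, here is a rough sketch in plain Python. It is not Mahout's API. It assumes implicit data (clicks or views, no ratings), builds item co-occurrence counts from known users, and then scores items for a visitor who has no stored profile and only a couple of interactions in the current session. The function names, the toy histories, and the summed co-occurrence scoring are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(histories):
    """Count how often each pair of items appears in the same user's history."""
    cooc = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend_anonymous(cooc, session_items, top_n=3):
    """Score unseen items by summed co-occurrence with the session's items."""
    seen = set(session_items)
    scores = defaultdict(int)
    for item in seen:
        for other, count in cooc.get(item, {}).items():
            if other not in seen:
                scores[other] += count
    # Sort by score (descending), then by item name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical implicit data: items each known user clicked or viewed.
histories = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b", "d"],
    "u3": ["b", "c", "d"],
    "u4": ["a", "c"],
}
cooc = build_cooccurrence(histories)

# An anonymous visitor who has only touched two items so far.
recs = recommend_anonymous(cooc, ["a", "b"])
print(recs)
```

The point of the sketch is that the visitor never needs a row in the stored data: the session's items are used only at query time, so even two or three implicit interactions produce a ranked list.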