This problem is much more commonly referred to as the cold start problem, and it 
is far smaller than many authors assume. Typically a dozen good interactions are 
plenty to get good recommendation performance, and half a dozen suffice to do 
pretty well. 

Obviously, if you are using ratings then most of your audience will never give 
you that much data.  If you use implicit data then you are likely to get that 
much data in the first few minutes of use, and you can accelerate even that 
with good UI design. 
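
To make the implicit-data point concrete, here is a minimal sketch in plain 
Python (not Mahout code). It collapses a raw event log into boolean preferences 
and lists the users who already have enough distinct interactions. The event 
log, the set of event types treated as positive, and the function names are all 
assumptions made up for illustration. 

```python
from collections import defaultdict

# Hypothetical event log: (user_id, item_id, event_type). Illustrative data only.
events = [
    ("u1", "a", "view"), ("u1", "b", "click"), ("u1", "a", "view"),
    ("u2", "a", "view"), ("u2", "c", "purchase"),
    ("u3", "b", "search"),
]

# Assumption: these event types count as a positive implicit signal.
POSITIVE_EVENTS = {"view", "click", "purchase"}


def implicit_preferences(log):
    """Collapse raw events into the set of distinct items each user touched."""
    prefs = defaultdict(set)
    for user, item, event_type in log:
        if event_type in POSITIVE_EVENTS:
            prefs[user].add(item)
    return dict(prefs)


def users_with_enough_data(prefs, minimum=6):
    """Return, in sorted order, the users with at least `minimum` distinct items."""
    return sorted(user for user, items in prefs.items() if len(items) >= minimum)


prefs = implicit_preferences(events)
# The toy log is tiny, so a threshold of 2 stands in for "half a dozen".
print(users_with_enough_data(prefs, minimum=2))
```

Repeated events on the same item count once here, which is the usual boolean 
treatment of implicit data; weighting by event type is a reasonable variation. 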

There is still a small cold start problem, even if it is much smaller than some 
assume.  Typically, this can be dealt with using an anonymous model, a 
semi-anonymous model, or a combination of the two.  Both are supported in Mahout. 

Sent from my iPhone

On Apr 2, 2012, at 4:49 PM, ziad kamel <ziad.kame...@gmail.com> wrote:

> CF suffers from the data sparsity problem, where users only rate a
> small set of items. That makes the computation of similarity between
> users imprecise and consequently reduces the accuracy of CF
> algorithms.
> http://www.jucs.org/jucs_17_4/a_clustering_approach_for
> 
> 
> 
> On Sun, Apr 1, 2012 at 1:20 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> Could you say a bit more about what you mean?  Which data sparsity problem?
>> 
>> Sent from my iPhone
>> 
>> On Apr 1, 2012, at 6:35 AM, ziad kamel <ziad.kame...@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>> Is there any ways that mahout CF can overcome the data sparsity problem?
>>> 
>>> Thanks
