Yes, it is. Does Mahout have an example, or something similar, that does this: read the gzip file, construct the page set, form a vector for each user, then run as rabbit?
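To make the steps concrete, here is a minimal sketch in plain Python of what I mean. It is not Mahout code. The log format is an assumption (one `user<TAB>page` pair per line inside the gzip file), and the function names `read_visits`, `build_vectors` and `group_identical` are made up for illustration. The grouping step only puts users with identical vectors together; a real clustering algorithm such as k-means would replace it on real data.

```python
import gzip
from collections import defaultdict


def read_visits(path):
    """Yield (user, page) pairs from a gzip file.

    Assumed format (hypothetical): one 'user<TAB>page' pair per line.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                user, page = line.split("\t", 1)
                yield user, page


def build_vectors(visits):
    """Return (pages, vectors): the sorted page set and one 0/1 vector per user."""
    visits = list(visits)
    pages = sorted({page for _, page in visits})
    index = {page: i for i, page in enumerate(pages)}
    vectors = {}
    for user, page in visits:
        vec = vectors.setdefault(user, [0] * len(pages))
        vec[index[page]] = 1  # 1 means the user visited this page
    return pages, vectors


def group_identical(vectors):
    """Group users whose vectors are exactly equal (a stand-in for clustering)."""
    groups = defaultdict(list)
    for user, vec in vectors.items():
        groups[tuple(vec)].append(user)
    return list(groups.values())
```

Usage would be something like `pages, vectors = build_vectors(read_visits("visits.gz"))` followed by `group_identical(vectors)`, where `visits.gz` is a placeholder file name.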

On Fri, Nov 9, 2012 at 11:47 AM, Sean Owen <sro...@gmail.com> wrote:

> That's a clustering problem, no?
>
> On Fri, Nov 9, 2012 at 4:43 PM, qiaoresearcher <qiaoresearc...@gmail.com> wrote:
>
> > It is a supervised classification problem.
> >
> > For example, a very simple case:
> > say, overall we collect 4 pages from the data set:
> > { web_page 1  web_page 2  web_page 3  web_page 4 }
> > then users may have input vectors like:
> > user1 [1 1 0 0]
> > user2 [1 1 0 0]
> > user3 [0 0 1 1]
> > user4 [0 0 1 1]
> > user5 [0 0 1 1]
> > ... ....
> >
> > then whatever classification algorithm mahout has should return
> > classification results as
> > group 1 { user1, user2 }
> > group 2 { user3, user4, user5 }