Hello everyone,

Wow, I am quite happy to see so much input from people.

I apologize for not providing more details.

This is not my complete dataset, but the fields I have chosen to use
are:

customer id - numeric
item id - text
postal code - text
item category - text
potential growth - text
territory - text


Basically, I was thinking of finding similar users and recommending to each
user the items that similar users have bought but that user has not.
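
To make the idea concrete, here is a rough sketch in plain Python (not
Mahout). It assumes each row reduces to a (customer id, item id) purchase
pair, and it uses Jaccard overlap as the similarity measure, which is only
one possible choice. The sample rows and the function name recommend are
made up for illustration.

```python
from collections import defaultdict


def recommend(purchases, target, top_n=3):
    """Suggest items bought by similar customers but not by `target`.

    purchases: iterable of (customer_id, item_id) pairs (assumed layout).
    """
    items_by_user = defaultdict(set)
    for customer_id, item_id in purchases:
        items_by_user[customer_id].add(item_id)

    owned = items_by_user.get(target, set())
    scores = defaultdict(float)
    for other, items in items_by_user.items():
        if other == target:
            continue
        union = owned | items
        if not union:
            continue
        # Jaccard similarity: shared items divided by all items of both users.
        similarity = len(owned & items) / len(union)
        if similarity == 0:
            continue
        # Each unseen item is credited with the similarity of its buyer.
        for item in items - owned:
            scores[item] += similarity

    # Highest score first; ties broken by item id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up sample data, not taken from my real dataset.
purchases = [
    (1, "A"), (1, "B"),
    (2, "A"), (2, "B"), (2, "C"),
    (3, "B"), (3, "D"),
    (4, "E"),
]
print(recommend(purchases, 1))
```

Customer 2 shares two items with customer 1 and customer 3 shares one, so
item C should rank ahead of item D for customer 1.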

That said, I would very much like to hear your opinions, as I am not very
familiar with clustering, classifiers, etc.

I found that Mahout takes sequence files converted into vectors, but I
could not work out how to do that with my data specifically and, more
importantly, how to build a recommender system from it.
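
For the <UID><ITEMID><PREF_VALUE> format mentioned earlier in the thread, my
rough guess at a first step is something like the following. It is plain
Python, not Mahout code. The column names customer_id and item_id, the
sample rows, and the choice of 1.0 as the preference value for every
purchase are all my own assumptions; the text item ids are simply mapped to
integers in order of first appearance.

```python
import csv
import io


def to_triples(rows):
    """Turn purchase rows into (user id, numeric item id, preference) triples.

    rows: iterable of dicts with 'customer_id' and 'item_id' keys (assumed names).
    Returns the sorted, de-duplicated triples and the text-to-integer item mapping.
    """
    item_index = {}
    triples = set()
    for row in rows:
        # Give each distinct text item id the next free integer.
        item_idx = item_index.setdefault(row["item_id"], len(item_index))
        # Assumption: a purchase counts as an implicit preference of 1.0.
        triples.add((int(row["customer_id"]), item_idx, 1.0))
    return sorted(triples), item_index


# Made-up sample standing in for my CSV file.
sample_csv = io.StringIO(
    "customer_id,item_id,item_category\n"
    "101,SKU-A,tools\n"
    "101,SKU-B,paint\n"
    "102,SKU-A,tools\n"
    "101,SKU-A,tools\n"
)

triples, item_index = to_triples(csv.DictReader(sample_csv))
for user_id, item_idx, pref in triples:
    print(f"{user_id},{item_idx},{pref}")
```

The item_index mapping would have to be kept so that recommended integer ids
can be translated back to the original item ids.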

I am also wondering how to factor in the importance of a specific customer
through the potential growth attribute.






Best Regards,
Yash Patel

On Wed, Nov 26, 2014 at 9:03 PM, Pat Ferrel <p...@occamsmachete.com> wrote:

> All very good points but note that spark-itemsimilarity may take the input
> directly since you specify column numbers for <UID><ITEMID><PREF_VALUE>
>
> On Nov 26, 2014, at 11:43 AM, parnab kumar <parnab.2...@gmail.com> wrote:
>
> Kindly elaborate on your requirements and your dataset fields, and on what
> you want to recommend to a user. Usually a set of items is recommended to
> a user. In your case, what are your items?
>
> The standard input is <UID><ITEMID><PREF_VALUE>. Clearly your data is not
> in this format, which would let you use the algorithms in Mahout directly.
>
> A little more info from your side will help us give you the right
> pointers.
>
> On Wed, Nov 26, 2014 at 7:16 PM, Yash Patel <yashpatel1...@gmail.com>
> wrote:
>
> > Dear Mahout Team,
> >
> > I am a student new to machine learning and I am trying to build a
> > user-based recommender using Mahout.
> >
> > My dataset is a CSV file, but many of its fields are text and I
> > understand Mahout needs numeric values.
> >
> > Can you give me a head start on where I should begin and what kind of
> > tools I need to parse the text columns?
> >
> > An idea of which classifiers or clustering methods I should use would
> > also be highly appreciated.
> >
> >
> > Best Regards;
> > Yash Patel
> >
>
>
