Thank you for the guidance.

I will try building something rough and ask questions if I run into any
errors.




On Sat, Nov 29, 2014 at 10:38 PM, Pat Ferrel <p...@occamsmachete.com> wrote:

> The Mahout site is a good starting point for using any of the recommenders.
>
> http://mahout.apache.org/users/recommender/intro-itembased-hadoop.html
>
> On Nov 29, 2014, at 1:33 PM, Yash Patel <yashpatel1...@gmail.com> wrote:
>
> Can you give me some more details on the Hadoop MapReduce item-based
> cooccurrence recommender?
>
>
> Best Regards,
> Yash Patel
>
> On Fri, Nov 28, 2014 at 7:21 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>
> > I built this app with it: https://guide.finderbots.com
> >
> > The app uses MongoDB, Ruby on Rails, and Solr 4.3. Once the model comes
> > out of the job it is CSV text, and therefore language and architecture
> > neutral. I load the data from spark-itemsimilarity into MongoDB using
> > Java. Solr is set up for full-text indexing and queries using data from
> > MongoDB. The queries are made to Solr through REST from Ruby UX code. You
> > can replace any component in this stack with whatever you wish and use
> > whatever language you are comfortable with.
> >
> > Alternatively you could modify the UI of Solr or Elasticsearch; both are
> > in Java.
> >
> > If you use any of the other Mahout recommenders, they create all recs for
> > all known users, so you’ll still need to build a way to serve those
> > results. People often use DBs for this and integrate with their web app
> > framework.
> >
> > On Nov 28, 2014, at 10:03 AM, Yash Patel <yashpatel1...@gmail.com>
> wrote:
> >
> > I looked up spark row similarity but I am not sure if it will suit my
> > needs, as I want to build my recommender as a Java application, possibly
> > with an interface.
> >
> >
> > On Fri, Nov 28, 2014 at 5:43 PM, Pat Ferrel <p...@occamsmachete.com>
> wrote:
> >
> >> Some references:
> >>
> >> small free book here, which talks about the general idea:
> >> https://www.mapr.com/practical-machine-learning
> >> preso, which talks about mixing actions or other indicators:
> >> http://occamsmachete.com/ml/2014/10/07/creating-a-unified-recommender-with-mahout-and-a-search-engine/
> >> two blog posts:
> >> http://occamsmachete.com/ml/2014/08/11/mahout-on-spark-whats-new-in-recommenders/
> >> http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/
> >> mahout docs:
> >> http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html
> >>
> >> Build Mahout from this source: https://github.com/apache/mahout
> >> It will run stand-alone on a dev machine; then, if your data is too big
> >> for a single machine, you can run it on a Spark + Hadoop cluster. The
> >> data this creates can be put into a DB or indexed directly by a search
> >> engine (Solr or Elasticsearch). Choose the search engine you want;
> >> queries made of a user’s item id history will then go there, and the
> >> results will be an ordered list of item ids to recommend.
> >>
> >> The core piece is the command line job “mahout spark-itemsimilarity”,
> >> which can parse CSV data. The options specify which columns are used
> >> for ids.
> >>
> >> Start out simple by looking only at user and item IDs. Then you can add
> >> other cross-cooccurrence indicators for multiple actions later pretty
> >> easily.
> >>
> >>
> >> On Nov 28, 2014, at 12:14 AM, Yash Patel <yashpatel1...@gmail.com>
> > wrote:
> >>
> >> The Mahout + search engine recommender seems to be the best fit for the
> >> data I have.
> >>
> >> Kindly get back to me at your earliest convenience.
> >>
> >>
> >>
> >> Best Regards,
> >> Yash Patel
> >>
> >> On Thu, Nov 27, 2014 at 9:58 PM, Pat Ferrel <p...@occamsmachete.com>
> > wrote:
> >>
> >>> Mahout has several recommenders so no need to create one from
> >>> components. They all make use of the similarity of preferences between
> >>> users; that’s why they are in the category of collaborative filtering.
> >>>
> >>> Primary Mahout Recommenders:
> >>> 1) Hadoop MapReduce item-based cooccurrence recommender. Creates all
> >>> recs for all users. Uses “Mahout IDs”.
> >>> 2) ALS-WR on Hadoop MapReduce. Uses matrix factorization to reduce noise
> >>> in the data. Sometimes better for small data sets than #1. Uses “Mahout
> >>> IDs”.
> >>> 3) Mahout + search engine: cooccurrence type. Extremely flexible, works
> >>> with multiple actions (multi-modal), works for new users that have some
> >>> history, and has a scalable server (from the search engine), but is more
> >>> difficult to integrate than #1 or #2. Uses your own ids and reads CSV
> >>> files.
> >>>
> >>> The rest of the data seems to apply either to the user or the item and
> >>> so would be used in different ways. #1 and #2 can only use user id and
> >>> item id, but some post-recommendation weighting or filtering can be
> >>> applied. #3 can use multiple attributes in different ways. For instance,
> >>> if category is an item attribute you can create two actions,
> >>> user-pref-for-an-item and user-pref-for-a-category. Assuming you want to
> >>> recommend an item (not a category), you can create a cross-cooccurrence
> >>> indicator for the second action and use the data to make your item recs
> >>> better. #3 is the only method that supports this.
> >>>
> >>> Pick a recommender and we can help more with data prep.
> >>>
> >>>
> >>> On Nov 26, 2014, at 1:34 PM, Yash Patel <yashpatel1...@gmail.com>
> > wrote:
> >>>
> >>> Hello everyone,
> >>>
> >>> Wow, I am quite happy to see so much input from people.
> >>>
> >>> I apologize for not providing more details.
> >>>
> >>> Although this is not my complete dataset, the fields I have chosen to
> >>> use are:
> >>>
> >>> customer id - numeric
> >>> item id - text
> >>> postal code - text
> >>> item category - text
> >>> potential growth - text
> >>> territory - text
> >>>
> >>>
> >>> Basically I was thinking of finding similar users and recommending them
> >>> items that users like them have bought but they haven't.
> >>>
> >>> I would very much like to hear your opinions, though, as I am not so
> >>> familiar with clustering, classifiers, etc.
> >>>
> >>> I found that Mahout takes sequence files converted into vectors, but I
> >>> couldn't understand how I would do that with my data specifically and,
> >>> more importantly, make a recommender system out of it.
> >>>
> >>> Also, I am wondering how to factor in the importance of a specific
> >>> customer through the potential growth attribute.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Best Regards,
> >>> Yash Patel
> >>>
> >>> On Wed, Nov 26, 2014 at 9:03 PM, Pat Ferrel <p...@occamsmachete.com>
> >> wrote:
> >>>
> >>>> All very good points, but note that spark-itemsimilarity may take the
> >>>> input directly since you specify column numbers for
> >>>> <UID><ITEMID><PREF_VALUE>
> >>>>
> >>>> On Nov 26, 2014, at 11:43 AM, parnab kumar <parnab.2...@gmail.com>
> >>> wrote:
> >>>>
> >>>> Kindly elaborate on your requirements, your dataset fields, and what
> >>>> you want to recommend to a user. Usually a set of items is recommended
> >>>> to a user. In your case, what are your items?
> >>>>
> >>>> The standard input is <UID><ITEMID><PREF_VALUE>. Clearly your data is
> >>>> not in this format, which would let you use the algorithms in Mahout
> >>>> directly.
> >>>>
> >>>> A little more info from your side will help us give you the right
> >>>> pointers.
> >>>>
> >>>> On Wed, Nov 26, 2014 at 7:16 PM, Yash Patel <yashpatel1...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Dear Mahout Team,
> >>>>>
> >>>>> I am a student new to machine learning and I am trying to build a
> >>>>> user-based recommender using Mahout.
> >>>>>
> >>>>> My dataset is a CSV file, but many of its fields are text and I
> >>>>> understand Mahout needs numeric values.
> >>>>>
> >>>>> Can you give me a head start on where I should begin and what kind of
> >>>>> tools I need to parse the text columns?
> >>>>>
> >>>>> Also, an idea of which classifiers or clustering methods I should use
> >>>>> would be highly appreciated.
> >>>>>
> >>>>>
> >>>>> Best Regards,
> >>>>> Yash Patel
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
> >
>
>
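
The advice above to start with only user and item IDs can be sketched roughly as follows. This is a minimal illustration in plain Python, not part of Mahout: the column names, the sample rows, and the helper names to_interactions and to_csv_lines are all made up for the example, and spark-itemsimilarity may well be able to read the original file directly by column number, as noted in the thread.

```python
import csv
import io

# Hypothetical sample using the fields listed in the thread; all values are invented.
SAMPLE = """customer_id,item_id,postal_code,item_category,potential_growth,territory
101,widget-a,10115,tools,high,north
101,widget-b,10115,tools,high,north
101,widget-a,10115,tools,high,north
102,widget-a,20095,tools,low,south
"""


def to_interactions(csv_text, user_col="customer_id", item_col="item_id"):
    """Return de-duplicated (user, item) pairs in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    pairs = []
    for row in reader:
        pair = (row[user_col].strip(), row[item_col].strip())
        if pair not in seen:  # a repeat purchase adds nothing to a boolean preference
            seen.add(pair)
            pairs.append(pair)
    return pairs


def to_csv_lines(pairs):
    """Format the pairs as two-column 'user,item' text lines."""
    return [f"{user},{item}" for user, item in pairs]


if __name__ == "__main__":
    for line in to_csv_lines(to_interactions(SAMPLE)):
        print(line)
```

The other columns (postal code, category, potential growth, territory) are simply dropped here; they could later feed extra indicators or post-recommendation filtering as described in the thread.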

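To make the cooccurrence idea discussed in this thread concrete, here is a toy sketch in plain Python. It is an assumption-laden illustration, not Mahout's implementation: it scores by raw cooccurrence counts, whereas Mahout's jobs, as far as I understand them, apply a log-likelihood ratio test and run distributed. The function names and the PAIRS data are invented for the example.

```python
from collections import defaultdict
from itertools import permutations

# Invented (user, item) interactions for illustration only.
PAIRS = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "b"), ("u2", "c"),
    ("u3", "a"), ("u3", "c"),
]


def cooccurrence(pairs):
    """Count, for each ordered item pair, how many users interacted with both."""
    items_by_user = defaultdict(set)
    for user, item in pairs:
        items_by_user[user].add(item)
    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for first, second in permutations(sorted(items), 2):
            counts[first][second] += 1
    return counts


def recommend(history, counts, top_n=3):
    """Rank unseen items by summed cooccurrence with the items in the user's history."""
    scores = defaultdict(int)
    for item in history:
        for other, n in counts.get(item, {}).items():
            if other not in history:
                scores[other] += n
    # Highest score first; ties broken alphabetically so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


if __name__ == "__main__":
    print(recommend({"b"}, cooccurrence(PAIRS)))
```

In the search-engine setup described above, the per-item lists of cooccurring items would be indexed as a field, and the user's history would be sent as the query instead of being scored in application code like this.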