In case this thread is still a good place to offer help, I'd love to pitch in. I have built a few production recommenders, most recently with Mahout at a large retailer, where my partner and I used ALS. The pipeline transformed XML transactions into vectors with Pig, ran them through Mahout, and wrote the results to Cassandra. I've also used Hadoop extensively and have written non-trivial UDF libraries for Pig, so I am comfortable in that world.

Where to start? Do you guys have bugs ranked in priority?

Thanks
Andrew

On Wed, Mar 27, 2013 at 5:01 PM, Daniel Longest <dlong...@gmail.com> wrote:

> > Have you used Mahout in some context before? Did you already checkout the
> > code base and build it? It usually helps if you are looking into solving a
> > specific problem with the project to find an area that interests you
> > personally.
>
> I worked through the Coursera ML course, which is how I heard about
> Mahout. I then read through some of Mahout in Action. I have checked
> out the codebase, had difficulty building it on Windows, but I've done
> it successfully on Ubuntu tonight so full steam ahead. Not sure I have
> a specific interest, seen a lot of good ideas thrown around on this
> thread this week. Ted had listed some possible GSOC ideas that I'd be
> open to if no students took them on. I see the JIRA has an "intro"
> tag, was going to explore some of those a bit to get my feet wet.
>
> Daniel