Re: Clustering options
Actually, I was confused between MLlib and Samsara. Thanks, Pat. I was already reading the MLlib documentation.

Cheers.

> On May 24, 2016, at 10:48, Pat Ferrel wrote:
>
> Mahout Samsara is more about rolling your own algo, though it already
> implements several as examples. If you want to build your own clustering, you
> will find a lot of what you need in the R-like DSL.
>
> But if you want something already built, you may want to look at Spark’s MLlib
> kmeans.
>
> People often ask: what is the difference between Mahout and MLlib? MLlib is a
> collection of algos; Mahout is an optimized tensor math engine with many
> extensions and several algos. You can’t do the matrix product A’B in MLlib
> because it’s not an algo, it’s a bit of math, and a very useful bit.
Re: Clustering options
Mahout Samsara is more about rolling your own algo, though it already implements several as examples. If you want to build your own clustering, you will find a lot of what you need in the R-like DSL.

But if you want something already built, you may want to look at Spark’s MLlib kmeans.

People often ask: what is the difference between Mahout and MLlib? MLlib is a collection of algos; Mahout is an optimized tensor math engine with many extensions and several algos. You can’t do the matrix product A’B in MLlib because it’s not an algo, it’s a bit of math, and a very useful bit.

> On May 23, 2016, at 8:10 PM, FRANCISCO XAVIER SUMBA TORAL wrote:
>
> Hi Dmitriy,
>
> Thanks for your clarification.
>
> Cheers.
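To make the "bit of math" concrete, here is a minimal plain-Python sketch of the product A’B. It is not Mahout's DSL or API, and it says nothing about how Mahout distributes the computation. The helper names (`transpose`, `matmul`, `at_b`) and the tiny matrices are made up for illustration. A and B are assumed to share their rows (here, papers), so entry (i, j) of A’B counts how often column i of A and column j of B occur together.

```python
# Illustrative only: a plain-Python stand-in for the product A'B.
# This is NOT Mahout's DSL; all names and data here are hypothetical.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols]
            for row in x]

def at_b(a, b):
    """Compute A'B; A and B must have the same number of rows."""
    if len(a) != len(b):
        raise ValueError("A and B must have the same number of rows")
    return matmul(transpose(a), b)

# Hypothetical data: 3 papers (rows).
# A marks which of 2 keywords each paper uses; B marks 2 topics.
A = [[1, 0],
     [1, 1],
     [0, 1]]
B = [[1, 0],
     [0, 1],
     [1, 1]]

# Entry (i, j) is the number of papers with keyword i and topic j.
cooccurrence = at_b(A, B)
print(cooccurrence)
```

A cross-occurrence matrix like this is a building block rather than a finished algorithm, which is the distinction drawn above.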
Re: Clustering options
Hi Dmitriy,

Thanks for your clarification.

Cheers.

> On May 23, 2016, at 12:00, Dmitriy Lyubimov wrote:
>
> Xavier,
>
> There are no exact equivalents in the public domain yet to the algorithms
> that existed for MapReduce (MR) clustering. My understanding is that some of
> them are on the roadmap, though.
>
> Depending on the level of sophistication you require, some of them are very
> easy to build.
Re: Clustering options
Xavier,

There are no exact equivalents in the public domain yet to the algorithms that existed for MapReduce (MR) clustering. My understanding is that some of them are on the roadmap, though.

Depending on the level of sophistication you require, some of them are very easy to build.

On Sat, May 21, 2016 at 8:46 PM, FRANCISCO XAVIER SUMBA TORAL <xavier.sumb...@ucuenca.ec> wrote:

> Hi,
>
> Since the clustering algorithms are deprecated in Mahout Samsara, how can I
> use Mahout to run a clustering algorithm? Basically, I use Mahout to cluster
> papers' keywords: I take a bunch of keywords and cluster them to find groups
> of related keywords. How can I update my code to Mahout Samsara? Any
> suggestions?
>
> Cheers
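As a rough idea of what "easy to build" can mean for the simplest case, here is a minimal sketch of Lloyd's k-means in plain Python. It is not Mahout or MLlib code and is not distributed. The function names, the first-k initialisation, and the keyword vectors are all assumptions made for illustration; a real keyword-clustering job would need a proper vectorisation step and a better initialisation.

```python
# Illustrative sketch of Lloyd's k-means in plain Python.
# NOT Mahout or MLlib code; names and data are hypothetical.

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]

def kmeans(points, k, max_iter=100):
    """Cluster points into k groups; returns (centroids, assignments)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Naive, deterministic initialisation: use the first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so we have converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return centroids, assignments

# Hypothetical keyword vectors (e.g. co-occurrence features).
keywords = ["clustering", "hadoop", "k-means", "mapreduce"]
vectors = [[1.0, 0.9, 0.0],
           [0.0, 0.1, 1.0],
           [0.9, 1.0, 0.1],
           [0.1, 0.0, 0.9]]

centroids, assignments = kmeans(vectors, k=2)
groups = {}
for word, cluster in zip(keywords, assignments):
    groups.setdefault(cluster, []).append(word)
print(groups)
```

The assignment and update steps are the whole algorithm. The sophistication mentioned above comes from everything around them: initialisation, distance measure, and running at scale.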
Clustering options
Hi,

Since the clustering algorithms are deprecated in Mahout Samsara, how can I use Mahout to run a clustering algorithm? Basically, I use Mahout to cluster papers' keywords: I take a bunch of keywords and cluster them to find groups of related keywords. How can I update my code to Mahout Samsara? Any suggestions?

Cheers
Kmeans clustering options
Hi all,

I just got the latest update to the first 6 chapters of Mahout In Action, and it still says that '-r' is an option to k-means clustering. I'm working with 0.4, and I'm not seeing it as an option off of -h.

I'm just looking for a sanity check - can you set the number of reducers for k-means?

Thanks,
Kate
RE: Kmeans clustering options
Only using a -Dmapred.reduce.tasks=n parameter. The explicit CLI argument was dropped in 0.4. Looks like the book has a typo.

-----Original Message-----
From: moving...@gmail.com [mailto:moving...@gmail.com] On Behalf Of Kate Ericson
Sent: Wednesday, April 06, 2011 5:45 PM
To: user@mahout.apache.org
Subject: Kmeans clustering options

Hi all,

I just got the latest update to the first 6 chapters of Mahout In Action, and it still says that '-r' is an option to k-means clustering. I'm working with 0.4, and I'm not seeing it as an option off of -h.

I'm just looking for a sanity check - can you set the number of reducers for k-means?

Thanks,
Kate
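For anyone scripting this, the sketch below shows one way to assemble such a command line. Only the `-Dmapred.reduce.tasks=n` property comes from the reply above. The helper name, the paths, and the `-i`, `-c`, `-o` and `-k` job options are assumptions that should be checked against the output of -h for your Mahout version. The property is placed before the job options on the assumption that Hadoop's generic option parsing expects it there.

```python
# Sketch: build a `mahout kmeans` argument list that sets the number of
# reducers through a Hadoop property. Only -Dmapred.reduce.tasks comes
# from the thread; the other options and all paths are assumptions.

def kmeans_command(input_dir, clusters_dir, output_dir, k, reducers):
    """Return the command as a list of arguments (nothing is executed)."""
    if reducers < 1:
        raise ValueError("reducers must be at least 1")
    return [
        "mahout", "kmeans",
        # Generic Hadoop property, assumed to go before the job options.
        f"-Dmapred.reduce.tasks={reducers}",
        "-i", input_dir,      # assumed: input vectors
        "-c", clusters_dir,   # assumed: initial clusters
        "-o", output_dir,     # assumed: output directory
        "-k", str(k),         # assumed: number of clusters
    ]

# Placeholder paths; replace them with real locations.
command = kmeans_command("vectors", "initial-clusters", "output",
                         k=10, reducers=4)
print(" ".join(command))
```

Building the list without running it keeps the sketch safe to try. Passing the list to `subprocess.run` would be the natural next step once the options are confirmed.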
Re: Kmeans clustering options
Thanks for the quick reply!

-Kate

On Wed, Apr 6, 2011 at 6:58 PM, Jeff Eastman jeast...@narus.com wrote:

> Only using a -Dmapred.reduce.tasks=n parameter. The explicit CLI argument
> was dropped in 0.4. Looks like the book has a typo.