Re: Clustering without Hadoop

Shan Lu Sun, 01 Dec 2013 21:17:57 -0800

Thanks, Ted. I went through some introductions to Ball k-means and
streaming k-means, but it is still not clear to me how to implement the
algorithm without Hadoop. Do you know of any hello-world example code that
uses the non-Hadoop version of streaming k-means? Thanks.
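
For illustration only, here is a minimal sketch of the kind of in-memory,
Hadoop-free hello world being asked about. It is not Mahout's streaming
k-means implementation and uses no Mahout or Hadoop classes. The class name
StreamingKMeansSketch, the Centroid helper, and the cutoff and seed
parameters are all assumptions made up for this example. It shows only the
single-pass idea: each point either merges into its nearest centroid or,
with probability proportional to its distance, starts a new centroid. A full
streaming k-means would also grow the cutoff and collapse centroids when
there are too many; that step is left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Simplified single-pass clustering sketch (hypothetical class, not Mahout code). */
public class StreamingKMeansSketch {

    /** A centroid with a running weight (the number of points merged into it). */
    static final class Centroid {
        final double[] center;
        double weight;

        Centroid(double[] point) {
            this.center = point.clone();
            this.weight = 1.0;
        }

        /** Moves the center toward the point using a running mean. */
        void merge(double[] point) {
            weight += 1.0;
            for (int i = 0; i < center.length; i++) {
                center[i] += (point[i] - center[i]) / weight;
            }
        }
    }

    /** Euclidean distance between two vectors of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * One pass over the points. Each point merges into its nearest centroid,
     * or starts a new centroid with probability min(1, distance / cutoff).
     */
    static List<Centroid> cluster(List<double[]> points, double cutoff, long seed) {
        Random random = new Random(seed);
        List<Centroid> centroids = new ArrayList<>();
        for (double[] point : points) {
            Centroid nearest = null;
            double best = Double.POSITIVE_INFINITY;
            for (Centroid c : centroids) {
                double d = distance(c.center, point);
                if (d < best) {
                    best = d;
                    nearest = c;
                }
            }
            if (nearest == null || random.nextDouble() < best / cutoff) {
                centroids.add(new Centroid(point));
            } else {
                nearest.merge(point);
            }
        }
        return centroids;
    }

    public static void main(String[] args) {
        List<double[]> points = new ArrayList<>();
        points.add(new double[] {0.0, 0.0});
        points.add(new double[] {0.1, 0.0});
        points.add(new double[] {10.0, 10.0});
        points.add(new double[] {10.1, 10.0});

        for (Centroid c : cluster(points, 1.0, 42L)) {
            System.out.println(Arrays.toString(c.center) + " weight=" + c.weight);
        }
    }
}
```

Everything lives in plain Java collections, so no Configuration, FileSystem,
or Path objects are involved. The number of centroids it returns depends on
the cutoff and the random seed, so it should be read as a rough first pass
rather than a final clustering.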



On Sun, Dec 1, 2013 at 11:12 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> The new Ball k-means and streaming k-means implementations have non-Hadoop
> versions.  The streaming k-means implementation also has a threaded
> implementation that runs without Hadoop.
>
> The threaded streaming k-means implementation should be pretty fast.
>
>
>
> On Sun, Dec 1, 2013 at 7:55 PM, Shan Lu <shanlu...@gmail.com> wrote:
>
> > Thanks, Suneel, I'll try this way.
> >
> > In this recommender example:
> >
> > https://github.com/ManuelB/facebook-recommender-demo/blob/master/src/main/java/de/apaxo/bedcon/AnimalFoodRecommender.java#L142
> >
> > they only use the Mahout API, so I am wondering whether I can do the
> > clustering similarly.
> >
> >
> > On Sun, Dec 1, 2013 at 10:42 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
> >
> > > Shan,
> > >
> > > All of the Mahout implementations use the Hadoop API, but if you are
> > > trying to run k-means in sequential (non-MapReduce) mode, pass in
> > > runSequential = true instead of false as the last parameter to
> > > KMeansDriver.run(), or run them in LOCAL_MODE as pointed out earlier
> > > by Amit.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Sunday, December 1, 2013 10:28 PM, Shan Lu <shanlu...@gmail.com> wrote:
> > >
> > > Thanks for your reply. In the example code, they run the k-means
> > > algorithm using org.apache.hadoop.conf.Configuration,
> > > org.apache.hadoop.fs.FileSystem, and org.apache.hadoop.fs.Path
> > > parameters. Is there any algorithm that doesn't need Configuration and
> > > Path parameters and just uses data in memory? I mean, can I run the
> > > k-means algorithm without using the Hadoop API, just using Java? Thanks.
> > >
> > >
> > > On Sun, Dec 1, 2013 at 9:58 PM, Amit Nithian <anith...@gmail.com> wrote:
> > >
> > > > > When you say "without Hadoop", does that include local mode? You can
> > > > > run these examples in local mode, which doesn't require a cluster, for
> > > > > testing and poking around. Everything then runs in a single JVM.
> > > > On Dec 1, 2013 9:18 PM, "Shan Lu" <shanlu...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > > I am working on a very simple k-means clustering example. Is there a
> > > > > > way to run clustering algorithms in Mahout without using Hadoop? I am
> > > > > > reading the book "Mahout in Action". In chapter 7, the hello world
> > > > > > clustering code example uses
> > > > > ==
> > > > >
> > > > > > KMeansDriver.run(conf, new Path("testdata/points"),
> > > > > >       new Path("testdata/clusters"), new Path("output"),
> > > > > >       new EuclideanDistanceMeasure(), 0.001, 10,
> > > > > >       true, false);
> > > > >
> > > > > ==
> > > > > > to run the k-means algorithm. How can I run the k-means algorithm
> > > > > > without Hadoop?
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > Shan
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Shan Lu
> > > ECE Dept., NEU, Boston, MA 02115
> > >
> >
> >
> >
> > --
> > Shan Lu
> > ECE Dept., NEU, Boston, MA 02115
> >
>



-- 
Shan Lu
ECE Dept., NEU, Boston, MA 02115
