It will be subscribed shortly. Otis will be off the mod list as well. Sean is tuning the moderators list.

On Fri, Mar 29, 2013 at 4:47 PM, Dan Filimon <dangeorge.fili...@gmail.com> wrote:

> Oops. Sorry about that!
> The issue seems to be that ReviewBoard automatically CCs the dev list but
> it's apparently not subscribed.
>
>
> On Fri, Mar 29, 2013 at 6:36 PM, Otis Gospodnetic <
> otis.gospodne...@gmail.com> wrote:
>
> > FYI, I'm getting a lot of these (and not moderating any more due to lack
> > of time)
> >
> > Otis
> > --
> > Solr & ElasticSearch Support
> > http://sematext.com/
> >
> > ---------- Forwarded message ----------
> > From: <dev-reject-1364573050.63309.haimnphidmmapikej...@mahout.apache.org>
> > Date: Fri, Mar 29, 2013 at 12:04 PM
> > Subject: MODERATE for dev@mahout.apache.org
> > To:
> > Cc: dev-allow-tc.1364573050.abpdchciinoejcdfjbch-noreply=reviews.apache....@mahout.apache.org
> >
> > To approve:
> > dev-accept-1364573050.63309.haimnphidmmapikej...@mahout.apache.org
> > To reject:
> > dev-reject-1364573050.63309.haimnphidmmapikej...@mahout.apache.org
> > To give a reason to reject:
> > %%% Start comment
> > %%% End comment
> >
> > ---------- Forwarded message ----------
> > From: "Dan Filimon" <dangeorge.fili...@gmail.com>
> > To: "Sebastian Schelter" <s...@apache.org>, "Ted Dunning" <tdunn...@apache.org>
> > Cc: "Dan Filimon" <dangeorge.fili...@gmail.com>, "mahout" <dev@mahout.apache.org>
> > Date: Fri, 29 Mar 2013 16:04:08 -0000
> > Subject: Re: Review Request: MAHOUT-1181: Adds StreamingKMeans MapReduce classes
> >
> > This is an automatically generated e-mail.
> > To reply, visit: https://reviews.apache.org/r/10193/
> >
> > On March 29th, 2013, 1:48 p.m., *Sebastian Schelter* wrote:
> >
> > core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansDriver.java
> > <https://reviews.apache.org/r/10193/diff/1/?file=276345#file276345line203>
> > (Diff revision 1)
> > None
> > {'text': ' private void configureOptionsForWorkers() throws ClassNotFoundException, IllegalAccessException,', 'line': 175}
> > 203 log.info("No measure class given, using EuclideanDistanceMeasure");
> >
> > Why not make euclidean distance the default value of the distance
> > measure option?
> >
> > I forgot to do that myself because the option is in
> > DefaultOptionCreator. Fortunately, the default set there,
> > SquaredEuclideanDistance is a great default, probably better than
> > EuclideanDistance. So, I just removed this chunk of code entirely.
> >
> >
> > On March 29th, 2013, 1:48 p.m., *Sebastian Schelter* wrote:
> >
> > core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansDriver.java
> > <https://reviews.apache.org/r/10193/diff/1/?file=276345#file276345line309>
> > (Diff revision 1)
> > None
> > {'text': ' private void configureOptionsForWorkers() throws ClassNotFoundException, IllegalAccessException,', 'line': 175}
> > 309 log.error("Measure class not found " + measureClass, e);
> >
> > program should throw an exception and terminate if the distance
> > measure class cannot be found, right?
> >
> > Indeed. I removed the try/catch.
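
The fail-fast behaviour agreed on in the exchange above (no try/catch around the class lookup, so a missing class terminates the job) can be sketched in plain Java. This is only an illustration under assumptions, not the code from the patch: the class name FailFastLoading and the method instantiate are hypothetical, and only standard java.lang reflection calls are used.

```java
/** Illustrative sketch only; FailFastLoading and instantiate are hypothetical names. */
final class FailFastLoading {

    private FailFastLoading() {
    }

    /**
     * Loads className and creates an instance through its no-argument
     * constructor. Failures are not caught and logged here: a missing class
     * surfaces as ClassNotFoundException, so the caller stops instead of
     * continuing with a half-configured job.
     */
    static <T> T instantiate(String className, Class<T> expectedType)
            throws ReflectiveOperationException {
        // Throws ClassNotFoundException when the class is not on the classpath.
        Class<?> raw = Class.forName(className);
        // asSubclass throws ClassCastException when the type does not match.
        return raw.asSubclass(expectedType).getDeclaredConstructor().newInstance();
    }
}
```

A driver method that calls such a helper would simply declare the exception in its own throws clause, so the failure reaches the top level and ends the run.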
> >
> >
> > On March 29th, 2013, 1:48 p.m., *Sebastian Schelter* wrote:
> >
> > core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansDriver.java
> > <https://reviews.apache.org/r/10193/diff/1/?file=276345#file276345line315>
> > (Diff revision 1)
> > None
> > {'text': ' private void configureOptionsForWorkers() throws ClassNotFoundException, IllegalAccessException,', 'line': 175}
> > 315 log.error("Searcher class not found " + measureClass, e);
> >
> > program should throw an exception and terminate if the searcher
> > class cannot be found, right?
> >
> > Yep, same as above.
> >
> >
> > - Dan
> >
> > On March 29th, 2013, 4:03 p.m., Dan Filimon wrote:
> > Review request for mahout, Ted Dunning and Sebastian Schelter.
> > By Dan Filimon.
> >
> > *Updated March 29, 2013, 4:03 p.m.*
> > Description
> >
> > This depends (loosely) on https://reviews.apache.org/r/10194/
> >
> > This patch implements the MapReduce version of StreamingKMeans for
> > MAHOUT-1154.
> >
> > It adds 5 new classes:
> > - CentroidWritable: class representing a centroid that can be written
> >   to a SeqFile
> > - StreamingKMeansDriver: class implementing AbstractJob that is the
> >   entry point to the mapreduction
> > - StreamingKMeansMapper: mapper, running StreamingKMeans (see
> >   MAHOUT-1162) clustering the points one by one
> > - StreamingKMeansReducer: reducer, running BallKMeans (see
> >   MAHOUT-1162) a number of times and picking the clustering with the
> >   lowest total clustering cost.
> >   The cost is determined by randomly splitting the incoming centroids
> >   into a "training" and "test" set, computing the centroids on the
> >   training set and the cost on the test set. The intent is to see
> >   whether the centroids actually describe the distribution of the points
> >   or not.
> > - StreamingKMeansUtilsMR: helper class with a method to instantiate a
> >   searcher from a Configuration.
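
The reducer's selection step described in the list above (several clustering runs on a training split, scored on a held-out test split) can be sketched in plain Java. This is a rough illustration under stated assumptions, not the code under review: randomCenters is a hypothetical stand-in for a BallKMeans run, points are bare double[] arrays rather than Mahout centroids, the cost is assumed to be the sum of squared distances to the nearest center, and testFraction is an invented parameter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; none of these names come from the patch. */
final class HoldoutSelectionSketch {

    private HoldoutSelectionSketch() {
    }

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Total cost: each point contributes the squared distance to its nearest center. */
    static double totalCost(List<double[]> points, List<double[]> centers) {
        double cost = 0.0;
        for (double[] p : points) {
            double best = Double.POSITIVE_INFINITY;
            for (double[] c : centers) {
                best = Math.min(best, squaredDistance(p, c));
            }
            cost += best;
        }
        return cost;
    }

    /** Hypothetical stand-in for one clustering run: picks up to k training points as centers. */
    static List<double[]> randomCenters(List<double[]> train, int k, Random rng) {
        List<double[]> shuffled = new ArrayList<>(train);
        Collections.shuffle(shuffled, rng);
        return new ArrayList<>(shuffled.subList(0, Math.min(k, shuffled.size())));
    }

    /**
     * Splits the points randomly into test and training sets, runs the
     * stand-in clusterer numRuns times on the training set, and returns the
     * centers with the lowest cost on the test set. Assumes the training
     * set is non-empty and numRuns is at least 1.
     */
    static List<double[]> selectBest(List<double[]> points, int k, int numRuns,
                                     double testFraction, Random rng) {
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, rng);
        int testSize = (int) Math.round(testFraction * shuffled.size());
        List<double[]> test = shuffled.subList(0, testSize);
        List<double[]> train = shuffled.subList(testSize, shuffled.size());

        List<double[]> best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int run = 0; run < numRuns; run++) {
            List<double[]> centers = randomCenters(train, k, rng);
            double cost = totalCost(test, centers);
            if (cost < bestCost) {
                bestCost = cost;
                best = centers;
            }
        }
        return best;
    }
}
```

Scoring on points the clusterer never saw is what lets the cost indicate whether the centers describe the overall distribution rather than just the training split.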
> >
> > Additionally, there is a test class StreamingKMeansTestMR that tests
> > the mapper, reducer and mapper and reducer together using MRUnit.
> >
> > !!!
> > Since MRUnit is now a dependency, the core pom.xml file adds MRUnit as
> > a dependency. We depend on snapshot 1.0 which is not yet released (it
> > will be very soon), hence the updated pom.xml is not provided for now.
> > !!!
> >
> > Testing
> >
> > See StreamingKMeansTestMR for the tests. These are all performed on
> > data sampled from a "hypercube" distribution (there are multinormal
> > distributions in each vertex of the cube).
> > Additionally there are ongoing tests on the 20 newsgroups data set
> > (and some more are on the way).
> >
> > Diffs
> >
> > - core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/CentroidWritable.java
> >   (PRE-CREATION)
> > - core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansDriver.java
> >   (PRE-CREATION)
> > - core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansMapper.java
> >   (PRE-CREATION)
> > - core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansReducer.java
> >   (PRE-CREATION)
> > - core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansUtilsMR.java
> >   (PRE-CREATION)
> > - core/src/test/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansTestMR.java
> >   (PRE-CREATION)
> > - src/conf/driver.classes.default.props (ac45eef)
> >
> > View Diff <https://reviews.apache.org/r/10193/diff/>