Hi.

Le sam. 22 févr. 2020 à 14:37, CT <chentao...@qq.com> a écrit :
>
> Hi Gilles:
>   I really appricate for your patient to help me to solve the mail sending 
> problem, I try to set the only setting about charset "Use Unicode" for this 
> mail.

The contents seems fine.  However, there is something wrong: replying to your
message sends to you instead of the mailing list...

>   I have created a pull request: 
> https://github.com/apache/commons-math/pull/118

That PR fails the build:
   
https://travis-ci.org/apache/commons-math/builds/653293451?utm_source=github_status&utm_medium=notification

Also, the code doesn't take into account all the remarks which I
made in previous comments (e.g. file JIRA reports to allow tracking
of changes to existing code and not mix those with new code).

In addition, you should avoid mixing massive cosmetic changes (e.g.
Javadoc reformatting) within a commit than contains other types of
change:
    
https://github.com/apache/commons-math/pull/118/commits/47a055e6264d084547854f9290461f020f2131cf

>   And a comparsion between KMeans and MiniBatchKMeans by using the 
> "org.apache.commons.math4.userguide.ClusterAlgorithmComparison":

Functionality seems to work fine, but ability to review incremental
changes is extremely important in a collaborative project.

Thanks,
Gilles


>
> Hello.
>
> Le mar. 18 févr. 2020 à 04:49, 陈 涛 <chentao...@hotmail.com> a écrit :
> >
> > Hi Gilles:
> >
> >    I really do not know if anyone received my last mail, no one replay me 
> > for a long time so I send it again and copy to you with another email 
> > system.
>
> Sorry for the delay. :-}
>
> >
> > > Some remarks:
> >
> > > * I didn't get why the "KMeansPlusPlusCentroidInitializer" class
> > > does not call the existing "KMeansPlusPlusClusterer".
> > > Code seems duplicated: As there is a case for reuse, the currently
> > > "private" centroid initialization code should be factored out.
> >
> > This is alpha version for discuss the "MiniBatchKMeansClusterer" algorithm,
>
> I guess that you mean that we discuss your implementation of the
> algorithm referenced in the Javadoc.
>
> > and when "centroidOf" is extracted from "KMeansPlusPlusClusterer",
> >
> > the "KMeansPlusPlusClusterer" is not "KMeansPlusPlusClusterer" anymore but 
> > "KMeansClusterer",
>
> I don't follow.
>
> >
> > this is a significant change, so I did not reactor it.
>
> Significant changes are welcome (since the next release will contain
> other major changes anyways) if they improve the code base (like e.g.
> reducing code duplication).
>
> >
> >
> >
> > > * In "CentroidInitializer", I'd rename "chooseCentroids" to emphasize
> > > that a computation is performed (as opposed to selecting from an
> > > existing data structure).
>
> I think I'd prefer "selectCentroids".
>
> >
> > It is extract from "KMeansPlusPlusClusterer.centroidOf", should remain be 
> > "centroidOf"?
>
> I don't understand.
>
> It would be easier if you create a pull request, so that we can clearly see
> what codes are added/removed/changed.
>
> >
> > The subclass "RandomCentroidInitializer" and 
> > "KMeansPlusPlusCentroidInitializer" indicate the algorithm used.
> >
> >
> > > * Not convinced that there should be so many constructors (in most
> > > cases, it makes no sense to pick default values that are likely to
> > > be heavily application-dependent.
> >
> > I can add more constructors.
>
> No, the constructors with default values clutter the API, for
> no obvious gain IMHO.
> [If the default values make sense, they must be documented.]
>
> >
> > I'd like a builder class more than constructors, but does not meet the 
> > historical code style.
>
> Now is the time for improving the API.
> It would be quite helpful to create a report on JIRA with "sub-tasks"
> for all such API proposed changes.
>
> > > * Input should generally be validated: e.g. the maximum number of
> > > iterations should not be changed unwittingly; rather, an exception
> > > should be raised if the user passed a negative value.
> >
> > Thanks for your advices, I will improve these.
> >
> > > Could be nice to illustrate (not just with a picture, but in a table
> > > with entries average over several runs) the differences in result
> > > between the implementations, using various setups (number of
> > > clusters, stopping criterion, etc.).
> >
> > I will make more tests, include benchmarks.
> >
> > It is a challenge for me to generate the various kinds of test data,
> >
> > could anybody supply me the test data of this comparsion: 
> > http://commons.apache.org/proper/commons-math/userguide/ml.html
>
> They are generated programmatically; code is here:
>     
> https://gitbox.apache.org/repos/asf?p=commons-math.git;a=tree;f=src/userguide/java/org/apache/commons/math4/userguide
>
> [I've just updated the codes so that they compiles and run using
> the new dependencies (see the "README" file).]
>
> > > "MT_64" is probably not the best default.  And this is one of the
> > > parameters from which there should not be a default IMO.
> >
> > I will do more tests
>
> You don't need to test the generators; users should choose
> by themselves (from those provided in "Commons RNG").
>
> >
> > > [Note: there are spurious characters in your message (see e.g. the
> > > paragraph quoted just above) that make it difficult to read.]
> >
> > I had well format my mail in my mail box, it may been changed by the Mail 
> > List service.
> >
> > I will try various kinds of mail editor. It will helpful if you told me 
> > which mail editor is work well with the ML.
>
> It's probably an encoding thing (setting it to "UTF-8" should be
> fine).
>
> Best regards,
> Gilles
>
> > > [...]
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to