Hi Gilles,
------------------ Original ------------------
From:&nbsp;"GillesSadowski"<gillese...@gmail.com&gt;;
Date:&nbsp;Wed, Feb 26, 2020 05:41 PM
To:&nbsp;"Commons Developers List"<dev@commons.apache.org&gt;;

Subject:&nbsp;Re: [math] Discuss: New feature MiniBatchKMeansClusterer



>[...]

>> Do you mean this:
>>  * For JIRA issue #MATH-1509, start a PR with
>>    "MiniBatchKMeansClusterer" but without "ClusterUtils", despite the
>>    duplicate code between "MiniBatchKMeansClusterer" and
>>    "KMeansPlusPlusClusterer"; include "CentroidInitializer" and the
>>    test code in a single commit.
>>  * Suggestions like "remove the constructors with default parameters"
>>    should be applied as a new commit in the PR above, and tracked by
>>    a subtask of JIRA issue #MATH-1509.
>>  * File a new JIRA issue for the duplicate code, and start another PR
>>    that adds "ClusterUtils" and extracts the duplicate code into it.
>
>No, you should start with the smallest possible self-contained PR.
>For example, why should we commit code that defines several
>constructors, while we already know that a second commit should
>remove them?
>
>As you've noticed that some functionality must be factored out of
>"KMeansPlusPlusClusterer", this should be done first as a separate
>JIRA issue. IIUC, you propose "ClusterUtils". By reviewing a
>minimal PR, we should be able to examine whether another
>approach might be better (than a "utility" class) in order to expose
>functionality common to all clusterer algorithms.
>For example, could all "Kmeans" implementations inherit from
>a common base class?

Do you mean I should file a JIRA issue about reusing "centroidOf" and
"chooseInitialCenters", then start a PR and a discussion about
"ClusterUtils"?
And then start the PR for "MiniBatchKMeansClusterer" once all of that
is done?
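
To make the question concrete, here is a rough sketch of the two options
as I understand them: a "utility" class versus a common base class. It is
only an illustration under my own assumptions, not the actual Commons Math
code. The class names "KMeansSharingSketch", "AbstractKMeansSketch" and
"SingleClusterSketch" are hypothetical, points are plain double[] instead
of "Clusterable", and the "centroidOf" signature is simplified.

```java
import java.util.Arrays;
import java.util.List;

public class KMeansSharingSketch {

    /** Option A: stateless helper, a stand-in for the proposed "ClusterUtils". */
    static final class ClusterUtils {
        private ClusterUtils() {}

        /** Component-wise mean; assumes a non-empty list of points of length "dimension". */
        static double[] centroidOf(List<double[]> points, int dimension) {
            double[] centroid = new double[dimension];
            for (double[] p : points) {
                for (int i = 0; i < dimension; i++) {
                    centroid[i] += p[i];
                }
            }
            for (int i = 0; i < dimension; i++) {
                centroid[i] /= points.size();
            }
            return centroid;
        }
    }

    /** Option B: hypothetical base class; shared steps are protected methods. */
    abstract static class AbstractKMeansSketch {
        protected double[] centroidOf(List<double[]> points, int dimension) {
            // Delegates to option A only to keep this sketch short.
            return ClusterUtils.centroidOf(points, dimension);
        }

        /** Each variant (k-means++, mini-batch, ...) would supply its own loop. */
        abstract double[][] cluster(List<double[]> points, int dimension);
    }

    /** Toy subclass: one cluster holding every point, just to exercise option B. */
    static final class SingleClusterSketch extends AbstractKMeansSketch {
        @Override
        double[][] cluster(List<double[]> points, int dimension) {
            return new double[][] { centroidOf(points, dimension) };
        }
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(new double[] {0, 0}, new double[] {2, 4});
        // Both lines print [1.0, 2.0]
        System.out.println(Arrays.toString(ClusterUtils.centroidOf(points, 2)));
        System.out.println(Arrays.toString(new SingleClusterSketch().cluster(points, 2)[0]));
    }
}
```

With option A the clusterers stay unrelated and just call the helper; with
option B the shared steps become part of a class hierarchy, which seems to
be what a minimal first PR would let us review.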

>[...]
