On Wed, 10 Feb 2021 at 13:19, sebb <seb...@gmail.com> wrote:
>
> Likewise, commons-ml is too cryptic.
>
> Also, the Spark project has a machine-learning library:
>
> https://spark.apache.org/mllib/

Thanks for the pointer.

>
> Maybe that would be better home?

On the face of it, probably.
[For sure, Avijit should comment on the suggestion.]

On the other hand, "Commons" is the place where one can pick "bare-bones"
implementations, and add the functionality to one's application
without necessarily complying with an overarching framework.
[I don't mean that framework compliance is bad; quite the contrary, it is
hopefully the result of a thorough reflection by experts.  But ... cf. the
numerous "no-dependency" discussions ...]

Actually, concerning Avijit's proposed contribution, didn't I say:[1]
---CUT---
Thus, I think that we must assess whether the "genetic algorithms"
functionality has a reasonable future within "Apache Commons" (i.e.
potential users and contributors) while there exist other libraries that
seem much more advanced for any serious usage.
---CUT---

> I'm also a bit concerned as to whether there are sufficient developers
> here with knowledge of the ML domain to be able to support the code in
> the future.

An interesting point; by no means a new one (see e.g. [2]).

Isn't it the same point I've been making about "Commons Math" (CM)?
There have been no releases because nobody here is able (or willing)
to support it.

Concerning the support of the purported "machinelearning" component:
1. Package
        org.apache.commons.math4.ml.neuralnet
    * I've written it entirely and I have applications that depend on it (and I
      cannot assume that I could easily switch to, or port it to, Spark), so I
      can reasonably ensure that it would be supported.
2. Package
        org.apache.commons.math4.ml.clustering
    * Functionality is mentioned in Spark's "mllib" user guide.
    * When a new feature was last contributed[3], it was noticed[4][5][6]
      that improvements were needed (but there was no follow-up).
    * I have an application that depends on it (from CM v3.6.1) but I wouldn't
      support it if shipped in CM v4.0.
3. Package
        org.apache.commons.math4.genetics
    * Part of my "end-of-study" project consisted of a GA implementation.
      I've never used the CM implementation, and I don't deny that there
      could be perfectly fine uses of it but, just looking at the code, it seems
      obvious that it cannot compete feature-wise with other libraries
      out there.
    * I've suggested long ago that, without anyone supporting it actively (and
      no known user community), it should be dropped from CM.
    * Avijit expressed a willingness to improve the functionality:  Is this
      enough for the PMC to create a new component?  From the experience with
      the "clustering" package mentioned above, I'd tend to think
      (unfortunately) that it isn't.  He should first explore whether the
      Spark community is interested in having the GA functionality moved
      over there.

Gilles

[1] https://issues.apache.org/jira/browse/MATH-1563
[2] https://markmail.org/message/26yxj5vhysdsoety
[3] https://issues.apache.org/jira/projects/MATH/issues/MATH-1509
[4] https://issues.apache.org/jira/projects/MATH/issues/MATH-1524
[5] https://issues.apache.org/jira/projects/MATH/issues/MATH-1528
[6] https://issues.apache.org/jira/projects/MATH/issues/MATH-1526

>
> On Wed, 10 Feb 2021 at 08:27, Emmanuel Bourg <ebo...@apache.org> wrote:
> >
> > -1 for commons-ml for the same reasons.
> >
> > What about commons-machine-learning or commons-math-learning? The latter
> > is as long as commons-configuration.
> >
> > Emmanuel Bourg
> >
> >
> > On 2021-02-10 03:27, Ralph Goers wrote:
> > > -1 on commons-ml as the name. My first thought is such a repo would
> > > hold stuff related to mailing lists. Then again maybe it contains
> > > stuff relating to markup languages. Maybe it is Apache’s version of
> > > the ML Programming Language [1].
> > >
> > > However, I wouldn’t be -1 on commons-math-ml, although at best I would
> > > be +0 since it is still not obvious what it would contain.
> > >
> > > Ralph
