Hello.

On Tue, 18 Jul 2023 at 17:30, Dimitrios Efthymiou
<efthymiou.dimitri...@gmail.com> wrote:
>
>  Hello everyone. I am working on the modularisation of
> the legacy ml.clustering package to a new module:
> commons-math-clustering. Some clustering classes
> depend on stat.moment.Variance

The new modules must not depend on any class
in the "legacy" module.

Hopefully, [Statistics] will soon (?) contain a brand-new
"Variance"[1] which the new module can depend on.
In the meantime, there may be other issues that
can be tackled.[2]

> and some of
> the ml.distance classes.

I guess that "Variance" and "Distance" are not used for
the same purpose.

> 1--those distances belong to geometry probably and
> not machine learning. Manhattan distance, for example.

For the foreseeable future, [Geometry] will only deal with 1D,
2D and 3D, i.e. physical space with a low and fixed dimension.
In machine learning, the space is routinely high-dimensional
and the dimension varies from problem to problem.
The implementation(s) must therefore handle the dimension at runtime.
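
To illustrate what "handled at runtime" means here, a minimal sketch
follows. All names in it (DistanceSketch, Distance, MANHATTAN, EUCLIDEAN)
are hypothetical and chosen for illustration only; this is not the
existing ml.distance API nor a proposal for the new module. The point is
just that the dimension is read from the array length on each call
rather than fixed by the type.

```java
/** Hypothetical sketch: distances over points of arbitrary dimension. */
final class DistanceSketch {
    /** Distance between two points given as coordinate arrays. */
    interface Distance {
        double compute(double[] a, double[] b);
    }

    /** Sum of absolute coordinate differences (L1 norm). */
    static final Distance MANHATTAN = (a, b) -> {
        checkSameDimension(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    /** Square root of the sum of squared coordinate differences (L2 norm). */
    static final Distance EUCLIDEAN = (a, b) -> {
        checkSameDimension(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            final double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    /** The dimension is only known at runtime, so it is checked at runtime. */
    private static void checkSameDimension(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Dimension mismatch: " + a.length + " != " + b.length);
        }
    }

    public static void main(String[] args) {
        // Two points in a 5-dimensional space; any dimension would do.
        final double[] p = {1, 2, 3, 4, 5};
        final double[] q = {2, 4, 6, 8, 10};
        System.out.println(MANHATTAN.compute(p, q)); // prints 15.0
        System.out.println(EUCLIDEAN.compute(p, q));
    }
}
```

The same two instances serve a 2-dimensional and a 500-dimensional
problem alike, which is different from a design built around fixed
1D/2D/3D point types.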

> 2--should I move the distance package to the new
> clustering module so that they are together or create a new
> commons-math-distance module?

It depends on whether the distance classes will be useful for
more than just the clustering functionality.

> Or put the distance classes
> in the commons-math-geometry project?

No, for the reason given above.

Regards,
Gilles

[1] https://issues.apache.org/jira/browse/STATISTICS-71
[2] https://issues.apache.org/jira/issues/?jql=project%20%3D%20MATH%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20text%20~%20%22cluster%22

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
