On Tuesday, July 22, 2014 4:21:49 AM UTC-5, Viral Shah wrote:
Wow, better than scikit.learn? This is exciting.
We are not there yet. However, the work to unify our many regression
packages has already started. If we keep up this pace, the goal won't be
too far away.
We should probably discuss in the roadmap issue what infrastructure
we need to support large-scale distributed machine learning problems.
-viral
On Monday, July 21, 2014 4:08:14 AM UTC+5:30, Dahua Lin wrote:
Please see https://github.com/JuliaStats/MLBase.jl/blob/master/NEWS.md
for recent updates.
Also, the documentation has been moved from the README to Sphinx docs:
http://mlbasejl.readthedocs.org/en/latest/
We already have quite a few packages for various machine learning
tasks:
- MLBase.jl (https://github.com/JuliaStats/MLBase.jl): data preprocessing,
  performance evaluation, cross validation, model tuning, etc.
- Distance.jl (https://github.com/JuliaStats/Distance.jl): metric/distance
  computation (including batch pairwise computation)
- MultivariateStats.jl (https://github.com/JuliaStats/MultivariateStats.jl):
  multivariate analysis, ridge regression, dimensionality reduction
- Clustering.jl (https://github.com/JuliaStats/Clustering.jl): K-means,
  K-medoids, affinity propagation
- NMF.jl (https://github.com/JuliaStats/NMF.jl): nonnegative matrix
  factorization
In addition, we have a number of other packages for regression, GLM, SVM,
etc. We are now beginning to unify the efforts in this domain (see the
discussion at https://github.com/JuliaStats/Roadmap.jl/issues/14).
We have been making steady progress, and I believe that we will have a
great machine learning ecosystem, one that is comparable or even superior
to scikit.learn, in the not-too-distant future.
Cheers,
Dahua