2013/10/9 Peter Prettenhofer :
> great - thanks Lars - will prepare a PR
I just realized that I forgot to benchmark the sparse case as well.
There, having a C-ordered RHS can still give a speed boost:
>>> X = fetch_20newsgroups_vectorized().data
>>> Y = np.random.randn(X.shape[1], 20)
>>> %timeit
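A minimal sketch of how such a comparison could be set up. It uses a synthetic sparse matrix as a stand-in for the 20 newsgroups data; the shape, density and repeat count are assumptions for illustration, not the values from the benchmark above, and the timings will vary by machine and SciPy version.

```python
import timeit

import numpy as np
import scipy.sparse as sp

# Synthetic stand-in for fetch_20newsgroups_vectorized().data
# (shape and density are assumed, not taken from the real dataset).
rng = np.random.RandomState(0)
X = sp.random(2000, 5000, density=0.001, format="csr", random_state=rng)

# The same dense right-hand side, once in C (row-major) order and
# once in Fortran (column-major) order.
Y_c = np.ascontiguousarray(rng.randn(X.shape[1], 20))
Y_f = np.asfortranarray(Y_c)

# Time the sparse-times-dense product for both memory layouts.
t_c = timeit.timeit(lambda: X.dot(Y_c), number=20)
t_f = timeit.timeit(lambda: X.dot(Y_f), number=20)
print("C-ordered RHS: %.4fs, F-ordered RHS: %.4fs" % (t_c, t_f))
```

Both products give the same result; only the memory layout of the right-hand side differs.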
The six implementation used in this code comes from the
sklearn.externals package, not the system six package: it should be
independent from any other installed six package.
Maybe it's a bug in the Python version. What is the minor version?
--
Olivier
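To answer the version question and see which copy of six is actually picked up, something along these lines could be run on the affected machine. The helper name `locate` is made up for this sketch, and `sklearn.externals.six` only exists in scikit-learn releases that still bundle it.

```python
import importlib
import sys

# Full interpreter version, including the minor and micro numbers.
print("Python %d.%d.%d" % sys.version_info[:3])


def locate(module_name):
    """Return the file a module is loaded from, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__file__", None)


# The copy bundled with scikit-learn versus any separately installed six.
for name in ("sklearn.externals.six", "six"):
    print("%s -> %s" % (name, locate(name)))
```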
---
Peter implemented "penalized SVD" with SGD for "Netflix
challenge"-style matrix factorization problems:
http://code.google.com/p/pyrsvd/
It should be a pretty good baseline to compare performance against.
As for missing data, I would just use scipy.sparse matrices and treat
non-materialized zero
Thanks Kyle, this does seem to be some sort of incompatibility with the "six"
module in our Python distribution on MacOS 10.8.
CentOS 6.3 with sklearn 0.15-git works, but CentOS 5.2 with the same version does not.
For now I will just use a custom build until I can figure out exactly what the
problem is.
I replied on the issue tracker:
https://github.com/scikit-learn/scikit-learn/issues/2498
I am running the current master under OSX 10.8 as well and I cannot
reproduce either. Nor can the continuous integration (jenkins or
travis, both are green).
--
Olivier
The basic algorithms are not too bad - I have implemented PMF and KPMF
(probabilistic matrix factorization and kernelized PMF) in Python fairly
recently. Non-optimized, but the runtime was still not terrible - one form
of PMF is basically a tweaked stochastic gradient descent.
I talked (very brief
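To illustrate the "tweaked stochastic gradient descent" view of matrix factorization mentioned above, here is a minimal sketch. It is not pyrsvd and not a full PMF implementation: the function name `sgd_factorize`, the hyperparameters and the toy ratings are all made up for illustration. It assumes that only the stored entries of a scipy.sparse matrix count as observed ratings.

```python
import numpy as np
import scipy.sparse as sp


def sgd_factorize(R, n_factors=2, lr=0.02, reg=0.02, n_epochs=500, seed=0):
    """Approximate R by P.dot(Q.T) using SGD over the stored entries only."""
    R = sp.coo_matrix(R)
    rng = np.random.RandomState(seed)
    P = 0.1 * rng.randn(R.shape[0], n_factors)  # row (user) factors
    Q = 0.1 * rng.randn(R.shape[1], n_factors)  # column (item) factors
    for _ in range(n_epochs):
        # Visit the observed entries in a fresh random order each epoch.
        for k in rng.permutation(R.nnz):
            i, j, r = R.row[k], R.col[k], R.data[k]
            p_i = P[i].copy()
            err = r - p_i.dot(Q[j])
            # Gradient step on the squared error with an L2 penalty.
            P[i] += lr * (err * Q[j] - reg * p_i)
            Q[j] += lr * (err * p_i - reg * Q[j])
    return P, Q


# Toy 4x4 rating matrix with 8 observed entries (hypothetical data).
rows = np.array([0, 0, 1, 1, 2, 2, 3, 3])
cols = np.array([0, 1, 0, 2, 1, 3, 2, 3])
vals = np.array([5.0, 3.0, 4.0, 1.0, 1.0, 5.0, 2.0, 4.0])
R = sp.coo_matrix((vals, (rows, cols)), shape=(4, 4))

P, Q = sgd_factorize(R)
pred = np.sum(P[R.row] * Q[R.col], axis=1)
rmse = np.sqrt(np.mean((R.data - pred) ** 2))
print("RMSE on observed entries: %.3f" % rmse)
```

The pure-Python inner loop is slow on real data; it is only meant to show the shape of the update.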
Ryan,
Default CentOS 5.2 is python 2.4-ish right? Did you install a separate
Python distribution?
I have had a lot of trouble getting Anaconda to integrate with the system
packages in CentOS - maybe this is something similar? No idea about the OSX
side of things, though.
Kyle
On Wed, Oct 9, 201
Currently, the go-to solutions for prototyping recommender systems are
MyMediaLite and Graphchi. Both are command-line tools, implemented in
C# and C++ respectively. It would be useful to have tools for prototyping
recommender systems in a Python environment. I'm sure that many would support it.
Three developers are working on MacOS 10.8 and CentOS 5.2 and all three of us
have the same issue on Python 2.7.
In other words, it's not just me.
Sent from my iPhone
On Oct 9, 2013, at 6:53 AM, "Jaques Grobler" <jaquesgrob...@gmail.com> wrote:
I agree with Olivier -
On the latest dev
great - thanks Lars - will prepare a PR
2013/10/9 Lars Buitinck
> 2013/10/8 Peter Prettenhofer :
> > that's a bug - I'll open a ticket for it.
> > A quick fix: call partial_fit instead of fit just before the ``for``
> loop.
>
> Peter, is this due to an optimization that turns coef_ into a
> For
2013/10/8 Peter Prettenhofer :
> that's a bug - I'll open a ticket for it.
> A quick fix: call partial_fit instead of fit just before the ``for`` loop.
Peter, is this due to an optimization that turns coef_ into a
Fortran-ordered array? If so, I don't think we need it any longer with
NumPy 1.7 and
I agree with Olivier -
On the latest dev version, partial_fit is there for me and
I am unable to reproduce this problem:
>>> model = naive_bayes.MultinomialNB()
>>> model.partial_fit(X, Y, classes=[0, 1])
>>> dir(model)
['__abstractmethods__', '__class__', '__delattr__',
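For anyone who wants to run the same check locally, a self-contained sketch follows. The toy count data and the batch size are assumptions; only `MultinomialNB` and `partial_fit` come from the session above.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy non-negative count features and binary labels (assumed data).
rng = np.random.RandomState(0)
X = rng.randint(0, 5, size=(100, 10))
Y = rng.randint(0, 2, size=100)

model = MultinomialNB()
print(hasattr(model, "partial_fit"))

# The full list of classes has to be passed on the first call;
# later calls can omit it.
for start in range(0, X.shape[0], 25):
    batch = slice(start, start + 25)
    if start == 0:
        model.partial_fit(X[batch], Y[batch], classes=[0, 1])
    else:
        model.partial_fit(X[batch], Y[batch])

print(model.predict(X[:5]))
```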