date:20130512

Re: [Scikit-learn-general] Out of memory when running silhouette score function

2013-05-12 Thread Gael Varoquaux

On Sun, May 12, 2013 at 01:35:07PM +0200, Alexandre ABRAHAM wrote: > I know that the first purpose of scikit is not to handle big data but > would you be interested in a PR of my silhouette block implementation? +1 for PR. I think that I would introduce a keyword argument to switch between the 2

Re: [Scikit-learn-general] Out of memory when running silhouette score function

2013-05-12 Thread Matthieu Brucher

Hi, I guess Theano is a big dependency. I for one do not consider GPUs ready for heavy numerical processes. Those that are _massively_ data parallel may be parallelized, but task parallelism is badly suited for GPUs. And the way Alexandre parallelized the computation is more task- than data-paralleli

Re: [Scikit-learn-general] Out of memory when running silhouette score function

2013-05-12 Thread Ronnie Ghose

Theano for the parallelization? From what I understand, your PR uses on-the-fly computation to reduce memory usage instead of computing everything at once. Wouldn't Theano help? As in, could you perchance 'theano-ize' the parallel calculation? I consider heavy numerical processes to be (at least now) mostly the doma

Re: [Scikit-learn-general] multiprocessing error

2013-05-12 Thread Andreas Mueller

Hi Matthias. Unfortunately joblib doesn't handle large datasets very gracefully at the moment. Have you tried setting the pre_dispatch parameter? Otherwise it could be that all jobs are dispatched even if only two are run. Hth, Andy On 05/12/2013 05:12 PM, Matthias Ekman wrote: Dear all, us
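As a rough illustration of the `pre_dispatch` suggestion above, here is a minimal sketch using joblib's `Parallel` directly. The helpers `make_blocks` and `block_sum` are hypothetical stand-ins for real data loading and real work, and the thread backend is chosen only to keep the example lightweight; whether this setting would cure the error reported in this thread is not something the snippet can confirm.

```python
from joblib import Parallel, delayed


def block_sum(block):
    # Hypothetical stand-in for an expensive computation on one data block.
    return sum(block)


def make_blocks(n_blocks, block_size):
    # Generator: a block is only built when joblib pulls the next task,
    # so blocks that have not been dispatched yet do not occupy memory.
    for i in range(n_blocks):
        yield [i] * block_size


def run(n_blocks=8, block_size=1000, n_jobs=2):
    tasks = (delayed(block_sum)(block)
             for block in make_blocks(n_blocks, block_size))
    # pre_dispatch="n_jobs" asks joblib to queue about as many tasks as
    # there are workers, instead of consuming the whole generator up front
    # (which is what pre_dispatch="all" does).
    parallel = Parallel(n_jobs=n_jobs, pre_dispatch="n_jobs", prefer="threads")
    return parallel(tasks)


results = run()
print(results)
```

Estimators and helpers in scikit-learn that expose a `pre_dispatch` argument pass it on to joblib, so the same string values apply there.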

Re: [Scikit-learn-general] Out of memory when running silhouette score function

2013-05-12 Thread Alexandre ABRAHAM

Hi Ronnie, I have never used Theano, could you be a little more specific? What do you want to compute? What is your input data? Basically, all these metrics are independent of scikit and take numpy arrays as input, so you can use them with any data in this format. Now, if you want to integ

[Scikit-learn-general] multiprocessing error

2013-05-12 Thread Matthias Ekman

Dear all, using sklearn 0.13 (fresh Ubuntu 12.04 installation), I'm getting the error below, which I believe is a memory error. What strikes me is that I'm using a machine with 512GB of RAM, so that shouldn't be happening. Is there maybe a system setting that limits the amount of RAM on a user bas

Re: [Scikit-learn-general] Out of memory when running silhouette score function

2013-05-12 Thread Ronnie Ghose

uhhh +1. any chance of using theano with it? On Sun, May 12, 2013 at 7:35 AM, Alexandre ABRAHAM < abraham.alexan...@gmail.com> wrote: > Hey scikit people, > > I know that the first purpose of scikit is not to handle big data but > would you be interested in a PR of my silhouette block implementa

Re: [Scikit-learn-general] Out of memory when running silhouette score function

2013-05-12 Thread Alexandre ABRAHAM

Hey scikit people, I know that the first purpose of scikit is not to handle big data but would you be interested in a PR of my silhouette block implementation? My benchmarks have shown that it is a bit slower than the scikit one when data is small but it divides memory usage by n_cluster ^ 2. Plus i
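To make the idea of a block-wise silhouette concrete, here is a minimal sketch that computes distances one pair of clusters at a time instead of building the full n_samples x n_samples matrix. It is not the implementation proposed in the message above, which is not shown in this archive; the function names `block_distances` and `silhouette_samples_block` are made up for illustration, Euclidean distance is assumed, and samples in single-point clusters are given a score of 0 by convention.

```python
import numpy as np


def block_distances(A, B):
    # Euclidean distances between every row of A and every row of B.
    # Note: this simple form allocates a temporary of shape (len(A), len(B), n_features).
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def silhouette_samples_block(X, labels):
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    if clusters.size < 2:
        raise ValueError("need at least two clusters")
    n = X.shape[0]
    a = np.zeros(n)               # mean distance to own cluster
    b = np.full(n, np.inf)        # mean distance to nearest other cluster
    singleton = np.zeros(n, dtype=bool)
    for ci in clusters:
        mask_i = labels == ci
        Xi = X[mask_i]
        n_i = Xi.shape[0]
        if n_i > 1:
            # Exclude the sample itself: its self-distance is 0, so divide by n_i - 1.
            a[mask_i] = block_distances(Xi, Xi).sum(axis=1) / (n_i - 1)
        else:
            singleton[mask_i] = True
        for cj in clusters:
            if cj == ci:
                continue
            # Only one cluster-pair block of distances is alive at a time.
            mean_d = block_distances(Xi, X[labels == cj]).mean(axis=1)
            b[mask_i] = np.minimum(b[mask_i], mean_d)
    with np.errstate(invalid="ignore"):
        s = (b - a) / np.maximum(a, b)
    s = np.nan_to_num(s)          # 0/0 (all points identical) becomes 0
    s[singleton] = 0.0
    return s


X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels_demo = np.array([0, 0, 1, 1])
scores = silhouette_samples_block(X_demo, labels_demo)
print(scores.mean())
```

With k clusters of roughly equal size, each distance block holds about (n / k)^2 entries rather than n^2, which is in line with the n_cluster ^ 2 memory reduction mentioned in the message.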


8 matches
