Re: [Scikit-learn-general] MiniBatchKmeans crashes

2013-11-24 Thread Olivier Grisel
Thanks for the reproduction case. Could you please open a new issue on github? -- Olivier -- Shape the Mobile Experience: Free Subscription Software experts and developers: Be at the forefront of tech innovation. Intel(R

Re: [Scikit-learn-general] MiniBatchKmeans crashes

2013-11-24 Thread Douwe Kiela
On Sun, Nov 24, 2013 at 4:22 PM, Olivier Grisel wrote: > Could you please try to pull the branch from this pull request and > check that it fixes your issue: > > https://github.com/scikit-learn/scikit-learn/pull/2355 I'm afraid it doesn't. With that PR branch I get the same error. Here is some c

Re: [Scikit-learn-general] PR about topic models

2013-11-24 Thread chyi-kwei yau
Got it. I will ask him if he can relicense the code to BSD or MIT. On Sun, Nov 24, 2013 at 12:06 PM, Gael Varoquaux wrote: > On Sun, Nov 24, 2013 at 12:05:07PM -0500, chyi-kwei yau wrote: >> Sure, that's a possibility. There's already a version of the code in >> the gensim project under the LGP

Re: [Scikit-learn-general] PR about topic models

2013-11-24 Thread Gael Varoquaux
On Sun, Nov 24, 2013 at 12:05:07PM -0500, chyi-kwei yau wrote: > Sure, that's a possibility. There's already a version of the code in > the gensim project under the LGPL license; let me know if that's still > too restrictive and I'll send you a more BSD-friendly version. The guy is really cool. That

Re: [Scikit-learn-general] PR about topic models

2013-11-24 Thread chyi-kwei yau
Hi guys, I just got a response from Matt Hoffman about relicensing his onlineLDA code. Here is his response: -- Hi Chyi-Kwei, Sure, that's a possibility. There's already a version of the code in the gensim project under the LGPL license; let me know if that's still too restrictive and I'll

Re: [Scikit-learn-general] MiniBatchKmeans crashes

2013-11-24 Thread Olivier Grisel
Hi Douwe, Could you please try to pull the branch from this pull request and check that it fixes your issue: https://github.com/scikit-learn/scikit-learn/pull/2355 Best, -- Olivier Grisel

[Scikit-learn-general] Cleaning/feature extraction of e-mail messages

2013-11-24 Thread Florian Lindner
Hello, I want to use scikit-learn for mail classification (no spam detection). I haven't really worked with machine learning software (besides end-user spam filters). What I have done so far: vectorizer = TfidfVectorizer(input='filename', preprocessor=mail_preprocessor, decode_error="ignore") X
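
The message above is cut off before it shows what mail_preprocessor does, so the following is only a minimal sketch of what such a function might look like, not the poster's actual code. It assumes the preprocessor receives the already-decoded text of one mail file (which is how a callable passed as preprocessor is used when input='filename' is set) and that only the subject and the plain-text body are wanted. The choice of fields and the lowercasing are assumptions for illustration.

```python
import email
from email import policy


def mail_preprocessor(raw_text):
    """Hypothetical preprocessor: reduce a raw e-mail to subject + plain-text body.

    Assumption: raw_text is the full decoded message (headers and body) as one string.
    """
    # Parse with the modern policy so we get an EmailMessage with get_body().
    msg = email.message_from_string(raw_text, policy=policy.default)

    # Keep the subject line; all other headers (From, To, Received, ...) are dropped.
    subject = msg.get("Subject", "")

    # Pick the text/plain part; HTML-only mails yield None and contribute no body here.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""

    # Lowercase explicitly, since a custom preprocessor replaces the default one.
    return f"{subject}\n{body}".lower()
```

A function like this could then be passed as preprocessor=mail_preprocessor in the TfidfVectorizer call quoted above. Real mailboxes tend to contain malformed messages and unknown charsets, so a production version would likely need error handling that this sketch omits.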