Re: [Scikit-learn-general] NIPS's Machine Learning Open Source Software workshop

2013-09-25 Thread Gael Varoquaux
On Mon, Sep 16, 2013 at 09:19:23AM +0200, Gilles Louppe wrote: > So basically, do we agree that the goal of our proposal to this > workshop will only be to further promote the project in the scientific > community? Sounds good to me :) > Besides that, I like the idea of showing examples to highli

[Scikit-learn-general] Fuzzy (K or C) Means algorithm in sklearn?

2013-09-25 Thread Kyle Kastner
I looked around, but was unable to find a fuzzy clustering algorithm in sklearn (something analogous to fcm in Matlab) - I did however find an old gist https://gist.github.com/mblondel/1451300 . Is this in sklearn already in some form and I am just missing it? Kyle ---

[Scikit-learn-general] EMNLP?

2013-09-25 Thread Fred Mailhot
Hi list, Just wondering whether anyone on here is planning on attending EMNLP. I'll be there, and as a heavy user (and hopeful eventual contributor), I'd love to meet with some of you. Fred. -- October Webinars: Code for

Re: [Scikit-learn-general] Right place for a time-series focused algorithm?

2013-09-25 Thread Peter Prettenhofer
Hi Kyle, personally, I'd love to see SAX in sklearn or any other python library that I could easily use with sklearn. We don't have any time-series specific functionality yet (eg. lagged features transformer). So if we choose to add time-series functionality we should also consider the basics. Le
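A minimal sketch of what a "lagged features" step could look like, in plain Python. The function name `lagged_features` and its signature are hypothetical and not part of sklearn; it only illustrates turning a series into rows of preceding values plus a target.

```python
def lagged_features(series, n_lags):
    """Return (X, y): each row of X holds the n_lags values that precede y.

    Hypothetical helper for illustration only; not an sklearn API.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(list(series[t - n_lags:t]))  # the n_lags previous observations
        y.append(series[t])                   # the value to predict
    return X, y


X, y = lagged_features([10, 11, 13, 12, 15], n_lags=2)
```

The resulting `X` and `y` have the usual (samples, features) and (samples,) layout, so they could be passed to any regressor's `fit`.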

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Lars Buitinck
2013/9/25 Gilles Louppe : > I knew I was an outlier ;) I think it's learned that tree huggers are orthogonal to text mongers, with Andreas being, of course, our top-notch NLP guy.

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Gilles Louppe
On 25 September 2013 19:05, Andreas Mueller wrote: > On 09/25/2013 06:44 PM, Olivier Grisel wrote: >> 2013/9/25 Andreas Mueller : >>> On 09/25/2013 04:15 PM, Jacob Vanderplas wrote: Very cool! One quick comment: I'd probably normalize the values in the sparse matrix to 1. As it's w

[Scikit-learn-general] Right place for a time-series focused algorithm?

2013-09-25 Thread Kyle Kastner
I have recently been working with time-series data extensively and looking at different ways to model, classify, and predict different types of time-series. One algorithm I have been playing with is called SAX ( http://www.cs.ucr.edu/~eamonn/SAX.htm). It is a very straightforward algorithm (basica
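For readers unfamiliar with SAX, here is a rough sketch of the usual three steps as commonly described: z-normalize the series, average it over equal-width segments (piecewise aggregate approximation), then map each segment mean to a letter using breakpoints that split a standard normal into equiprobable regions. The function name `sax` and its parameters are assumptions made for illustration, not code from this thread.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series to a SAX word (illustrative sketch).

    Assumes len(series) is a multiple of n_segments and the series is not constant.
    """
    if len(series) % n_segments != 0:
        raise ValueError("len(series) must be a multiple of n_segments")
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("series must not be constant")

    # Step 1: z-normalize so the values have zero mean and unit variance.
    z = [(v - mu) / sigma for v in series]

    # Step 2: piecewise aggregate approximation, one mean per segment.
    width = len(z) // n_segments
    paa = [mean(z[i * width:(i + 1) * width]) for i in range(n_segments)]

    # Step 3: breakpoints cutting N(0, 1) into len(alphabet) equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # A segment's symbol index is the number of breakpoints below its mean.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


rising = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)
falling = sax([8, 7, 6, 5, 4, 3, 2, 1], n_segments=4)
```

Here a steadily rising series maps to an ascending word and a falling one to a descending word, which is the kind of coarse shape information the symbolic representation is meant to keep.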

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Andreas Mueller
On 09/25/2013 06:44 PM, Olivier Grisel wrote: > 2013/9/25 Andreas Mueller : >> On 09/25/2013 04:15 PM, Jacob Vanderplas wrote: >>> Very cool! >>> One quick comment: I'd probably normalize the values in the sparse >>> matrix to 1. As it's written, a user with, say, 1 commit on a file >>> will be co

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Mathieu Blondel
On Wed, Sep 25, 2013 at 6:33 PM, Andreas Mueller wrote: > > soo why is there no PCA embedding and Isomap? ;) > With gravatars would be nice :)

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Olivier Grisel
2013/9/25 Andreas Mueller : > On 09/25/2013 04:15 PM, Jacob Vanderplas wrote: >> Very cool! >> One quick comment: I'd probably normalize the values in the sparse >> matrix to 1. As it's written, a user with, say, 1 commit on a file >> will be considered a closer neighbor to a user with 0 commits o

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Andreas Mueller
On 09/25/2013 04:15 PM, Jacob Vanderplas wrote: > Very cool! > One quick comment: I'd probably normalize the values in the sparse > matrix to 1. As it's written, a user with, say, 1 commit on a file > will be considered a closer neighbor to a user with 0 commits on that > file than to a user wi

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Gael Varoquaux
We need to do an example with that! G

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Gilles Louppe
Indeed, I think normalization is important, depending on what you want to show. Feel free to play with this if you have good ideas. This is merely a quick proof of concept. Also, I would be curious to apply and visualize our new biclustering algorithms on this. On 25 September 2013 16:15, Jacob V

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Jacob Vanderplas
Very cool! One quick comment: I'd probably normalize the values in the sparse matrix to 1. As it's written, a user with, say, 1 commit on a file will be considered a closer neighbor to a user with 0 commits on that file than to a user with 3 commits on that file. Jake On Wed, Sep 25, 2013 at
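The point about raw counts can be checked with a tiny example. The sketch below assumes Euclidean distance and reads "normalize to 1" as binarizing the counts (any nonzero count becomes 1); both are assumptions, and the helper names are hypothetical.

```python
import math


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def binarize(counts):
    """Replace every nonzero commit count with 1 (one reading of 'normalize to 1')."""
    return [1 if c > 0 else 0 for c in counts]


# Hypothetical commit counts of three users on a single file.
zero_commits, one_commit, three_commits = [0], [1], [3]

# With raw counts, the 1-commit user is closer to the 0-commit user.
raw_to_zero = euclidean(one_commit, zero_commits)
raw_to_three = euclidean(one_commit, three_commits)

# After binarizing, the two users who touched the file coincide.
bin_to_zero = euclidean(binarize(one_commit), binarize(zero_commits))
bin_to_three = euclidean(binarize(one_commit), binarize(three_commits))
```

Other normalizations (for example scaling each user's vector to unit length) would behave differently; which one is appropriate depends on what the neighbors are supposed to show.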

Re: [Scikit-learn-general] A question on training models with unbalanced data and testing cases

2013-09-25 Thread Bilal Dadanlar
Hi Ron, The reason why .predict() and .predict_proba() don't agree lies in the method (Platt scaling) by which the probability values are generated. You can have a look at my answer here: http://stackoverflow.com/questions/17017882/scikit-learn-predict-proba-gives-wrong-answers/17142391#17142391 if
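A small numeric sketch of how such a disagreement can arise. Platt scaling is commonly written as p = 1 / (1 + exp(A*f + B)), where f is the decision value and A, B are fitted separately; the values of A, B, and f below are made up for illustration and do not come from any fitted model.

```python
import math


def platt_probability(decision_value, A, B):
    """Sigmoid mapping from a decision value to a probability (Platt scaling form)."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))


# Hypothetical sigmoid parameters; in practice they are estimated from data.
A, B = -1.5, 0.6

f = 0.2  # hypothetical decision value, slightly on the positive side

# A label taken from the sign of the decision value says "positive".
label_from_decision = 1 if f > 0 else 0

# A label taken from thresholding the calibrated probability at 0.5 says "negative",
# because a nonzero B shifts the point where the probability crosses 0.5.
prob_positive = platt_probability(f, A, B)
label_from_probability = 1 if prob_positive > 0.5 else 0
```

Whenever B is nonzero, the probability crosses 0.5 at f = -B/A rather than at f = 0, so samples between those two points get different answers from the two rules.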

Re: [Scikit-learn-general] Multiclass Logistic Regression.

2013-09-25 Thread Luca Cerone
On 25 September 2013 13:55, Olivier Grisel wrote: > 2013/9/25 Luca Cerone : > >> > (this is not explained in the user guide > >> > > >> > > http://scikit-learn.org/stable/modules/linear_model.html#logistic-regression > , > >> > though). > >> > >> All our classifiers support multiclass classificat

Re: [Scikit-learn-general] Checksum on 0.14.1: altered since release on 2013-08-08 ???

2013-09-25 Thread Olivier Grisel
2013/9/25 Lars Buitinck : > 2013/9/25 Olivier Grisel : >> I don't remember who first did the upload of the 0.14.1 release but it >> is indeed very possible that there was a glitch in the release process >> and the first upload was messed up and that an override was required >> quickly after the fir

Re: [Scikit-learn-general] Multiclass Logistic Regression.

2013-09-25 Thread Olivier Grisel
2013/9/25 Luca Cerone : >> > (this is not explained in the user guide >> > >> > http://scikit-learn.org/stable/modules/linear_model.html#logistic-regression, >> > though). >> >> All our classifiers support multiclass classification and this is >> documented in various places. > > > I am sorry, but

Re: [Scikit-learn-general] Multiclass Logistic Regression.

2013-09-25 Thread Vlad Niculae
>> There are still a few things that are not clear to me from the >> documentation. Can you customize the classifier to perform a different >> decision function? > > You can subclass it and override the decision_function method. While true, this can be misleading. You're just changing the final st

Re: [Scikit-learn-general] Checksum on 0.14.1: altered since release on 2013-08-08 ???

2013-09-25 Thread Lars Buitinck
2013/9/25 Olivier Grisel : > I don't remember who first did the upload of the 0.14.1 release but it > is indeed very possible that there was a glitch in the release process > and the first upload was messed up and that an override was required > quickly after the first upload. Wasn't that the Reut

Re: [Scikit-learn-general] Multiclass Logistic Regression.

2013-09-25 Thread Lars Buitinck
2013/9/25 Luca Cerone : > I am sorry, but I went into the user documentation for logistic regression > and multiclass classification and didn't find any information about it Hm, maybe we should put this in a more prominent place like the tutorial. I'll check the docs if I have time. > for the pen

Re: [Scikit-learn-general] Multiclass Logistic Regression.

2013-09-25 Thread Luca Cerone
> > > (this is not explained in the user guide > > > http://scikit-learn.org/stable/modules/linear_model.html#logistic-regression > , > > though). > > All our classifiers support multiclass classification and this is > documented in various places. > I am sorry, but I went into the user documentat

Re: [Scikit-learn-general] Which scikit-learn contributors share common interests?

2013-09-25 Thread Gilles Louppe
Hi, I have just put together a quick and dirty script that does that. It extracts the number of commits for all developers, for all files on a git directory. It then computes the 3 nearest neighbors for all contributors. See the gist below for code and output. https://gist.github.com/glouppe/6698
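The general idea can be sketched in a few lines of plain Python. This is not the code from the gist: the contributor names, file names, and counts below are invented, the distance is assumed to be Euclidean, and k is set to 1 only because the toy data has four contributors.

```python
import math

# Hypothetical commit counts per contributor and file (not real project data).
commits = {
    "alice": {"tree.py": 12, "forest.py": 8},
    "bob": {"tree.py": 9, "forest.py": 10, "text.py": 1},
    "carol": {"text.py": 15, "hashing.py": 4},
    "dave": {"text.py": 11, "hashing.py": 6, "tree.py": 1},
}

# Fixed file order so every contributor maps to a vector of the same length.
files = sorted({name for counts in commits.values() for name in counts})


def vector(dev):
    """Commit-count vector of one contributor, with 0 for untouched files."""
    return [commits[dev].get(name, 0) for name in files]


def distance(a, b):
    """Euclidean distance between two contributors' count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(vector(a), vector(b))))


def nearest(dev, k):
    """The k contributors whose commit profiles are closest to dev's."""
    others = [d for d in commits if d != dev]
    return sorted(others, key=lambda d: distance(dev, d))[:k]


neighbors = {dev: nearest(dev, k=1) for dev in commits}
```

On this toy data the two "tree" contributors pair up and the two "text" contributors pair up.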

Re: [Scikit-learn-general] Multiclass Logistic Regression.

2013-09-25 Thread Lars Buitinck
2013/9/25 Luca Cerone : > This morning I checked the source for LogisticRegression in > sklearn/linear_model/logistic.py and realized that by default it performs > multiclass classification > (this is not explained in the user guide > http://scikit-learn.org/stable/modules/linear_model.html#logisti

Re: [Scikit-learn-general] Multiclass Logistic Regression.

2013-09-25 Thread Luca Cerone
Dear Olivier, thanks for your reply. On 25 September 2013 10:39, Olivier Grisel wrote: > LogisticRegression is already a multiclass classifier, using > the One vs Rest / All strategy by default (as implemented internally > by liblinear which LogisticRegression is a wrapper of). So you

Re: [Scikit-learn-general] Multiclass Logistic Regression.

2013-09-25 Thread Olivier Grisel
LogisticRegression is already a multiclass classifier, using the One vs Rest / All strategy by default (as implemented internally by liblinear which LogisticRegression is a wrapper of). So you don't need to use OneVsRest in this case. If you want more info on multiclass reductions here i
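To make the One vs Rest idea concrete, here is a sketch of the prediction rule only: one binary scorer per class, with the highest score winning. The weights and intercepts are hand-picked for illustration and are not what liblinear or LogisticRegression would fit; the names `scorers`, `score`, and `predict_one_vs_rest` are hypothetical.

```python
# Hypothetical per-class linear scorers: label -> (weights, intercept).
# Each one is meant to answer "this class versus all the others".
scorers = {
    "a": ((-1.0, 0.0), 2.5),   # high when the first feature is small
    "b": ((0.0, 1.0), -3.0),   # high when the second feature is large
    "c": ((1.0, 0.0), -7.5),   # high when the first feature is large
}


def score(point, weights, intercept):
    """Linear decision value of one binary scorer for one point."""
    return sum(w * x for w, x in zip(weights, point)) + intercept


def predict_one_vs_rest(point):
    """Pick the class whose binary scorer gives the largest decision value."""
    return max(scorers, key=lambda label: score(point, *scorers[label]))


predictions = [predict_one_vs_rest(p) for p in [(0, 0), (5, 5), (10, 1)]]
```

When the estimator already applies this kind of reduction internally, wrapping it in another One vs Rest layer adds nothing.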

Re: [Scikit-learn-general] Checksum on 0.14.1: altered since release on 2013-08-08 ???

2013-09-25 Thread Olivier Grisel
I don't remember who first did the upload of the 0.14.1 release but it is indeed very possible that there was a glitch in the release process and the first upload was messed up and that an override was required quickly after the first upload. pypi gives me 503 currently so that I cannot download t