Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread Jan Hendrik Metzen
Such a predict_proba_at() method would also make sense for Gaussian process regression. Currently, computing probability densities for GPs requires predicting mean and standard deviation (via "MSE") at X and using scipy.stats.norm.pdf to compute probability densities for y for the predicted mea
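The two-step computation described in this message can be sketched as follows. This is a minimal illustration using only the standard library, not the scikit-learn API: the function name `gaussian_density` and the numbers are made up, and the predicted "MSE" is assumed to be the predictive variance, so its square root is the standard deviation. `scipy.stats.norm.pdf(y, loc=mean, scale=std)` computes the same quantity.

```python
import math

def gaussian_density(y, mean, mse):
    """Density of y under a normal distribution with the given mean and variance.

    Hypothetical helper: `mse` is assumed to be the predictive variance.
    """
    std = math.sqrt(mse)
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Made-up predictions at three query points (not output of a real model).
means = [0.0, 1.5, -0.3]
mses = [1.0, 0.25, 4.0]
ys = [0.0, 1.0, 1.0]

densities = [gaussian_density(y, m, v) for y, m, v in zip(ys, means, mses)]
```

A `predict_proba_at()`-style method would wrap these two steps so the user does not have to combine the mean, the variance and the density formula by hand.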

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread Mathieu Blondel
Regarding predictions, I don't really see what's the problem. Using GLMs as an example, you just need to do

    def predict(self, X):
        if self.loss == "poisson":
            return np.exp(np.dot(X, self.coef_))
        else:
            return np.dot(X, self.coef_)

A nice thing about Poisson regression is tha

Re: [Scikit-learn-general] Contribution

2015-07-28 Thread Gael Varoquaux
On Tue, Jul 28, 2015 at 08:59:15PM +0200, bthirion wrote: > Note that Jacob Schreiber is already working on it. > https://github.com/scikit-learn/scikit-learn/pull/5010 Indeed, and he has a few other PRs going in this direction. You should interact on the PRs with him. ---

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread Andreas Mueller
I was expecting there to be the actual poisson loss implemented in the class, not just a log transform. On 07/28/2015 02:03 PM, josef.p...@gmail.com wrote: Just a comment from the statistics sidelines taking log of target and fitting a linear or other model doesn't make it into a Poisson mod
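For reference, an "actual" Poisson loss is usually taken to be the negative log-likelihood of the Poisson distribution. The sketch below assumes a log link, meaning the model's raw output is the log of the expected count. The function names are hypothetical and this is not code from the contribution under discussion.

```python
import math

def poisson_nll(y, raw_score):
    """Negative log-likelihood of a count y when raw_score = log(expected count)."""
    mu = math.exp(raw_score)
    # -log P(y | mu) = mu - y*log(mu) + log(y!), and log(y!) = lgamma(y + 1)
    return mu - y * raw_score + math.lgamma(y + 1.0)

def poisson_nll_gradient(y, raw_score):
    """Derivative of poisson_nll with respect to raw_score."""
    return math.exp(raw_score) - y

y = 3
best_raw = math.log(y)  # the loss is minimized when exp(raw_score) == y
grad_at_best = poisson_nll_gradient(y, best_raw)
loss_at_best = poisson_nll(y, best_raw)
loss_elsewhere = poisson_nll(y, 0.0)
```

The gradient `exp(raw_score) - y` is the quantity a gradient-based learner would fit against, which differs from the residual of a squared-error fit on log-transformed targets.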

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread josef.pktd
Just a comment from the statistics sidelines: taking the log of the target and fitting a linear or other model doesn't make it a Poisson model. But maybe "Poisson loss" in machine learning is unrelated to the Poisson distribution or a Poisson model with E(y | x) = exp(x beta)? Josef On Tue, Jul 2
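The distinction can be seen in the simplest case, an intercept-only model. The sketch below uses made-up counts: the maximum-likelihood Poisson fit recovers the sample mean, while least squares on log(y) followed by exp() recovers the geometric mean, which is never larger. Note also that log(y) is undefined when a count is zero.

```python
import math

counts = [1, 2, 4, 9]  # made-up counts; all positive so log() is defined

# Intercept-only Poisson model: the maximum-likelihood mean is the sample mean.
poisson_mean = sum(counts) / len(counts)

# Least squares on log(y), then exponentiate: this yields the geometric mean.
log_fit = sum(math.log(c) for c in counts) / len(counts)
back_transformed = math.exp(log_fit)
```

Here `poisson_mean` estimates E(y), whereas `back_transformed` estimates exp(E(log y)), so the two approaches target different quantities.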

Re: [Scikit-learn-general] Contribution

2015-07-28 Thread bthirion
On 28/07/2015 20:11, Andreas Mueller wrote: What do you mean by "you need topics which are to be implemented from scratch"? Need for what? Do you want to help the project or do you want to implement an algorithm? If you like you could try to improve the speed of the gradient boosting trees. I

Re: [Scikit-learn-general] Contribution

2015-07-28 Thread Andreas Mueller
Have a look at https://github.com/scikit-learn/scikit-learn/pull/5041 btw. On 07/28/2015 01:36 PM, Sreenivas Raghavan wrote: Thank you for the idea. I will start right away. On Tue, Jul 28, 2015 at 11:41 PM, Andreas Mueller wrote: What do you mean by "you need t

Re: [Scikit-learn-general] Big Data Mining

2015-07-28 Thread Andreas Mueller
My personal recommendation is to consider other options if your data is >1 TB, but I think it highly depends on your application. Gael and Olivier, you use it also for larger data, right? On 07/24/2015 03:25 AM, Gael Varoquaux wrote: > Because this is a question that comes up often, I have tried to give

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-28 Thread Andreas Mueller
I'd be happy with adding Poisson loss to more models, though I think it would be more natural to first add it to GLM before GBM ;) If the addition is straightforward, I think it would be a nice contribution nevertheless. 1) for the user to do np.exp(gbmpoisson.predict(X)) is not acceptable. Th

Re: [Scikit-learn-general] Problem with user-defined kernel in SVM

2015-07-28 Thread Andreas Mueller
For reference, it was answered there: http://stackoverflow.com/questions/31599624/user-defined-svm-kernel-with-scikit-learn On 07/23/2015 07:33 PM, Vincent Leclère wrote: Hello everybody, I'm running into trouble with a user-defined kernel in sklearn.svm. Here is a minimal example

Re: [Scikit-learn-general] Contribution

2015-07-28 Thread Sreenivas Raghavan
Thank you for the idea. I will start right away. On Tue, Jul 28, 2015 at 11:41 PM, Andreas Mueller wrote: > What do you mean by "you need topics which are to be implemented from > scratch"? > Need for what? Do you want to help the project or do you want to implement > an algorithm? > > If you l

Re: [Scikit-learn-general] About contributing code

2015-07-28 Thread Sreenivas Raghavan
I am sorry, but I posted on the issue 3 days ago and didn't get a reply. I could not tell whether the issue had been resolved or abandoned. On Tue, Jul 28, 2015 at 11:24 PM, Andreas Mueller wrote: > You posted on the ML yesterday. This is mostly a volunteer-run project. > I try to

Re: [Scikit-learn-general] Code contribution: Supervised PCA

2015-07-28 Thread Andreas Mueller
Hi Stylianos. Can you give a bit more background on the model? It seems fairly well-cited but I haven't really seen it in practice. Is it still state of the art? The main purpose seems to be a particular type of regularization, right, not supervised dimensionality reduction? How does this compar

Re: [Scikit-learn-general] Contribution

2015-07-28 Thread Andreas Mueller
What do you mean by "you need topics which are to be implemented from scratch"? Need for what? Do you want to help the project or do you want to implement an algorithm? If you like you could try to improve the speed of the gradient boosting trees. I think this is a worthwhile but non-trivial u

Re: [Scikit-learn-general] does sklearn rbm scale well with sparse high dimensional features

2015-07-28 Thread Andreas Mueller
Have a look at Russ Salakhutdinov's thesis for work on density modelling. The problem is that it is impossible to compute the partition function, and therefore you can only get unnormalized densities. On 07/27/2015 12:49 PM, Mika S wrote: Thanks, this is helpful. I have seen RBMs only in pret
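To make the partition-function remark concrete, here is a sketch of a toy RBM with made-up parameters (3 visible and 2 hidden binary units; none of this is the scikit-learn implementation). The unnormalized density exp(-F(v)) is cheap to evaluate, but normalizing it requires a sum over all 2^n visible states, which is only feasible because n is tiny here.

```python
import itertools
import math

# Made-up parameters for a tiny RBM with 3 visible and 2 hidden binary units.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]  # W[i][j]: visible i, hidden j
b = [0.0, 0.1, -0.1]  # visible biases
c = [0.2, -0.3]       # hidden biases

def free_energy(v):
    """F(v) such that the unnormalized density of v is exp(-F(v))."""
    visible_term = sum(b_i * v_i for b_i, v_i in zip(b, v))
    hidden_term = 0.0
    for j, c_j in enumerate(c):
        activation = c_j + sum(v[i] * W[i][j] for i in range(len(v)))
        hidden_term += math.log1p(math.exp(activation))
    return -visible_term - hidden_term

# Brute-force partition function: enumerates all 2**3 = 8 visible states.
states = list(itertools.product([0, 1], repeat=len(b)))
unnormalized = {v: math.exp(-free_energy(v)) for v in states}
Z = sum(unnormalized.values())
probabilities = {v: u / Z for v, u in unnormalized.items()}
```

With thousands of visible units the enumeration over `states` is out of reach, which is why only unnormalized densities are available in practice.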

Re: [Scikit-learn-general] About contributing code

2015-07-28 Thread Andreas Mueller
You posted on the ML yesterday. This is mostly a volunteer-run project. I try to respond to everything within at least a day, but I was travelling the last weeks. If you feel a day is not an acceptable response time, I am afraid you will not be happy with this project. On 07/28/2015 12:52 P

Re: [Scikit-learn-general] About contributing code

2015-07-28 Thread Sreenivas Raghavan
Hi, I am new to scikit-learn. I am interested in contributing either by introducing some new algorithms or by optimizing existing ones. I have gone through the issue tracker and found a few interesting topics to work on, like NCA and NMF. I have posted my question as what is the status of the work a

Re: [Scikit-learn-general] updating a model

2015-07-28 Thread Ady Wahyudi Paundu
Hi all, I found this paper on adaptive one-class SVM http://www.tsc.uc3m.es/~vanessa/publicaciones/articulos_revista/TSP_2011.pdf ~Ady

Re: [Scikit-learn-general] About contributing code

2015-07-28 Thread Andreas Mueller
Hi Gryllos. Before contributing a new feature (which is usually a major undertaking) it is usually a good idea to get started working on known issues; have a look at the issue tracker. I'm not familiar with the feature line approach. Can you elaborate and provide a reference? Please see the

Re: [Scikit-learn-general] Added sample_weight to RFECV.fit but not sure how to test the change

2015-07-28 Thread Andy
Hi Dale. Please keep all discussions on the mailing list as not everybody might have the time to reply. The default should be class_weight=1 for each class, so dropping half of the samples in one class should reduce the weight for that class to 0.5. This only works for removing duplicate data points (droppi
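The duplicate-point argument suggests one general way to check a sample_weight implementation: a weighted loss on deduplicated data should match the unweighted loss on the data with duplicates. The sketch below illustrates that idea with made-up numbers and a hypothetical helper; it is not the RFECV test itself.

```python
def weighted_squared_error(y_true, y_pred, sample_weight):
    """Sum of squared errors, each term scaled by its sample weight."""
    return sum(w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, sample_weight))

# Made-up data in which the first sample appears twice, all weights equal to 1.
loss_duplicated = weighted_squared_error(
    [1.0, 1.0, 0.0], [0.8, 0.8, 0.3], [1.0, 1.0, 1.0]
)

# Same data with the duplicate dropped and its weight doubled instead.
loss_weighted = weighted_squared_error([1.0, 0.0], [0.8, 0.3], [2.0, 1.0])
```

The same comparison can be applied to a fitted estimator: fit once on duplicated data and once on deduplicated data with adjusted weights, then compare the results.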