Re: [Scikit-learn-general] Code contribution: Supervised PCA

2015-07-29 Thread Sebastian Raschka
Out of curiosity, how does supervised PCA compare to LDA (Linear Discriminant Analysis)? In a nutshell, what would be the main difference?

Best,
Sebastian

> On Jul 29, 2015, at 5:41 PM, Stylianos Kampakis wrote:
>
> Hi Andreas,
>
> Sure. Actually, the purpose of the model is both regulariz

Re: [Scikit-learn-general] Code contribution: Supervised PCA

2015-07-29 Thread Andreas Mueller
Indeed, it sounds interesting, but I'd still be curious as to how it compares against elasticnet.

On 07/29/2015 05:41 PM, Stylianos Kampakis wrote:
> Hi Andreas,
>
> Sure. Actually, the purpose of the model is both regularization and dimensionality reduction for problems where the number of features

Re: [Scikit-learn-general] Code contribution: Supervised PCA

2015-07-29 Thread Stylianos Kampakis
Hi Andreas, Sure. Actually, the purpose of the model is both regularization and dimensionality reduction for problems where the number of features can be larger than the number of instances (or in any case when there is a large number of features). It is particularly effective when there are lots
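
For readers unfamiliar with the idea, one common formulation of supervised PCA (screen features by their univariate association with the target, then run ordinary PCA on the survivors) can be sketched as follows. This is an assumption about the general technique, not the code proposed in this thread; the function names, the correlation-based screening, and the `n_keep` and `n_components` parameters are all hypothetical.

```python
import numpy as np


def supervised_pca_fit(X, y, n_keep=10, n_components=2):
    """Hypothetical sketch: screen features by |correlation| with y, then PCA."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center the features and the target.
    mean = X.mean(axis=0)
    Xc = X - mean
    yc = y - y.mean()

    # Univariate score: absolute Pearson correlation of each feature with y.
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    denom[denom == 0] = np.inf  # constant columns (or constant y) score 0
    scores = np.abs(Xc.T @ yc) / denom

    # Keep the n_keep highest-scoring features (indices returned sorted).
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])

    # Ordinary PCA on the retained columns, computed via SVD.
    _, _, vt = np.linalg.svd(Xc[:, keep], full_matrices=False)
    components = vt[:n_components]
    return mean, keep, components


def supervised_pca_transform(X, mean, keep, components):
    """Project new data onto the components learned from the kept features."""
    Xc = np.asarray(X, dtype=float) - mean
    return Xc[:, keep] @ components.T
```

Because the screening step discards most columns before the decomposition, the sketch still works when the number of features exceeds the number of instances, which is the setting described above. The projected data would then be passed to any ordinary regressor or classifier.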

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-29 Thread Andreas Mueller
Hm, I'm not entirely sure how score_samples is currently used, but I think it is the probability under a density model. It would "only" change the meaning insofar as it is a conditional distribution over y given x rather than a distribution over x. I'm not totally opposed to adding a new method, though I'm not sure I

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-29 Thread Jan Hendrik Metzen
I am not sure about the name; score_samples would sound a bit strange for a conditional probability, in my opinion. And likelihood is also misleading, since it's actually a conditional probability and not a conditional likelihood (the quantities on the right-hand side of conditioning are fixed and

Re: [Scikit-learn-general] Possible code contribution (Poisson loss)

2015-07-29 Thread Andreas Mueller
Shouldn't that be "score_samples"? Well, it is a conditional likelihood p(y|x), not p(x) or p(x, y). But it is the likelihood of some data given the model.

On 07/29/2015 02:58 AM, Jan Hendrik Metzen wrote:
> Such a predict_proba_at() method would also make sense for Gaussian
> process regression.
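
To make the quantity under discussion concrete, here is a minimal sketch of a per-sample conditional log-probability log p(y|x) for a Poisson model, where the model's prediction for x is the Poisson mean. The function names are hypothetical and do not correspond to any existing or agreed-upon scikit-learn method; the naming question is exactly what the thread leaves open.

```python
import math


def poisson_log_prob(y, mu):
    """log p(y | mu) for a Poisson distribution with mean mu > 0."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    if y < 0 or y != int(y):
        raise ValueError("y must be a non-negative integer count")
    # log p(y | mu) = y * log(mu) - mu - log(y!)
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def conditional_log_prob(y_true, mu_pred):
    """Per-sample log p(y_i | x_i), given the predicted means mu_i = f(x_i)."""
    return [poisson_log_prob(y, mu) for y, mu in zip(y_true, mu_pred)]
```

Unlike a density model's p(x), each value here depends on both the observed count y and the mean predicted from x, which is why a name borrowed from density estimation may not fit cleanly.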