Re: [Scikit-learn-general] Flexible Naive Bayes

2014-06-11 Thread Lars Buitinck
2014-06-11 18:16 GMT+02:00 Gavin Gray : > Yeah, you'd have to hand in a vector listing which distribution to use for > each element in the feature vector. Weka might have a way round this, but > I'll have to try using it to see what the interface is like. They reference > a paper that estimates th

Re: [Scikit-learn-general] Flexible Naive Bayes

2014-06-11 Thread Gavin Gray
Yeah, you'd have to hand in a vector listing which distribution to use for each element in the feature vector. Weka might have a way round this, but I'll have to try using it to see what the interface is like. They reference a paper that estimates the distribution of each feature using KDE: http://
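The idea of handing in a vector that names a distribution per feature can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `fit_mixed_nb` and `predict_mixed_nb` and the `kinds` list are hypothetical, not part of scikit-learn or Weka, and numerical features are modelled with a single Gaussian per class rather than the KDE mentioned in the message.

```python
import math
from collections import Counter, defaultdict


def fit_mixed_nb(X, y, kinds):
    """Fit a naive Bayes model on mixed data.

    kinds[j] is "gaussian" or "categorical" for feature j
    (a hypothetical interface, chosen for this sketch).
    """
    # Number of distinct values per column, used for Laplace smoothing
    # of categorical features (ignored for Gaussian ones).
    n_cats = [len({row[j] for row in X}) for j in range(len(kinds))]

    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    classes = {}
    for label, rows in by_class.items():
        params = []
        for j, kind in enumerate(kinds):
            col = [r[j] for r in rows]
            if kind == "gaussian":
                mean = sum(col) / len(col)
                # Small constant added to avoid zero variance (an assumption).
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                params.append((mean, var))
            else:
                params.append((Counter(col), len(col)))
        log_prior = math.log(len(rows) / len(X))
        classes[label] = (log_prior, params)
    return {"kinds": list(kinds), "n_cats": n_cats, "classes": classes}


def predict_mixed_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, params) in model["classes"].items():
        score = log_prior
        for j, kind in enumerate(model["kinds"]):
            if kind == "gaussian":
                mean, var = params[j]
                score += (-0.5 * math.log(2 * math.pi * var)
                          - (row[j] - mean) ** 2 / (2 * var))
            else:
                counts, total = params[j]
                # Laplace (add-one) smoothing over the observed categories.
                score += math.log((counts[row[j]] + 1)
                                  / (total + model["n_cats"][j]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Because naive Bayes treats features as conditionally independent, each feature contributes its own log-likelihood term, so mixing distributions only requires choosing the right term per column.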

[Scikit-learn-general] Feature weighting in Nearest Neighbor Regression

2014-06-11 Thread George Bezerra
Hi, I was wondering if KNN weighs features differently when doing regression and, if not, would it be possible to do so? I would like to find the feature weights that minimize error. Also, what other kind of pre-processing (such as scaling and normalization) does KNN do? Finally, where do I find
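To make the question concrete, here is a minimal sketch of what per-feature weighting in nearest-neighbor regression could look like. It is plain Python, not scikit-learn code; the function name `weighted_knn_predict` and the `feature_weights` argument are hypothetical. The assumption is a weighted Euclidean distance, which is equivalent to rescaling each feature by the square root of its weight before an ordinary KNN search.

```python
import math


def weighted_knn_predict(train_X, train_y, query, feature_weights, k=3):
    """Predict by averaging the targets of the k nearest training points.

    Distance is sqrt(sum_j w_j * (a_j - b_j)^2), so a weight of 0
    removes a feature from the neighbor search entirely.
    """
    def dist(a, b):
        return math.sqrt(sum(w * (ai - bi) ** 2
                             for w, ai, bi in zip(feature_weights, a, b)))

    # Sort training points by weighted distance to the query, keep k.
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: dist(pair[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)
```

With equal weights, a feature on a large scale dominates the distance; this is why scaling or weighting choices change which neighbors are selected.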

Re: [Scikit-learn-general] Multilabel and differences betweeen 0.14 and Master

2014-06-11 Thread Matthieu Brucher
Hi, Did you update Numpy at the same time, by any chance? There is a discussion on the Numpy ML about a similar message on the latest beta. Cheers, Matthieu 2014-06-11 16:46 GMT+01:00 Miguel Fernando Cabrera : > Hi Joel, Arnaud, > > Thanks for the answer. In fact I am splitting the data using a

Re: [Scikit-learn-general] Multilabel and differences betweeen 0.14 and Master

2014-06-11 Thread Miguel Fernando Cabrera
Hi Joel, Arnaud, Thanks for the answer. In fact I am splitting the data using another approach. Yes, I now realize that StratifiedKFold does not make sense here. But the weird thing is that in 0.14 it does not even complain. Best Regards, Miguel

Re: [Scikit-learn-general] Flexible Naive Bayes

2014-06-11 Thread Lars Buitinck
2014-06-11 15:54 GMT+02:00 Gavin Gray : > I need to use Naive Bayes for mixed categorical and numerical data and was > thinking of implementing a flexible Naive Bayes algorithm similar to Weka's > instead of hacking my way around by converting the numerical to categorical > or similar. Is there a go

[Scikit-learn-general] Flexible Naive Bayes

2014-06-11 Thread Gavin Gray
Hi, I need to use Naive Bayes for mixed categorical and numerical data and was thinking of implementing a flexible Naive Bayes algorithm similar to Weka's instead of hacking my way around by converting the numerical to categorical or similar. Is there a good reason I shouldn't do this? Is anyone el

Re: [Scikit-learn-general] Multilabel and differences betweeen 0.14 and Master

2014-06-11 Thread Joel Nothman
This has nothing to do with multilabel representation; Stratified K Fold will not work over multilabel data. But the error message should be clearer. On 11 June 2014 00:53, Miguel Fernando Cabrera wrote: > Hi Everyone, > > This is my first post in the list. I have been using scikit-learn active

Re: [Scikit-learn-general] Ridge regression only working with huge alpha values?

2014-06-11 Thread Michael Eickenberg
Hi Chris, your observation is at least partially due to scaling differences between the losses of the classifiers. Whereas `SGDRegressor` by construction puts an extra 1/n_samples in front of your data fit term, `Ridge` does not. So the penalties used will differ by at least a factor n_samples (se
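The scaling argument can be checked on a toy problem. The sketch below is a one-feature ridge regression without intercept, written in plain Python; it is not scikit-learn's implementation, and the function names are hypothetical. It compares a loss summed over samples with a loss averaged over samples and shows that the two give the same coefficient when the penalties differ by a factor of n_samples.

```python
def ridge_summed(xs, ys, alpha):
    """Minimise sum_i (y_i - w * x_i)^2 + alpha * w^2 (summed data-fit term)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


def ridge_averaged(xs, ys, alpha):
    """Minimise (1/n) * sum_i (y_i - w * x_i)^2 + alpha * w^2 (averaged term)."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative to zero gives w * (sxx / n + alpha) = sxy / n.
    return sxy / (sxx + n * alpha)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
n = len(xs)
alpha = 0.5

# Same coefficient once the summed-loss penalty is multiplied by n.
w_avg = ridge_averaged(xs, ys, alpha)
w_sum = ridge_summed(xs, ys, n * alpha)
```

Using the same numeric alpha in both formulations shrinks the averaged-loss solution more strongly, which is consistent with one estimator appearing to need much larger alpha values than the other.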