Re: [Scikit-learn-general] gridsearchCV - overfitting

2016-05-12 Thread A neuman
y own. GridSearchCV gives me just one set of params; if they are overfitting, can't I use GridSearchCV? I'm just having trouble understanding this. On 12 May 2016 at 13:45, Olivier Grisel wrote: > 2016-05-12 13:02 GMT+02:00 A neuman : > > Thanks for the answer! > > > > but h

Re: [Scikit-learn-general] gridsearchCV - overfitting

2016-05-12 Thread A neuman
Thanks for the answer! But how should I check whether it's overfitted or not? Best,

[Scikit-learn-general] gridsearchCV - overfitting

2016-05-12 Thread A neuman
Hello everyone, I'm having a bit of trouble with the parameters that I got from GridSearchCV. For example: if I use the parameters I got from grid search CV, for example on RF or k-NN, and I test the model on the training set, I get every time an AUC value abo
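A common way to check for the overfitting asked about in this thread is to compare the score on the training data with the cross-validated score and with a score on data the search never saw. The following is only a minimal sketch: the synthetic data, the random forest, and the small max_depth grid are assumptions chosen for illustration, and the import path is the current sklearn.model_selection module rather than the sklearn.grid_search module that was in use when this thread was written.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary problem (assumption: stands in for the poster's data).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical, deliberately small parameter grid.
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid={"max_depth": [2, 5, None]},
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train)

# AUC on the data the model was fit on: usually optimistic.
train_auc = roc_auc_score(y_train, search.predict_proba(X_train)[:, 1])
# Mean AUC over the held-out folds for the best parameters.
cv_auc = search.best_score_
# AUC on data that was never used during the search.
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])

print("train AUC:", train_auc)
print("cross-validated AUC:", cv_auc)
print("held-out test AUC:", test_auc)
```

A training AUC far above the cross-validated and held-out values suggests the model is fitting noise in the training set. The training score on its own says little, because the model has already seen those samples.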

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread A neuman
The custom metric is just calculating the Tanimoto coefficient: a=x.tolist() b=y.tolist() c=np.count_nonzero(x==y) a1=a.count(1.0) b1=b.count(1.0) return float(c)/(a1 + b1 - c) So I'm just counting the 1's in x and the 1's in y; c is the number where the 1's are matc
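Note that x==y in the snippet above is true wherever the two vectors agree, including positions where both are 0, so c would also count matching zeros. The sketch below shows one way to count only the positions where both vectors are 1. It is an illustration under stated assumptions, not the poster's final code: the function names are hypothetical, the return value for two all-zero vectors is an arbitrary convention, and the distance form (1 minus the similarity) is included because a k-NN metric is expected to return a distance, where smaller means closer.

```python
import numpy as np


def tanimoto_similarity(x, y):
    """Tanimoto coefficient of two binary vectors (hypothetical helper)."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    both = np.count_nonzero(x & y)    # positions where both are 1
    either = np.count_nonzero(x | y)  # equals a1 + b1 - c in the post
    if either == 0:
        # Assumption: treat two all-zero vectors as identical.
        return 1.0
    return both / either


def tanimoto_distance(x, y):
    """Distance form, suitable as a callable metric (smaller is closer)."""
    return 1.0 - tanimoto_similarity(x, y)


x = np.array([1.0, 0.0, 1.0, 1.0])
y = np.array([1.0, 1.0, 0.0, 1.0])
similarity = tanimoto_similarity(x, y)  # 2 shared ones, 4 positions with any one
distance = tanimoto_distance(x, y)
```

For these two example vectors the similarity is 2/4 = 0.5, and so is the distance.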

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread A neuman
. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 0. 1. 0. 1. 1. 0. 1. 1. 1. 1. 0.] and so on, but X should also contain 1's and 0's. Best, On 12 January 2016 at 19:04, A neuman wrote: > Hey, > > I have another problem, >

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-12 Thread A neuman
Hey, I have another problem: if I'm using my own metric, there are not only the samples in x and y. I'm using 10-fold CV with a k-NN classifier. My attributes are only 1's and 0's, but if I print them out, I get: KNeighborsClassifier(metric=myFunc) def myFu

Re: [Scikit-learn-general] k-NN user defined distance

2016-01-08 Thread A neuman
Ah, that helped me a lot!!! So I just write my own function that returns a scalar. This function is used in the metric parameter of the kNN function. Thank you!!! On 9 January 2016 at 03:41, Sebastian Raschka wrote: > You could just need "regular" Python function that outputs a sca
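The approach described here, a plain Python function that takes two sample vectors and returns a scalar, passed through the metric parameter, might look like the sketch below. The metric mismatch_fraction and the toy data are hypothetical and only illustrate the mechanism. The choice of algorithm="brute" is likewise an assumption: tree-based neighbor searches may call the metric on points that are not rows of the training data, which is one possible (unconfirmed) explanation for unexpected values showing up inside a custom metric.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def mismatch_fraction(x, y):
    """Hypothetical metric: fraction of positions where x and y differ."""
    return float(np.mean(x != y))


# Toy binary attributes (assumption: stands in for the poster's data).
X = np.array(
    [
        [1.0, 0.0, 1.0, 0.0],
        [1.0, 0.0, 1.0, 1.0],
        [0.0, 1.0, 0.0, 1.0],
        [0.0, 1.0, 0.0, 0.0],
    ]
)
labels = np.array([0, 0, 1, 1])

# The callable is invoked on pairs of 1-D sample vectors and must return a scalar.
knn = KNeighborsClassifier(
    n_neighbors=1, metric=mismatch_fraction, algorithm="brute"
)
knn.fit(X, labels)

queries = np.array([[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]])
pred = knn.predict(queries)
print(pred)
```

A Python callable is evaluated once per pair of samples, so it is much slower than the built-in metrics on large data sets.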

[Scikit-learn-general] k-NN user defined distance

2016-01-08 Thread A neuman
Hello everyone, I actually want to use the KNeighborsClassifier with my own distances. The documentation states the following: [callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights. I just don't know how

Re: [Scikit-learn-general] Contributing ensemble selection

2014-01-19 Thread magellane a
Hi, I was interested in the implementation of a *stacking ensemble meta-estimator* for the scikit-learn project, and as suggested by a previous email on the mailing list I've gone through the source code of scikit-learn in general and the source code of the ensemble methods in particular. I want to

Re: [Scikit-learn-general] Contributing ensemble selection

2013-12-10 Thread magellane a
>> We are still missing a stacking ensemble meta-estimator: >> http://www.machine-learning.martinsewell.com/ensembles/stacking/Wolpert1992.pdf >> (2748 citations) I would be glad to work on thi

Re: [Scikit-learn-general] Contributing ensemble selection

2013-12-08 Thread magellane a
>> Do you believe that it is a major tool that is very useful in general? I'm not sure it's the best option, but the main motive I had behind sending this is my desire to add new features to the ensemble package of scikit-learn. >> Have you had a lot of success using it? I

[Scikit-learn-general] Contributing ensemble selection

2013-12-08 Thread magellane a
.icdm06long.pdf). It's a simple greedy approach for selecting, from a library of models with different parameters, an ensemble that maximizes a given score function. I would like to implement this as part of the ensemble package of scikit-learn. I've already implemented an initial implementati

Re: [Scikit-learn-general] Does LinearSVC support probability/soft outputs out of the box?

2013-08-14 Thread A
--- Get 100% visibility into Java/.NET code with AppDynamics Lite! It's a free troubleshooting tool designed for production. Get down to code-level detail for bottlenecks, with <2% overhead. Download for free and get started troubleshooting in minutes. http://pubads.g.doubleclic

Re: [Scikit-learn-general] Selective multiclass

2013-08-14 Thread A
> That's not even a very big matrix, it's less than 100MB. Does the error occur even with n_jobs=2? Yes.

Re: [Scikit-learn-general] Selective multiclass

2013-08-14 Thread A
sed Sparse Row format>

Re: [Scikit-learn-general] Selective multiclass

2013-08-13 Thread A
least(n_jobs=2,3,4), correct?

Re: [Scikit-learn-general] Selective multiclass

2013-08-13 Thread A
16 cores thought should use n_jobs=-1, e.g. OneVsRestClassifier(SGDClassifier, n_jobs=-1) training completes in about 20 min]

Re: [Scikit-learn-general] Selective multiclass

2013-08-12 Thread A
> Use the predict_proba method, or decision_function, depending on the > model (for SGD, decision_function always works). Btw., if you're not > doing multilabel, then you don't need OneVsRestClassifier. > Thanks, will give it a shot. On another note, n_jobs > 1 f
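The advice quoted above, to use predict_proba or decision_function depending on the model, can be sketched as follows. The synthetic three-class data and the two estimators are assumptions for illustration only. SGDClassifier is used with its default loss, for which decision_function is available, and LogisticRegression stands in for a model that exposes predict_proba.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier

# Synthetic 3-class problem (assumption: stands in for the text data).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, n_classes=3, random_state=0
)

# decision_function returns one signed score per class (not a probability).
sgd = SGDClassifier(random_state=0).fit(X, y)
scores = sgd.decision_function(X[:5])

# predict_proba returns per-class probabilities that sum to 1 in each row.
logreg = LogisticRegression(max_iter=1000).fit(X, y)
proba = logreg.predict_proba(X[:5])

# The column order of both outputs follows the fitted classes_ attribute.
top_class = logreg.classes_[proba.argmax(axis=1)]
top_confidence = proba.max(axis=1)
print(scores.shape, proba.shape, top_class, top_confidence)
```

A per-sample confidence such as top_confidence is the kind of value that could drive the selective post-processing step discussed in this thread, for example by re-predicting only when the confidence falls below a chosen threshold.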

[Scikit-learn-general] Selective multiclass

2013-08-12 Thread A
Hello, for my ML problem I am facing a bit of a dilemma with respect to my solution. Problem: predict a category using a text classifier for a large number of categories. Depending on the category predicted, we need some post-processing [e.g. get document with URL] and try to predict again

Re: [Scikit-learn-general] Paris Sprint location

2013-07-15 Thread Denis A. Engemann
Hi, I'd like to improve ICA as discussed in #2113. I'd also like to look into the memory behavior of the decomposition classes to better support their combined application on big data sets with more than 100k samples. I'd also like to take a look at API inconsistencies between those cla

[Scikit-learn-general] Factorial analysis

2012-01-19 Thread Joris A.
> > I don't have the Bishop, and I must confess that I am still confused by > the Wikipedia. That said, it doesn't really matter. As long as people > feel confident that it is well defined and useful, it belongs to the > scikit, and I am all for it :). > > Gael > > > Thank you all! Let's hope it w

[Scikit-learn-general] Factorial analysis

2012-01-18 Thread Joris A.
Hello All, Sorry if it's stupid but it's not so obvious to me. Is it possible to perform a factorial analysis with sklearn or do I have to use other libraries? Thanks and regards, Joris