Re: [Scikit-learn-general] Weighted and Balanced Random Forests

2013-03-19 Thread Manish Amde
I have a follow-up question regarding the usage of sample_weights for fitting the RandomForestClassifier. Does the predict_proba method take the sample weights (used during fitting) into account as well? I spent some time trying to understand the _tree.pyc and tree.py files in the codebase but stil

Re: [Scikit-learn-general] generation of a "random" confusion matrix

2013-03-19 Thread Dirk Nachbar
What do you need the random matrix for? A baseline? A simple baseline is to predict the most common class for all obs. On 15 Mar 2013 14:38, wrote: > On Fri, Mar 15, 2013 at 10:24 AM, wrote: > >> Having both margin fixed is an unlikely situation, especially for > >> confusion matrices. Your ca
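The majority-class baseline mentioned above can be sketched in a few lines of plain Python. The function name `majority_baseline_confusion` and the labels are hypothetical, chosen only for illustration:

```python
from collections import Counter


def majority_baseline_confusion(y_true):
    """Confusion matrix obtained by predicting the most common class everywhere."""
    labels = sorted(set(y_true))
    majority = Counter(y_true).most_common(1)[0][0]
    # Rows are true classes, columns are predicted classes.
    matrix = [[0] * len(labels) for _ in labels]
    col = labels.index(majority)
    for label in y_true:
        matrix[labels.index(label)][col] += 1
    return labels, majority, matrix


# Made-up labels for illustration only.
y_true = ["a", "a", "a", "b", "b", "c"]
labels, majority, matrix = majority_baseline_confusion(y_true)
```

Every prediction lands in the majority-class column, so the baseline accuracy equals that class's share of the data (3 of 6 here).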

Re: [Scikit-learn-general] domain of appicability - RandomForest, predict_proba function

2013-03-19 Thread Dirk Nachbar
Hi You are correct. What do you mean by domain of applicability? The class with the maximum probability gets the predicted discrete value when you use fit () On 19 Mar 2013 14:44, wrote: > Dear SciKitLearners, > > does anyone have experience in using RandomForest's predict_proba function > as e

[Scikit-learn-general] Problem with "Faces recognition example using eigenfaces and SVMs"

2013-03-19 Thread Patrick Flaherty
I'm experimenting with the examples/tutorials to get a feel for scikit-learn, in particular the "Faces recognition example using eigenfaces and SVMs" example. Windows 7, Python 2.7.3, etc. I installed the latest versions of all the packages and the example was crashing here: > face = np.asarray(imread(file_pat

[Scikit-learn-general] domain of appicability - RandomForest, predict_proba function

2013-03-19 Thread Paul . Czodrowski
Dear SciKitLearners, does anyone have experience in using RandomForest's predict_proba function as an estimate for the domain of applicability? The situation is the following: - data set contains 694 samples, each of which is defined by 94 features - data has 2 classes: class0 and class1 - split i

Re: [Scikit-learn-general] Documentation consistency: Attribute formatting

2013-03-19 Thread Vlad Niculae
>> II. Sometimes if attribute descriptions have multiple lines, a backtick >> is needed at the end of continued lines. I still have no idea why and >> what triggers this. Like I said, sometimes it's needed, sometimes it's >> not. > Backtick? You don't mean backslash? Obviously yes :) > This is no

Re: [Scikit-learn-general] Documentation consistency: Attribute formatting

2013-03-19 Thread Andreas Mueller
On 03/19/2013 02:35 PM, Vlad Niculae wrote: > I. In attributes, unlike in parameters, it's IMPORTANT to have one Are you sure that this is not the case for parameters? > > Good: x : int > Bad: x: int This is done consistently for attributes as well as parameters (at least in every PR I review ;) W

[Scikit-learn-general] Documentation consistency: Attribute formatting

2013-03-19 Thread Vlad Niculae
Hello, I apologize if this has already been discussed. I assume it hasn't, and we should take a decision and write it down. Even if the codebase isn't consistent, we should strive to have at least new PRs following the rules. A while back somebody asked me on IRC what the deal with backticks and

Re: [Scikit-learn-general] Interpolated precision in precision_recall_curve()

2013-03-19 Thread Joel Nothman
I think the below would suffice to interpolate precision (if I've understood correctly). I'm not sure if there's a vectorised way to do it given the existing implementation. if interpolate: for i in range(1, len(precision)): precision[i] = precision[i-1:i+1].max() Equivalently: if in
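The loop quoted above replaces each precision value with the larger of itself and its predecessor, which amounts to a running maximum over the array. A minimal sketch in plain Python, assuming `precision` is ordered the way the loop expects; the helper names and the sample values are hypothetical and not part of scikit-learn:

```python
from itertools import accumulate


def interpolate_precision(precision):
    """Return the running maximum of `precision` (hypothetical helper)."""
    # accumulate(..., max) yields max(precision[0..i]) for every i,
    # which is what the in-place loop computes step by step.
    return list(accumulate(precision, max))


def interpolate_precision_loop(precision):
    """Same result, written like the loop quoted in the message."""
    out = list(precision)
    for i in range(1, len(out)):
        out[i] = max(out[i - 1], out[i])
    return out


# Made-up values for illustration only.
sample = [0.5, 0.4, 0.6, 0.55, 1.0]
interpolated = interpolate_precision(sample)
```

On a NumPy array, `np.maximum.accumulate(precision)` computes the same running maximum without a Python-level loop.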

Re: [Scikit-learn-general] Finding dimentions of faces on an image

2013-03-19 Thread Andreas Mueller
On 03/19/2013 01:55 PM, Fimi wrote: Hi Brian, I will look into this paper in more detail. Thank you for your reply. If I have to use opencv or other wrappers like it that hide SVM behind its interface I will not use it. The purpose of this small project is to learn Support Vector Machines. Su

Re: [Scikit-learn-general] Finding dimentions of faces on an image

2013-03-19 Thread Fimi
Hi Brian, I will look into this paper in more detail. Thank you for your reply. If I have to use OpenCV or other wrappers like it that hide SVM behind its interface I will not use it. The purpose of this small project is to learn Support Vector Machines. Fimi

Re: [Scikit-learn-general] Finding dimentions of faces on an image

2013-03-19 Thread Fimi
Hi Gilles, Thank you very much for your answer. I really did spend a lot more time on this than I had to. This is because the topic is very interesting to me and I really went at it. I found out the hard way that I could not use the whole image; it was bringing up other noisy items into the

Re: [Scikit-learn-general] Interpolated precision in precision_recall_curve()

2013-03-19 Thread Willi Richert
Sure, I could do that. Would need a couple days before I get to it, though... Regarding your statement that interpolated precision only makes sense on ranked results, are you saying that whenever one is trying to build a non-IR classifier then one should always go with ROC curves? I find myself us

Re: [Scikit-learn-general] Finding dimentions of faces on an image

2013-03-19 Thread Andreas Mueller
Hi Fimi. Is there a reason you are not using the Viola-Jones implementation in OpenCV? It should be available in SimpleCV, too, if you want a nice Python interface. Cheers, Andy On 03/19/2013 05:19 AM, Fimi wrote: Hello, I've got non linear multiclass classification for support vector machines to

Re: [Scikit-learn-general] Finding dimentions of faces on an image

2013-03-19 Thread Brian Holt
As Gilles says, the scanning windows approach is pretty common for object (and face) detection. Have you looked at the Viola Jones paper? It's the standard for face detection and now that we have adaboost classifiers you should be able to knock up an example quite quickly. Scikit Image might be qui

Re: [Scikit-learn-general] Finding dimentions of faces on an image

2013-03-19 Thread Gilles Louppe
Hi, Short answer: you can't. Longer answer: If you use as training samples the whole images (with faces somewhere in there), then your model is learning to discriminate between your 2 categories, from the whole images, with **no** information about where the faces are actually located. As such, it
