+1 for residual plots. Though I haven't used them myself, a residual plot is a useful diagnostic tool for regression models. In particular, non-linearity in regression models can be easily identified with one.

The book "An Introduction to Statistical Learning" [1] (pages 92-96) contains some useful information about residual plots.

[1] http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Fourth%20Printing.pdf

On Tue, May 26, 2015 at 8:47 PM, Supun Sethunga <sup...@wso2.com> wrote:

> Hi CD,
>
> As came up in the offline discussion as well, IMHO this plot may not be
> the best option for classification. But for regression we can actually
> use this plot with a slight modification: take the difference between the
> predicted and actual values (rather than the values themselves) and plot
> that against a predictor variable (just as it's done atm). We can also
> add a third variable (a categorical feature) to color the points. This is
> a standard plot (AKA residual plot) which is usually used to evaluate
> regression models.
>
> One other thing we can try is doing the same for classification as well,
> i.e. taking the difference between the actual probability (0 or 1) and
> the predicted probability, plotting that, and seeing whether it gives a
> better overall picture. Not sure how it will come out though :) If it
> comes out right, then any point that lies above 0.5 (or the threshold we
> used) is wrongly classified, and hence we can get a rough idea of the
> values of the x-axis feature for which points get wrongly classified. I
> mean, we should be able to see a pattern, if one exists.
>
> Thanks,
> Supun
>
> On Tue, May 26, 2015 at 6:08 PM, CD Athuraliya <chathur...@wso2.com>
> wrote:
>
>> Hi,
>>
>> Plotting predicted and actual values against a feature doesn't look very
>> intuitive, especially for non-probabilistic models. Please check the
>> attachments. Any thoughts on making this visualization better?
>>
>> Thanks
>>
>> On Fri, May 22, 2015 at 3:27 PM, Srinath Perera <srin...@wso2.com> wrote:
>>
>>> Yes, rerunning on a random sample from the test data is OK.
>>>
>>> --Srinath
>>>
>>> On Fri, May 22, 2015 at 2:28 PM, CD Athuraliya <chathur...@wso2.com>
>>> wrote:
>>>
>>>> Hi Srinath,
>>>>
>>>> Still, that random sample will not correspond to the predicted vs.
>>>> actual values in the test results, given that there is no mapping
>>>> between the random sample data points and the test result points. One
>>>> thing we can do is run the test separately (using the same model) on
>>>> the sampled data for the sole purpose of visualization. Any other
>>>> options?
>>>>
>>>> On Fri, May 22, 2015 at 2:06 PM, Srinath Perera <srin...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi CD,
>>>>>
>>>>> Can we take a random sample from the test data and use that for this
>>>>> process?
>>>>>
>>>>> --Srinath
>>>>>
>>>>> On Fri, May 22, 2015 at 12:00 PM, CD Athuraliya <chathur...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> To implement $subject in ML we need all feature values of the
>>>>>> dataset alongside the predicted and actual values for the test data.
>>>>>> But Spark only returns the predicted and actual values as test
>>>>>> results. Right now we use 10,000 random data rows for other
>>>>>> visualizations, and we cannot use the same data for this
>>>>>> visualization since those 10,000 random rows do not correspond to
>>>>>> the test data (the test data is split off from the dataset according
>>>>>> to the train data fraction at the model building stage).
>>>>>>
>>>>>> One option is to persist the test data at the testing stage, but it
>>>>>> can be too large for some datasets, depending on the train data
>>>>>> fraction. I would appreciate your comments on this.
>>>>>>
>>>>>> Thanks,
>>>>>> CD
>>>>>>
>>>>>> --
>>>>>> *CD Athuraliya*
>>>>>> Software Engineer
>>>>>> WSO2, Inc.
>>>>>> lean . enterprise . middleware
>>>>>> Mobile: +94 716288847 <94716288847>
>>>>>> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
>>>>>> <https://twitter.com/cdathuraliya> | Blog
>>>>>> <http://cdathuraliya.tumblr.com/>
>>>>>>
>>>>>
>>>>> --
>>>>> ============================
>>>>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>>>>> Site: http://people.apache.org/~hemapani/
>>>>> Photos: http://www.flickr.com/photos/hemapani/
>>>>> Phone: 0772360902
>>>>>
>
> --
> *Supun Sethunga*
> Software Engineer
> WSO2, Inc.
> http://wso2.com/
> lean | enterprise | middleware
> Mobile : +94 716546324
>

--
Upul Bandara,
Associate Technical Lead, WSO2, Inc.,
Mob: +94 715 468 345.
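
To make the residual idea from the thread concrete, here is a minimal sketch in plain Python. It is not WSO2 ML or Spark code: the helper names (fit_line, residuals), the toy data, and the sign convention (actual minus predicted) are all assumptions made for illustration. The first half fits a straight line to data that is really quadratic, so the residuals form the U-shaped pattern that reveals non-linearity. The second half applies the same subtraction to 0/1 labels and predicted probabilities; with that sign convention and a 0.5 threshold, a wrongly classified point shows up as a residual whose magnitude is above 0.5 (positive for a missed 1, negative for a missed 0).

```python
# Sketch only: residuals for a regression fit and for a binary classifier.
# All names and data here are hypothetical, not taken from WSO2 ML or Spark.


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(actual, predicted):
    """Residual = actual minus predicted (the sign convention is an assumption)."""
    return [a - p for a, p in zip(actual, predicted)]


# Regression: the true relationship is quadratic, but we fit a straight line.
xs = [float(x) for x in range(-5, 6)]
ys = [x * x for x in xs]
slope, intercept = fit_line(xs, ys)
reg_res = residuals(ys, [slope * x + intercept for x in xs])

# These (x, residual) pairs are what a residual plot would scatter.
# Positive at both ends and negative in the middle: the line missed a curve.
for x, r in zip(xs, reg_res):
    print(f"x={x:5.1f}  residual={r:7.2f}")

# Classification: actual labels are 0 or 1, predictions are probabilities.
labels = [1, 1, 0, 0, 1]
probs = [0.9, 0.3, 0.2, 0.7, 0.6]
clf_res = residuals(labels, probs)

# With a 0.5 threshold, a residual with magnitude above 0.5 means the
# predicted class disagrees with the actual label.
wrong = [abs(r) > 0.5 for r in clf_res]
print(wrong)
```

Plotting the (x, residual) pairs against a chosen feature, optionally colored by a categorical feature, would give the residual plot described in the thread; the plotting step is left out here to keep the sketch free of dependencies.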
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev