Hi CD,

Can we take a random sample from the test data and use that for this
process?
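
A rough sketch of this idea, not ML's actual code: while the test results are being produced, keep a fixed-size uniform random sample of the rows (feature values together with the actual and predicted values) instead of persisting all of the test data. The function name `sample_test_rows`, the row layout, and the generated data below are assumptions made only for illustration. On the Spark side, `RDD.takeSample(withReplacement, num, seed)` may serve the same purpose, provided the features are kept alongside the predictions.

```python
import random


def sample_test_rows(rows, k, seed=None):
    """Return a uniform random sample of at most k rows from an iterable.

    Uses reservoir sampling (Algorithm R), so the rows are read once and
    only k of them are held in memory at any time.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # both ends inclusive
            if j < k:
                reservoir[j] = row
    return reservoir


# Hypothetical test results: (feature values, actual, predicted).
test_results = (([x, x * 2.0], x % 2, (x + 1) % 2) for x in range(100000))

# Keep 10,000 rows, the same size used for the other visualizations.
sample = sample_test_rows(test_results, k=10000, seed=42)
```

The sample size stays bounded whatever the train data fraction is, which addresses the concern about the test data being too large to persist.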

--Srinath

On Fri, May 22, 2015 at 12:00 PM, CD Athuraliya <chathur...@wso2.com> wrote:

> Hi all,
>
> To implement $subject in ML we need all feature values of the dataset
> alongside the predicted and actual values for the test data. But Spark only
> returns predicted and actual values as test results. Right now we use 10,000
> randomly sampled data rows for other visualizations, and we cannot use the
> same sample for this visualization since those 10,000 rows do not correspond
> to the test data (the test data is split off from the dataset according to
> the train data fraction at the model building stage).
>
> One option is to persist the test data at the testing stage, but it can be
> too large for some datasets, depending on the train data fraction. I would
> appreciate your comments on this.
>
> Thanks,
> CD
>
> --
> *CD Athuraliya*
> Software Engineer
> WSO2, Inc.
> lean . enterprise . middleware
> Mobile: +94 716288847 <94716288847>
> LinkedIn <http://lk.linkedin.com/in/cdathuraliya> | Twitter
> <https://twitter.com/cdathuraliya> | Blog
> <http://cdathuraliya.tumblr.com/>
>



-- 
============================
Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
Site: http://people.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
