Also, remember that some algorithms exhibit “sweet spots” with respect to computation time and gained accuracy.
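One way to look for such a sweet spot is to record both the fit time and a held-out score as the model grows. Below is a minimal sketch, not a definitive recipe. The synthetic dataset, the layer width of 16, the cap of four hidden layers and the helper name complexity_curve are all assumptions made for illustration. Note that MLPClassifier.score returns mean accuracy, so that is the score used here.

```python
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def complexity_curve(max_layers=4, width=16, random_state=0):
    """Return (n_layers, fit_seconds, test_accuracy) for 1..max_layers hidden layers.

    Hypothetical helper: the dataset and layer sizes are illustrative only.
    """
    # Small synthetic problem so the sweep finishes quickly.
    X, y = make_classification(n_samples=400, n_features=20,
                               random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state)

    rows = []
    for n_layers in range(1, max_layers + 1):
        clf = MLPClassifier(hidden_layer_sizes=(width,) * n_layers,
                            max_iter=300, random_state=random_state)
        start = time.perf_counter()
        clf.fit(X_train, y_train)
        elapsed = time.perf_counter() - start
        # score() is mean accuracy on the held-out split.
        rows.append((n_layers, elapsed, clf.score(X_test, y_test)))
    return rows


if __name__ == "__main__":
    for n_layers, seconds, acc in complexity_curve():
        print(f"{n_layers} hidden layer(s): "
              f"fit {seconds:.2f}s, test accuracy {acc:.3f}")
```

Plotting the second and third columns against the first shows where extra layers keep costing time without buying much score. On a real dataset you would probably repeat each setting a few times, since single timings and single splits are noisy.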
So keep measuring “explained variance” (or whichever score fits the model, e.g. accuracy for a classifier) as you add complexity to your models, and then plot model complexity against that score. In MLPClassifier, for example, you’d plot the number of hidden layers against the score to find where adding hidden layers starts to yield diminishing gains.

From: Matti Viljamaa
Sent: Friday, 25 January 2019 13.43
To: Scikit-learn mailing list
Subject: RE: [scikit-learn] How to determine suitable cluster algo

To determine what you can afford computationally, see e.g.:
https://stackoverflow.com/questions/22443041/predicting-how-long-an-scikit-learn-classification-will-take-to-run
https://www.reddit.com/r/scikit_learn/comments/a746h0/is_there_any_way_to_estimate_how_long_a_given/

From: lampahome
Sent: Friday, 25 January 2019 3.42
To: Scikit-learn mailing list
Subject: Re: [scikit-learn] How to determine suitable cluster algo

Maybe the suitable way is trial and error?
My concern is that my dataset is huge, so I can't try every number of clusters from 1 to N if I have N samples. That would cost me too much time.
Maybe I should choose the initial number of clusters based on execution time, and then decide in the next step whether to increase or decrease the number of clusters?
Thanks
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn