Also,

Remember that some algorithms exhibit “sweet spots” in the trade-off between 
computation time and accuracy gained.

So you might want to keep measuring “explained variance” as you add 
complexity to your models, and then plot model complexity against explained 
variance.

E.g. with MLPClassifier you’d plot the number of hidden layers against 
explained variance to see where adding hidden layers starts to yield 
diminishing gains.
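
A rough sketch of the complexity-vs-score measurement described above. Everything here is an assumption for illustration: the data is synthetic (make_classification), the layer width of 16 and the candidate layer counts are arbitrary, and because explained variance is a regression metric, cross-validated accuracy (the default score for a classifier) is used in its place.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (assumption); replace with your own X, y.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=10, random_state=0
)

layer_counts = [1, 2, 3, 4]  # model complexity: number of hidden layers
mean_scores = []
for n_layers in layer_counts:
    clf = MLPClassifier(
        hidden_layer_sizes=(16,) * n_layers,  # 16 units per layer (arbitrary)
        max_iter=300,
        random_state=0,
    )
    with warnings.catch_warnings():
        # Small networks may stop before converging; silence that warning here.
        warnings.simplefilter("ignore", ConvergenceWarning)
        scores = cross_val_score(clf, X, y, cv=3)  # accuracy for classifiers
    mean_scores.append(scores.mean())

for n_layers, score in zip(layer_counts, mean_scores):
    print(f"{n_layers} hidden layer(s): mean CV accuracy = {score:.3f}")
```

Plotting layer_counts against mean_scores (for example with matplotlib) should show where the curve flattens, if it does on your data.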

Sent from Mail for Windows 10

From: Matti Viljamaa
Sent: Friday, 25 January 2019 13.43
To: Scikit-learn mailing list
Subject: RE: [scikit-learn] How to determine suitable cluster algo

For determining what one can afford computationally, see e.g.:
https://stackoverflow.com/questions/22443041/predicting-how-long-an-scikit-learn-classification-will-take-to-run
https://www.reddit.com/r/scikit_learn/comments/a746h0/is_there_any_way_to_estimate_how_long_a_given/
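
One simple way to get a feel for this yourself is to time fits on growing subsamples and look at how the runtime grows. The sketch below is only an illustration under assumptions: the data is synthetic (make_blobs), KMeans with 5 clusters is an arbitrary choice of estimator, and the subsample sizes are made up. The trend is a hint for the full-size cost, not a guarantee.

```python
import time

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in data (assumption); make_blobs shuffles the samples,
# so X[:n] behaves like a random subsample.
X, _ = make_blobs(n_samples=20000, centers=5, n_features=10, random_state=0)

timings = []  # (subsample size, seconds to fit)
for n in [1000, 2000, 4000, 8000]:
    start = time.perf_counter()
    KMeans(n_clusters=5, n_init=3, random_state=0).fit(X[:n])
    timings.append((n, time.perf_counter() - start))

for n, seconds in timings:
    print(f"{n:>6} samples: {seconds:.3f} s")
```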


From: lampahome
Sent: Friday, 25 January 2019 3.42
To: Scikit-learn mailing list
Subject: Re: [scikit-learn] How to determine suitable cluster algo

Maybe trial and error is the suitable way?

My concern is that my dataset is very large, so I can't try every number of 
clusters from 1 to N when I have N samples.
That costs too much time for me.

Maybe I should choose the initial number of clusters based on execution time?

And then decide whether the next step is to increase or decrease the number 
of clusters?
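
A possible middle ground, sketched under assumptions: instead of every value from 1 to N, score a coarse grid of cluster counts on a random subsample, then refine around the best one. The synthetic data, the subsample size, the candidate grid, and the use of MiniBatchKMeans with the silhouette score are all illustrative choices, not a recommendation from this thread.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data (assumption); replace with your own X.
X, _ = make_blobs(n_samples=5000, centers=4, n_features=5, random_state=0)

# Work on a random subsample to keep each trial cheap.
rng = np.random.default_rng(0)
subset = X[rng.choice(len(X), size=1000, replace=False)]

candidate_ks = [2, 4, 8, 16]  # coarse grid instead of 1..N
scores = {}
for k in candidate_ks:
    model = MiniBatchKMeans(n_clusters=k, n_init=3, random_state=0)
    labels = model.fit_predict(subset)
    scores[k] = silhouette_score(subset, labels)  # higher is better

best_k = max(scores, key=scores.get)
for k in candidate_ks:
    print(f"k={k:>2}: silhouette = {scores[k]:.3f}")
print("best k on this grid:", best_k)
```

From there one could try a few values around best_k, which answers the increase/decrease question with far fewer fits than a full sweep.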

thx



_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
