Re: [scikit-learn] How to determine suitable cluster algo

Matti Viljamaa Fri, 25 Jan 2019 12:34:19 -0800

Also,

Remember that some algorithms may exhibit “sweet spots” with respect to
computation time and accuracy gained.

So you might want to keep measuring “explained variance” while you add
complexity to your models, and then plot model complexity against explained
variance.

E.g. with MLPClassifier you’d plot the number of hidden layers against
explained variance to figure out where adding hidden layers starts to yield
diminishing gains in explained variance.
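
Since the thread is about clustering, here is a minimal sketch of the same idea with KMeans rather than MLPClassifier. It treats the number of clusters as the complexity knob and takes "explained variance" to mean 1 - inertia / total sum of squares; both choices are assumptions for illustration, as are the synthetic data and the helper name explained_variance_curve.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def explained_variance_curve(X, ks, random_state=0):
    """Return a list of (k, fraction of total variance explained by k clusters).

    The fraction is 1 - inertia / total_ss, where total_ss is the sum of
    squared distances of all samples to the global mean (hypothetical helper).
    """
    total_ss = float(((X - X.mean(axis=0)) ** 2).sum())
    curve = []
    for k in ks:
        km = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        curve.append((k, 1.0 - km.inertia_ / total_ss))
    return curve


# Synthetic stand-in data (assumption): four compact blobs in 2-D.
X, _ = make_blobs(n_samples=600, centers=4, cluster_std=0.6, random_state=0)

curve = explained_variance_curve(X, range(1, 9))

# Print the curve with the gain over the previous k, so the point where
# extra clusters stop paying off is visible without a plot.
previous = None
for k, ev in curve:
    gain = "" if previous is None else f"  (gain {ev - previous:+.3f})"
    print(f"k={k}: explained variance {ev:.3f}{gain}")
    previous = ev
```

The (k, explained variance) pairs can be handed to any plotting library; the "sweet spot" is roughly where the per-step gain flattens out.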

Sent from Mail for Windows 10

From: Matti Viljamaa
Sent: Friday, 25 January 2019 13:43
To: Scikit-learn mailing list
Subject: RE: [scikit-learn] How to determine suitable cluster algo

For determining what one can afford computationally, see e.g.:
https://stackoverflow.com/questions/22443041/predicting-how-long-an-scikit-learn-classification-will-take-to-run
https://www.reddit.com/r/scikit_learn/comments/a746h0/is_there_any_way_to_estimate_how_long_a_given/
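
One rough way to get such an estimate is to time fits on growing subsamples and extrapolate. The sketch below is only that: it assumes fit time follows a power law in the number of samples, which may not hold for a given estimator, and the helper names (time_fit, extrapolate_fit_time), the synthetic data, and the target size are all made up for illustration.

```python
import time

import numpy as np
from sklearn.cluster import KMeans


def time_fit(estimator, X):
    """Return the wall-clock seconds taken by estimator.fit(X)."""
    start = time.perf_counter()
    estimator.fit(X)
    return time.perf_counter() - start


def extrapolate_fit_time(make_estimator, X, sizes, target_n, seed=0):
    """Time fits on random subsamples and extrapolate to target_n samples.

    Fits log(t) = slope * log(n) + intercept, i.e. assumes t ~ n**slope.
    Returns (estimated seconds at target_n, list of measured timings).
    """
    rng = np.random.default_rng(seed)
    timings = []
    for n in sizes:
        idx = rng.choice(len(X), size=n, replace=False)
        timings.append(time_fit(make_estimator(), X[idx]))
    slope, intercept = np.polyfit(np.log(sizes), np.log(timings), 1)
    estimate = float(np.exp(intercept) * target_n ** slope)
    return estimate, timings


# Synthetic stand-in for a small slice of a much larger dataset (assumption).
X = np.random.default_rng(0).normal(size=(4000, 8))

estimate, timings = extrapolate_fit_time(
    lambda: KMeans(n_clusters=8, n_init=1, random_state=0),
    X,
    sizes=[500, 1000, 2000],
    target_n=1_000_000,
)
print("measured seconds:", [round(t, 4) for t in timings])
print(f"extrapolated seconds for 1,000,000 samples: {estimate:.1f}")
```

Timings on small subsamples are noisy, so the extrapolated figure is best read as an order of magnitude; repeating each measurement and using larger subsample sizes should make it more stable.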

From: lampahome
Sent: Friday, 25 January 2019 3:42
To: Scikit-learn mailing list
Subject: Re: [scikit-learn] How to determine suitable cluster algo

Maybe the suitable way is trial and error?

What concerns me is that my dataset is very large, and I can't try every number
of clusters from 1 to N when I have N samples.
That costs too much time for me.

Maybe I should choose the initial number of clusters based on execution time?

Then the next step would be to analyze whether to increase or decrease the
number of clusters?

thx

_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
