In terms of Machine Learning more generally, I want to give special
recognition to Jake VanderPlas, an astronomer who dives deep into
scikit-learn in some multi-hour tutorials shared on YouTube.

Example:
https://youtu.be/L7R4HUQ-eQ0

His excellent keynote at PyCon 2017:
https://youtu.be/ZyjCqQEUa8o

Jake does a super-excellent job of showing off the internal consistency of
the scikit-learn API, where you can basically use the same code while just
swapping in one classifier or regressor for another.
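To make that concrete, here's a minimal sketch of the fit/score pattern
(the dataset and the particular estimators below are placeholders I picked
for illustration, not taken from Jake's talks):

# same few lines of code, whichever estimator you drop in
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (KNeighborsClassifier(), RandomForestClassifier()):
    model.fit(X_train, y_train)   # identical method names across estimators
    print(type(model).__name__, model.score(X_test, y_test))

Swapping in a regressor (and a regression dataset) would look just the
same: fit, then predict or score.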

He also speaks the jargon pretty flawlessly, to my ears at least, in terms
of what's a feature, what's a label, and what's an observation etc., going
into both supervised and unsupervised learning scenarios (scikit-learn
handles both).
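In scikit-learn's data layout (my paraphrase of the convention, not a quote
from Jake): each row of X is an observation, each column a feature, and y
holds the labels when you're doing supervised learning:

from sklearn.datasets import load_iris

# rows of X are observations, columns are features;
# y (supervised case only) holds one label per observation
X, y = load_iris(return_X_y=True)
print(X.shape)   # (150, 4) -- 150 observations, 4 features
print(y.shape)   # (150,)   -- one label per observation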

Bravo Jake.

Allen Downey has great complementary tutorials that go deeper into the
statistical thinking behind these ML models.  Think Bayes is fantastic.

It's tempting to just mindlessly throw models at data looking for a best
fit, and maybe that's all some underpaid cube farmer has time for, but
VanderPlas, along with Downey, wisely counsels against that.

Stats, more than most fields, is a minefield of pitfalls, such as overfitting.  If
your aim is authentic research, then mindless model-slinging will quickly
come up against its own limitations.  That's the message I keep getting
from experts in the field.
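One toy way to see the overfitting pitfall (my own quick example, not taken
from either of their materials): an unconstrained decision tree can score
perfectly on the data it trained on, while cross-validation tells a more
honest story:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

print("training accuracy:", tree.score(X, y))    # memorized: typically 1.0
print("cross-validated:  ", cross_val_score(tree, X, y).mean())  # lower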

Kirby

PS:  thanks to Steve Holden, I got to visit the astronomy world up close,
in the form of the Hubble Space Telescope instrumentation team, eager for
Python knowledge.  These were already programmers, experts with IDL, but
IDL is not the hard currency Python is in the wider job market.  For many
reasons, astronomers can't put all their eggs in one basket.  The Python
ecosystem has been a godsend.
