Re: Welcoming some new committers

2018-03-05 Thread Seth Hendrickson
Thanks all! :D

On Mon, Mar 5, 2018 at 9:01 AM, Bryan Cutler wrote:
> Thanks everyone, this is very exciting! I'm looking forward to working
> with you all and helping out more in the future. Also, congrats to the
> other committers as well!!

Re: New to dev community | Contribution to Mlib

2017-09-20 Thread Seth Hendrickson
I'm not exactly clear on what you're proposing, but this sounds like something that would live as a Spark package - a framework for anomaly detection built on Spark. If there is some specific algorithm you have in mind, it would be good to propose it on JIRA and discuss why you think it needs to

Re: MLlib mission and goals

2017-01-31 Thread Seth Hendrickson
I agree with what Sean said about not supporting arbitrarily many algorithms. I think the goal of MLlib should be to support only core algorithms for machine learning. Ideally Spark ML provides a relatively small set of algorithms that are heavily optimized, and also provides a framework that

Re: Feedback on MLlib roadmap process proposal

2017-01-19 Thread Seth Hendrickson
I think the proposal laid out in SPARK-18813 is well done, and I do think it is going to improve the process going forward. I also really like the idea of getting the community to vote on JIRAs to give some of them priority - provided that we listen to those votes, of course. The biggest problem I

Re: Regularized Logistic regression

2016-10-13 Thread Seth Hendrickson
Spark MLlib provides a cross-validation toolkit for selecting hyperparameters. I think you'll find the documentation quite helpful: http://spark.apache.org/docs/latest/ml-tuning.html#example-model-selection-via-cross-validation
There is actually a Python example for logistic regression there. If
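
For readers unfamiliar with the idea, here is a rough plain-Python sketch of what cross-validating a regularization strength for logistic regression involves. It is not Spark's API (the linked documentation covers Spark's own `CrossValidator`); every name below (`train_logreg`, `cross_validate`, `reg_param`, the synthetic data) is hypothetical and chosen only for illustration, and the optimizer is a simple batch gradient descent with an L2 penalty rather than anything MLlib uses.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(rows, reg_param, steps=200, lr=0.5):
    """Fit [bias, w1, ..., wd] by batch gradient descent with an L2 penalty."""
    dim = len(rows[0][0])
    w = [0.0] * (dim + 1)
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * (dim + 1)
        for x, y in rows:
            err = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        # The bias term is left unpenalized.
        w[0] -= lr * grad[0] / n
        for j in range(1, dim + 1):
            w[j] -= lr * (grad[j] / n + reg_param * w[j])
    return w


def accuracy(w, rows):
    # Fraction of rows whose predicted class (threshold 0.5) matches the label.
    hits = 0
    for x, y in rows:
        p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
        hits += int((p >= 0.5) == (y == 1))
    return hits / len(rows)


def cross_validate(rows, reg_grid, num_folds=3, seed=0):
    """Return (best_reg_param, {reg_param: mean held-out accuracy})."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::num_folds] for i in range(num_folds)]
    scores = {}
    for reg in reg_grid:
        fold_scores = []
        for k in range(num_folds):
            # Train on every fold except fold k, then score on fold k.
            train = [r for i, f in enumerate(folds) if i != k for r in f]
            fold_scores.append(accuracy(train_logreg(train, reg), folds[k]))
        scores[reg] = sum(fold_scores) / num_folds
    best = max(reg_grid, key=lambda reg: scores[reg])
    return best, scores


# Synthetic two-feature data (an assumption made purely for this sketch).
rng = random.Random(42)
data = []
for _ in range(120):
    x = [rng.gauss(0, 1), rng.gauss(0, 1)]
    label = 1 if x[0] + 0.5 * x[1] + rng.gauss(0, 0.3) > 0 else 0
    data.append((x, label))

best_reg, cv_scores = cross_validate(data, reg_grid=[0.0, 0.01, 0.1, 1.0])
print("mean held-out accuracy per reg_param:", cv_scores)
print("selected reg_param:", best_reg)
```

The shape is the same whatever the library: define a grid of candidate values, score each candidate on held-out folds, and keep the one with the best average metric. In Spark the grid would be built over the estimator's `regParam` and the fitting would be distributed.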