Re: [scikit-learn] Sprint discussion points?

Andreas Mueller Wed, 13 Feb 2019 17:57:05 -0800

Do you have a reference for the logistic regression stability? Is it convergence warnings?

Happy to discuss the other two issues, though I feel they seem easier than most of what's on my list.

I have no idea what's going on with OPTICS tbh, and I'll leave it up to you and the others to decide whether that's something we should discuss. I can try to read up and weigh in but that might not be the most effective way to do it.

Sample props is something I left out because I personally don't feel it's a priority compared to all the other things; my students have basically no way to figure out what features the coefficients in their linear model correspond to, and that seems a bit more important to me.

We can put it on the discussion list again, but I'm not super enthusiastic about it.


How should we prioritize things?


On 2/13/19 8:08 PM, Joel Nothman wrote:

Yes, I was thinking the same. I think there are some other core issues to solve, such as:


* euclidean_distances numerical issues
* commitment to ARM testing and debugging
* logistic regression stability
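
For context on the first item, here is a minimal sketch, assuming the concern is cancellation error when a squared Euclidean distance is computed in single precision from the expanded form ||x||^2 - 2 x.y + ||y||^2 rather than from the differences. The helper names (f32, dot32, sq_dist_expanded, sq_dist_direct) are made up for illustration and are not scikit-learn API; float32 arithmetic is emulated with the standard library's struct module so the snippet has no dependencies.

```python
import struct


def f32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def dot32(a, b):
    """Dot product with every intermediate result rounded to float32."""
    total = 0.0
    for ai, bi in zip(a, b):
        total = f32(total + f32(ai * bi))
    return total


def sq_dist_expanded(x, y):
    """Squared distance via the expanded form ||x||^2 - 2 x.y + ||y||^2."""
    xx, xy, yy = dot32(x, x), dot32(x, y), dot32(y, y)
    return f32(f32(xx - f32(2.0 * xy)) + yy)


def sq_dist_direct(x, y):
    """Squared distance via the element-wise differences."""
    diff = [f32(xi - yi) for xi, yi in zip(x, y)]
    return dot32(diff, diff)


# Two nearby points with large coordinates; the exact squared distance is 2.
x = [f32(10000.0), f32(10000.0)]
y = [f32(10001.0), f32(10001.0)]

expanded = sq_dist_expanded(x, y)
direct = sq_dist_direct(x, y)
print("expanded form:", expanded)
print("direct form:  ", direct)
```

With these inputs the direct form recovers the squared distance of 2.0, while the expanded form loses it entirely, because the large squared norms absorb the small difference before the subtraction happens.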

We should also nut out OPTICS issues or remove it from 0.21. I'm still keen on trying to work out sample props (supporting weighted scoring at least), but perhaps I'm being persuaded this will never be a top-priority requirement, and the solutions add much complexity.

On Thu, 14 Feb 2019 at 07:39, Andreas Mueller <[email protected]> wrote:


    Hey all.

    Should we collect some discussion points for the sprint?

    There's an unusual amount of core-devs present and I think we
    should seize the opportunity.
    Maybe we should create a page in the wiki or add it to the sprint
    page?

    Things that are high on my list of priorities are:

      * slicing pipelines
      * add get_feature_names to pipelines
      * freezing estimator
      * faster multi-metric scoring
      * fit_transform doing something other than fit.transform
      * imbalanced-learn interface / subsampling in pipelines
      * Specifying search spaces and valid hyper parameters
        (https://github.com/scikit-learn/scikit-learn/issues/13031).
      * allowing EstimatorCV-style speed-up in GridSearches
      * storing pandas column names and using them as feature names


    Trying to discuss all of these might be too much, but maybe we can
    figure out a subset and make sure we have SLEPs to discuss?
    Most of these issues are on the roadmap; issue 13031 is related to
    #18 but not directly on the roadmap.

    Thanks,
    Andy
    _______________________________________________
    scikit-learn mailing list
    [email protected] <mailto:[email protected]>
    https://mail.python.org/mailman/listinfo/scikit-learn


