Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-03 Thread Nick Pentreath
For ONNX you may be interested in https://github.com/onnx/onnxmltools, which already supports conversion of a few sklearn models to ONNX. However, as far as I am aware, none of the ONNX backends actually support the ONNX-ML extended spec (in open source at least). So you would not be able to

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-03 Thread Sebastian Raschka
The ONNX approach sounds most promising, especially because it will also allow library interoperability, but I wonder if this is for parametric models only and not for nonparametric ones like KNN, tree-based classifiers, etc. All in all, I can definitely see the appeal of having a way to export

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-03 Thread Javier López
On Tue, Oct 2, 2018 at 5:07 PM Gael Varoquaux wrote: > The reason that pickles are brittle and that sharing pickles is a bad practice is that pickle uses an implicitly defined data model, which is defined via the internals of objects. Plus the fact that loading a pickle can execute
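The quoted sentence is cut off, but it appears to be heading toward the well-known point that unpickling can run code chosen by whoever wrote the pickle. A minimal standard-library sketch of that mechanism follows; the `Payload` class is hypothetical and uses the harmless builtin `sorted` as a stand-in for whatever callable a hostile pickle might name.

```python
import pickle


class Payload:
    """Hypothetical class whose pickle stores a function call, not object state."""

    def __reduce__(self):
        # pickle records "call sorted([3, 1, 2])" for this object.
        # A hostile pickle could name any importable callable here instead.
        return (sorted, ([3, 1, 2],))


data = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it executes the recorded call
# and hands back whatever that call returns.
result = pickle.loads(data)
print(type(result).__name__, result)
```

This is why the usual advice is to load pickles only from sources you trust.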

[scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-03 Thread Alex Garel
On 02/10/2018 at 16:46, Andreas Mueller wrote: > Thank you for your feedback Alex! Thanks for answering! > On 10/02/2018 09:28 AM, Alex Garel wrote: >> * chunk processing (a kind of streaming-data handling): when dealing with a lot of data, the ability to fit_partial, then use

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-02 Thread Gael Varoquaux
On Tue, Oct 02, 2018 at 12:20:40PM -0400, Andreas Mueller wrote: > I think having MS, FB, Amazon, IBM, Nvidia, Intel,... maintain our generic persistence code is a decent deal for us if it works out ;) > https://onnx.ai/ I'll take that deal! :) +1 for onnx, absolutely!

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-02 Thread Andreas Mueller
On 10/02/2018 12:01 PM, Gael Varoquaux wrote: So, the problems of pickle are not specific to pickle, but rather intrinsic to any generic persistence code [*]. Writing persistence code that does not fall into these problems is very costly in terms of developer time and makes it harder to add new

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-02 Thread Gael Varoquaux
On Fri, Sep 28, 2018 at 09:45:16PM +0100, Javier López wrote: > This is not the whole truth. Yes, you store the sklearn version in the pickle and raise a warning; I am mostly ok with that, but the pickles are brittle and oftentimes they stop loading when versions of other stuff change. I

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-02 Thread Andreas Mueller
Thank you for your feedback Alex! On 10/02/2018 09:28 AM, Alex Garel wrote: * chunk processing (a kind of streaming-data handling): when dealing with a lot of data, the ability to fit_partial, then use transform on chunks of data is a great help. But it's not well exposed in
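To make the chunk-processing idea concrete, here is a minimal standard-library sketch. `RunningScaler` is a hypothetical class, not part of scikit-learn; it borrows the `partial_fit` / `transform` naming that scikit-learn uses for incremental estimators (the message above writes it as "fit_partial") and uses Welford's one-pass update as an assumed way to accumulate the statistics.

```python
import math


class RunningScaler:
    """Hypothetical standardizer that learns mean and variance chunk by chunk."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def partial_fit(self, chunk):
        # Welford's update: earlier chunks never need to be held in memory.
        for x in chunk:
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return self

    def transform(self, chunk):
        # Population standard deviation; fall back to 1.0 for constant or empty data.
        std = math.sqrt(self.m2 / self.n) if self.n else 1.0
        std = std or 1.0
        return [(x - self.mean) / std for x in chunk]


chunks = [[1.0, 2.0], [3.0, 4.0], [5.0]]

scaler = RunningScaler()
for chunk in chunks:          # first pass: fit on one chunk at a time
    scaler.partial_fit(chunk)

scaled = [scaler.transform(chunk) for chunk in chunks]  # second pass: transform
```

The two-pass shape (fit over all chunks, then transform each chunk) is one possible design; a pipeline of several such steps would need each step fitted before the next one sees transformed chunks.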

[scikit-learn] [ANN] Scikit-learn 0.20.0

2018-10-02 Thread Alex Garel
On 26/09/2018 at 21:59, Joel Nothman wrote: > And for those interested in what's in the pipeline, we are trying to draft a roadmap... https://github.com/scikit-learn/scikit-learn/wiki/Draft-Roadmap-2018 Hello, First of all, thanks for the incredible work on scikit-learn. I found the

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Andreas Mueller
On 09/28/2018 04:45 PM, Javier López wrote: On Fri, Sep 28, 2018 at 8:46 PM Andreas Mueller wrote: Basically what you're saying is that you're fine with versioning the models and having the model break loudly if anything changes. That's not actually

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Javier López
On Fri, Sep 28, 2018 at 8:46 PM Andreas Mueller wrote: > Basically what you're saying is that you're fine with versioning the models and having the model break loudly if anything changes. That's not actually what most people want. They want to be able to make predictions with a given model
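The "version the model and break loudly" approach described in the quote can be sketched in a few lines. Everything here is an assumption made for illustration: the `dump_model` / `load_model` helpers, the `VersionMismatch` exception, and the choice of JSON as the container are hypothetical and are not scikit-learn APIs.

```python
import json

LIBRARY_VERSION = "0.20.0"  # hypothetical version stamp written at save time


class VersionMismatch(RuntimeError):
    """Raised when a stored model was written by a different version."""


def dump_model(params, version=LIBRARY_VERSION):
    # Store the version next to the parameters so the loader can check it.
    return json.dumps({"version": version, "params": params})


def load_model(text, expected=LIBRARY_VERSION):
    doc = json.loads(text)
    if doc["version"] != expected:
        # Fail loudly instead of silently predicting with mismatched code.
        raise VersionMismatch(
            f"model saved with {doc['version']}, running {expected}"
        )
    return doc["params"]


stored = dump_model({"coef": [0.5, -1.25], "intercept": 2.0})
params = load_model(stored)

try:
    load_model(stored, expected="0.21.0")
    broke_loudly = False
except VersionMismatch:
    broke_loudly = True
```

As the reply in the thread points out, a strict check like this protects against silent changes but does not give users what they usually want, which is to keep predicting with an old model.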

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Andreas Mueller
On 09/28/2018 03:20 PM, Javier López wrote: I understand the difficulty of the situation, but an approximate solution to that is saving the predictions from a large enough validation set. If the predictions of the newly created model are "close enough" to the old ones, we deem the
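A rough sketch of the check described above, using only the standard library. The helper name `predictions_match` and the tolerance values are assumptions; the message does not say how "close enough" should be defined.

```python
import math


def predictions_match(old, new, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if every new prediction is close to the saved one."""
    if len(old) != len(new):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(old, new)
    )


# Predictions saved alongside the original model on a validation set.
saved = [0.10, 0.75, 0.33]

# Predictions from a model rebuilt under a newer library version.
rebuilt_ok = [0.10, 0.75 + 1e-9, 0.33]   # tiny numerical drift
rebuilt_bad = [0.10, 0.80, 0.33]         # behaviour actually changed

print(predictions_match(saved, rebuilt_ok))
print(predictions_match(saved, rebuilt_bad))
```

The check is only as good as the validation set: inputs that exercise a changed code path must be present for a difference to show up.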

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Javier López
On Fri, Sep 28, 2018 at 6:41 PM Andreas Mueller wrote: > Javier: The problem is not so much storing the "model" but storing how to make predictions. Different versions could act differently on the same data structure - and the data structure could change. Both happen in scikit-learn.

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Manuel CASTEJÓN LIMAS via scikit-learn
How about a Docker-based approach? Just thinking out loud. Best, Manuel On Fri, 28 Sept 2018 19:43, Andreas Mueller wrote: > On 09/28/2018 01:38 PM, Andreas Mueller wrote: > On 09/28/2018 12:10 PM, Sebastian Raschka wrote: > I think model serialization should be a

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Andreas Mueller
On 09/28/2018 01:38 PM, Andreas Mueller wrote: On 09/28/2018 12:10 PM, Sebastian Raschka wrote: I think model serialization should be a priority. There is also the ONNX specification that is gaining industrial adoption and that already includes open source exporters for several families

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Andreas Mueller
On 09/28/2018 12:10 PM, Sebastian Raschka wrote: I think model serialization should be a priority. There is also the ONNX specification that is gaining industrial adoption and that already includes open source exporters for several families of scikit-learn models:

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Sebastian Raschka
> I think model serialization should be a priority. > There is also the ONNX specification that is gaining industrial adoption and that already includes open source exporters for several families of scikit-learn models: https://github.com/onnx/onnxmltools Didn't know about

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Manuel CASTEJÓN LIMAS via scikit-learn
Huge, huge thank you, developers! Keep up the good work! On Wed, 26 Sept 2018 20:57, Andreas Mueller wrote: > Hey everybody! I'm happy to (finally) announce scikit-learn 0.20.0. This release is dedicated to the memory of Raghav Rajagopalan. You can upgrade now with pip or conda!

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Olivier Grisel
> I think model serialization should be a priority. > There is also the ONNX specification that is gaining industrial adoption and that already includes open source exporters for several families of scikit-learn models: https://github.com/onnx/onnxmltools -- Olivier

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-28 Thread Javier López
On Fri, Sep 28, 2018 at 1:03 AM Sebastian Raschka wrote: > Chris Emmery, Chris Wagner and I toyed around with JSON a while back (https://cmry.github.io/notes/serialize), and it could be feasible I came across your notes a while back, they were really useful! I hacked a variation of it that

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-27 Thread Sebastian Raschka
Congrats everyone, this is awesome!!! I just started teaching an ML course this semester and introduced scikit-learn this week -- it was great timing to demonstrate how well maintained the library is and praise all the efforts that go into it :). > I think model serialization should be a

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-27 Thread Javier López
First of all, congratulations on the release, great work, everyone! I think model serialization should be a priority. Particularly, I think that (whenever practical) there should be a way of serializing estimators (either unfitted or fitted) in a text-readable format, preferably JSON or PMML/PFA
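As an illustration of what a text-readable format for fitted and unfitted estimators could look like, here is a minimal standard-library sketch. `TinyLinearModel` and its `to_json` / `from_json` methods are hypothetical; scikit-learn estimators do not provide such methods, and the JSON layout is an assumption.

```python
import json


class TinyLinearModel:
    """Hypothetical one-feature least-squares model, not a scikit-learn class."""

    def __init__(self, fit_intercept=True):
        self.fit_intercept = fit_intercept  # hyperparameter
        self.coef_ = None                   # learned state, None until fitted
        self.intercept_ = 0.0

    def fit(self, xs, ys):
        n = len(xs)
        mx = sum(xs) / n if self.fit_intercept else 0.0
        my = sum(ys) / n if self.fit_intercept else 0.0
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        self.coef_ = sxy / sxx
        self.intercept_ = my - self.coef_ * mx
        return self

    def predict(self, xs):
        return [self.coef_ * x + self.intercept_ for x in xs]

    def to_json(self):
        # Hyperparameters and learned state are kept apart, so an unfitted
        # estimator serializes the same way with coef_ set to null.
        return json.dumps(
            {
                "class": "TinyLinearModel",
                "params": {"fit_intercept": self.fit_intercept},
                "fitted": {"coef_": self.coef_, "intercept_": self.intercept_},
            },
            sort_keys=True,
        )

    @classmethod
    def from_json(cls, text):
        doc = json.loads(text)
        model = cls(**doc["params"])
        model.coef_ = doc["fitted"]["coef_"]
        model.intercept_ = doc["fitted"]["intercept_"]
        return model


fitted = TinyLinearModel().fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
restored = TinyLinearModel.from_json(fitted.to_json())
unfitted = TinyLinearModel.from_json(TinyLinearModel(fit_intercept=False).to_json())
```

Unlike a pickle, the resulting text can be inspected and diffed, and loading it only reads data; the cost is that every estimator needs its own explicit description of what its state is.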

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-27 Thread Andreas Mueller
I think we should work on the formatting, make sure it's complete, link it to issues /PRs and then make this into a public document on the website and request feedback. Right now it's a bit in a format that is understandable for core-developers but some of the things are not clear to the

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-27 Thread Olivier Grisel
On Wed, 26 Sept 2018 at 23:02, Joel Nothman wrote: > And for those interested in what's in the pipeline, we are trying to draft a roadmap... https://github.com/scikit-learn/scikit-learn/wiki/Draft-Roadmap-2018 > But there are no doubt many features that are absent there too. Indeed,

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-27 Thread Olivier Grisel
Joy ! ___ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-27 Thread Aiden Nguyen
Congrats to the whole team! Aiden Nguyen -- Nguyen Thien Bao, PhD Director and Founder, HBB Tech, Vietnam Co-founder, HBB Solutions, Vietnam Head, R Division, Cardano Labo, Vietnam NeuroInformatics Laboratory (NILab), Fondazione Bruno Kessler (FBK), Trento, Italy Centro Interdipartimentale Mente e Cervello

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-26 Thread Denis-Alexander Engemann
This is wonderful news! Congrats everyone. I can't wait to check out the game-changing column transformer! Denis On Wed 26 Sep 2018 at 23:45, Gael Varoquaux wrote: > Hurray, thanks to everybody; in particular for those who did the hard work of ironing out the last issues and releasing.

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-26 Thread Gael Varoquaux
Hurray, thanks to everybody; in particular for those who did the hard work of ironing out the last issues and releasing. Gaël On Wed, Sep 26, 2018 at 02:55:57PM -0400, Andreas Mueller wrote: > Hey everybody! I'm happy to (finally) announce scikit-learn 0.20.0. This release is dedicated to the

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-26 Thread Joel Nothman
And for those interested in what's in the pipeline, we are trying to draft a roadmap... https://github.com/scikit-learn/scikit-learn/wiki/Draft-Roadmap-2018 But there are no doubt many features that are absent there too.

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-26 Thread Andreas Mueller
On 09/26/2018 04:49 PM, Joel Nothman wrote: Wow. It's finally out!! Thank you to the cast of thousands, but also to some individuals for real dedication and insight! Yet there's so much more still in the pipeline. If we're clever about things, we'll make the next release cycle shorter and

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-26 Thread Joel Nothman
Wow. It's finally out!! Thank you to the cast of thousands, but also to some individuals for real dedication and insight! Yet there's so much more still in the pipeline. If we're clever about things, we'll make the next release cycle shorter and the release more manageable.

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-26 Thread Raga Markely
Congratulations! Thank you very much for everyone's hard work! Raga On Wed, Sep 26, 2018, 2:57 PM Andreas Mueller wrote: > Hey everybody! I'm happy to (finally) announce scikit-learn 0.20.0. This release is dedicated to the memory of Raghav Rajagopalan. You can upgrade now with pip or

Re: [scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-26 Thread bthirion
Congratulations! Bertrand On 26/09/2018 20:55, Andreas Mueller wrote: Hey everybody! I'm happy to (finally) announce scikit-learn 0.20.0. This release is dedicated to the memory of Raghav Rajagopalan. You can upgrade now with pip or conda! There are many important additions and updates, and

[scikit-learn] [ANN] Scikit-learn 0.20.0

2018-09-26 Thread Andreas Mueller
Hey everybody! I'm happy to (finally) announce scikit-learn 0.20.0. This release is dedicated to the memory of Raghav Rajagopalan. You can upgrade now with pip or conda! There are many important additions and updates, and you can find the full release notes here: