On 09/28/2018 12:10 PM, Sebastian Raschka wrote:
I think model serialization should be a priority.
There is also the ONNX specification that is gaining industrial adoption and 
that already includes open source exporters for several families of 
scikit-learn models:

https://github.com/onnx/onnxmltools

Didn't know about that. This is really nice! What do you think about referring 
to it under http://scikit-learn.org/stable/modules/model_persistence.html to 
make people aware that this option exists?
Would be happy to add a PR.


I don't think an open source runtime has been announced yet (or they didn't email me like they promised lol).
I'm quite excited about this as well.

Javier:
The problem is not so much storing the "model" as storing how to make predictions. Different versions could act differently on the same data structure, and the data structure itself could change. Both happen in scikit-learn. So if you want to make sure the right thing happens across versions, you either need to provide serialization and deserialization for every version, plus conversion between them, or you need to provide a way to store the prediction function, which basically means you need a Turing-complete language (that's what ONNX does).
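
As a rough illustration of what "storing the prediction function" can mean, here is a toy sketch in plain Python. It is not the ONNX format and not anything scikit-learn provides: the op names, the JSON layout, and the helpers `export_logistic` and `run` are all made up for this example. The point is only that the file describes the computation itself, so the reader needs a stable set of ops rather than a matching library version.

```python
import json
import math

# Hypothetical op set for this sketch. A real format (such as ONNX) defines
# its own, much larger, versioned operator set.
OPS = {
    "dot": lambda x, w: sum(a * b for a, b in zip(x, w)),
    "add": lambda a, b: a + b,
    "sigmoid": lambda z: 1.0 / (1.0 + math.exp(-z)),
}


def export_logistic(weights, bias):
    """Serialize a logistic-regression prediction as data, not as a Python object."""
    return json.dumps({
        "constants": {"w": list(weights), "b": bias},
        "steps": [
            {"op": "dot", "in": ["x", "w"], "out": "z0"},
            {"op": "add", "in": ["z0", "b"], "out": "z1"},
            {"op": "sigmoid", "in": ["z1"], "out": "p"},
        ],
        "output": "p",
    })


def run(graph_json, x):
    """Evaluate the stored steps in order, looking up each op by name."""
    graph = json.loads(graph_json)
    env = dict(graph["constants"], x=x)
    for step in graph["steps"]:
        args = [env[name] for name in step["in"]]
        env[step["out"]] = OPS[step["op"]](*args)
    return env[graph["output"]]


graph = export_logistic([0.5, -0.25], 0.0)
print(run(graph, [2.0, 4.0]))  # 0.5, since the dot product and bias sum to 0
```

The stored JSON contains no reference to any Python class, which is what makes it independent of how a given library version lays out its estimator objects.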

We basically said doing the first is not feasible within scikit-learn given our current amount of resources, and no one has even tried doing it outside of scikit-learn (which would be possible).
Implementing a complete prediction serialization language (the second option) is definitely outside the scope of sklearn.
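
With neither option available, one stopgap is to record the library version next to the pickled model and refuse to load it under a different version. A minimal sketch using only the standard library; `LIB_VERSION` is a placeholder for something like `sklearn.__version__`, and `dump_with_version` / `load_checked` are hypothetical helper names, not scikit-learn API:

```python
import pickle

# Placeholder value; in practice this would come from the installed library.
LIB_VERSION = "0.20"


def dump_with_version(model, version=LIB_VERSION):
    """Pickle the model together with the version that produced it."""
    return pickle.dumps({"version": version, "model": model})


def load_checked(blob, current_version=LIB_VERSION):
    """Unpickle, but fail loudly if the stored version differs from the current one."""
    payload = pickle.loads(blob)
    if payload["version"] != current_version:
        raise RuntimeError(
            "model was saved with version %s, running %s"
            % (payload["version"], current_version)
        )
    return payload["model"]


# A plain dict stands in for a fitted estimator here.
blob = dump_with_version({"coef": [0.5, -0.25], "intercept": 0.0})
print(load_checked(blob)["coef"])
```

This does not make models portable across versions. It only turns a silent change in behavior into an explicit error.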


_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
