On Fri, Sep 28, 2018 at 1:03 AM Sebastian Raschka <m...@sebastianraschka.com> wrote:

> Chris Emmery, Chris Wagner and I toyed around with JSON a while back (
> https://cmry.github.io/notes/serialize), and it could be feasible

I came across your notes a while back; they were really useful!
I hacked a variation of it that didn't need to know the model class in advance,
but it is VERY hackish, and it doesn't work with complex models with nested
components. (At work we use a further variation of this that also works on
pipelines and some specific nested stuff, like `mlxtend`'s

> but yeah, it will involve some work, especially with testing things
> thoroughly for all kinds of estimators. Maybe this could somehow be
> automated though in a grid-search kind of way with a build matrix for
> estimators and parameters once a general framework has been developed.

I considered making this serialization into an external project, but I
think this would be much easier if estimators provided a dunder method
`__serialize__` (or whatever) that would handle the idiosyncrasies of each
particular family. I don't believe there will be a "one-size-fits-all"
solution for this problem. This approach would also make it possible to
work on it incrementally, raising a default `NotImplementedError` for
estimators that haven't been addressed yet.
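
To make the idea concrete, here is a rough sketch of how such a hook could
look. Everything in it is hypothetical: `BaseEstimator` here is a plain
stand-in class (not scikit-learn's), `ToyLinearModel`, `ToyTree` and
`dump_estimator` are made-up names, and the dictionary layout is just one
possible choice, not a proposed format.

```python
import json


class BaseEstimator:
    """Hypothetical stand-in for a shared estimator base class."""

    def __serialize__(self):
        # Default behaviour: estimators that haven't been addressed yet
        # fail loudly instead of producing a partial dump.
        raise NotImplementedError(
            f"{type(self).__name__} does not support serialization yet"
        )


class ToyLinearModel(BaseEstimator):
    """Hypothetical estimator whose fitted state is a list of coefficients."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.coef_ = None

    def fit(self, coef):
        # Placeholder for real fitting: just store the given coefficients.
        self.coef_ = list(coef)
        return self

    def __serialize__(self):
        # Each family decides which attributes make up its state.
        return {
            "class": type(self).__name__,
            "params": {"alpha": self.alpha},
            "state": {"coef_": self.coef_},
        }


class ToyTree(BaseEstimator):
    """Hypothetical estimator that has not been addressed yet."""


def dump_estimator(estimator):
    # Generic entry point: delegate to the estimator's own hook.
    return json.dumps(estimator.__serialize__(), sort_keys=True)
```

With this layout, `dump_estimator(ToyLinearModel().fit([1.0, 2.0]))` returns a
JSON string, while `dump_estimator(ToyTree())` raises `NotImplementedError`
until someone implements the hook for that family.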

In the long run, I also believe that the "proper" way to do this is to
allow dumping entire processes into PFA: http://dmg.org/pfa/docs/motivation/
scikit-learn mailing list