On Fri, Sep 28, 2018 at 1:03 AM Sebastian Raschka <m...@sebastianraschka.com> wrote:
> Chris Emmery, Chris Wagner and I toyed around with JSON a while back (
> https://cmry.github.io/notes/serialize), and it could be feasible

I came across your notes a while back; they were really useful! I hacked a variation of it that doesn't need to know the model class in advance: https://gist.github.com/jlopezpena/2cdd09c56afda5964990d5cf278bfd31 but it is VERY hackish, and it doesn't work with complex models with nested components. (At work we use a further variation of this that also works on pipelines and some specific nested stuff, like `mlxtend`'s `SequentialFeatureSelector`.)

> but yeah, it will involve some work, especially with testing things
> thoroughly for all kinds of estimators. Maybe this could somehow be
> automated though in a grid-search kind of way with a build matrix for
> estimators and parameters once a general framework has been developed.
>

I considered making this serialization into an external project, but I think this would be much easier if estimators provided a dunder method `__serialize__` (or whatever) that would handle the idiosyncrasies of each particular family. I don't believe there will be a "one-size-fits-all" solution for this problem. This approach would also make it possible to work on it incrementally, with a default implementation that raises `NotImplementedError` for estimators that haven't been addressed yet.

In the long run, I also believe that the "proper" way to do this is to allow dumping entire processes into PFA: http://dmg.org/pfa/docs/motivation/
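
To make the `__serialize__` idea a bit more concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: `BaseEstimator` is a stand-in and not scikit-learn's actual class, `MeanRegressor`, `UnsupportedEstimator` and `to_json` are hypothetical names, and the layout of the returned dict is just one possible choice.

```python
import json


class BaseEstimator:
    """Stand-in base class (hypothetical, NOT scikit-learn's BaseEstimator)."""

    def __serialize__(self):
        # Default behaviour: estimators that have not been addressed yet
        # fail loudly instead of producing a half-correct dump.
        raise NotImplementedError(
            f"{type(self).__name__} does not support serialization yet"
        )


class MeanRegressor(BaseEstimator):
    """Toy estimator that predicts a (shrunken) mean of the training targets."""

    def __init__(self, shrinkage=0.0):
        self.shrinkage = shrinkage

    def fit(self, y):
        self.mean_ = (1.0 - self.shrinkage) * sum(y) / len(y)
        return self

    def __serialize__(self):
        # Each estimator family decides how to expose its own state.
        return {
            "class": type(self).__name__,
            "params": {"shrinkage": self.shrinkage},
            "fitted": {"mean_": self.mean_},
        }


class UnsupportedEstimator(BaseEstimator):
    """Estimator that has not implemented __serialize__ yet."""


def to_json(estimator):
    # Generic entry point: delegates to the estimator's own hook.
    return json.dumps(estimator.__serialize__(), sort_keys=True)
```

Calling `to_json(MeanRegressor().fit([1.0, 2.0, 3.0]))` gives a JSON string with the class name, parameters and fitted attributes, while `to_json(UnsupportedEstimator())` raises `NotImplementedError`, so coverage can grow one estimator family at a time.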
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn