Hello,
This is my first post to this list, so I hope I won't violate any
(un)written rules.
I recently started working with SparkNLP for a larger project. SparkNLP
in turn is based on Apache Spark's MLlib. One thing I found missing is
the ability to store custom parameters in a Spark pipeline. It seems
only certain pre-declared parameters are allowed (e.g. "stages" for the
Pipeline class).
IMHO, it would be handy to be able to store custom parameters, e.g.
model versions or other metadata, so that they are persisted together
with a trained pipeline. This could also be used to include evaluation
results, such as accuracy, with trained ML models.
(I also asked this on Stack Overflow, but haven't gotten a response yet:
https://stackoverflow.com/questions/69627820/setting-custom-parameters-for-a-spark-mllib-pipeline)
What does the community think about this proposal? Has it perhaps been
discussed before? Any thoughts?
Cheers,
Martin