Re: Feature (?): Setting custom parameters for a Spark MLlib pipeline

2021-11-11 Thread martin
Yes, that would be a suitable option. We could just extend the standard Spark MLlib Transformer and add the required meta-data. Just out of curiosity: is there a specific reason why the user of a standard Transformer would not be able to add arbitrary key-value pairs for additional
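
A minimal sketch of what such an extension could look like in PySpark, assuming the meta-data is carried as an ordinary Param on a pass-through Transformer (the class name MetadataHolder and the param are illustrative, not an existing API):

    from pyspark import keyword_only
    from pyspark.ml import Transformer
    from pyspark.ml.param import Param, Params
    from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable

    class MetadataHolder(Transformer, DefaultParamsReadable, DefaultParamsWritable):
        """Pass-through stage whose only job is to carry user-defined meta-data."""

        metadata = Param(Params._dummy(), "metadata",
                         "arbitrary key-value meta-data attached to the pipeline")

        @keyword_only
        def __init__(self, metadata=None):
            super().__init__()
            # Store only the keyword arguments that were actually passed
            self._set(**self._input_kwargs)

        def setMetadata(self, value):
            return self._set(metadata=value)

        def getMetadata(self):
            return self.getOrDefault(self.metadata)

        def _transform(self, dataset):
            # Identity transform: the stage does not modify the data at all.
            return dataset

Because declared Params are persisted with the stage, a pipeline containing such a stage would keep the meta-data through the normal write()/load() round trip.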

Re: Feature (?): Setting custom parameters for a Spark MLlib pipeline

2021-10-25 Thread Sean Owen
You can write a custom Transformer or Estimator?

Re: Feature (?): Setting custom parameters for a Spark MLlib pipeline

2021-10-25 Thread Sonal Goyal
Hi Martin, Agree, if you don't need the other features of MLFlow then it is likely overkill. Cheers, Sonal https://github.com/zinggAI/zingg

Re: Feature (?): Setting custom parameters for a Spark MLlib pipeline

2021-10-25 Thread martin
Hi Sonal, Thanks a lot for this suggestion. I presume it might indeed be possible to use MLFlow for this purpose, but at present it seems a bit too much to introduce another framework only for storing arbitrary meta-data with trained ML pipelines. I was hoping there might be a way to do this

Re: Feature (?): Setting custom parameters for a Spark MLlib pipeline

2021-10-24 Thread Sonal Goyal
Does MLFlow help you? https://mlflow.org/ I don't know if MLFlow can save arbitrary key-value pairs and associate them with a model, but versioning, evaluation, etc. are supported. Cheers, Sonal https://github.com/zinggAI/zingg
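
For reference, a rough sketch of how arbitrary key-value pairs could be attached to a model run via MLflow's tracking API (the key names and the pipeline_model variable are made up for illustration):

    import mlflow
    import mlflow.spark

    with mlflow.start_run():
        # Arbitrary key-value pairs recorded alongside the run
        mlflow.log_param("training_data_version", "2021-10-15")
        mlflow.set_tag("annotation_guidelines", "v3")
        # pipeline_model is assumed to be an already fitted PipelineModel
        mlflow.spark.log_model(pipeline_model, "model")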

Feature (?): Setting custom parameters for a Spark MLlib pipeline

2021-10-20 Thread martin
Hello, This is my first post to this list, so I hope I won't violate any (un)written rules. I recently started working with SparkNLP for a larger project. SparkNLP in turn is based on Apache Spark's MLlib. One thing I found missing is the ability to store custom parameters in a Spark pipeline.
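
To make the gap concrete, here is a small illustrative sketch (the stages and the train_df DataFrame are hypothetical): the stages of a standard pipeline only expose their own declared Params, and saving the fitted model persists exactly those, with no obvious slot for extra user meta-data such as a data-set version.

    from pyspark.ml import Pipeline
    from pyspark.ml.feature import Tokenizer, HashingTF
    from pyspark.ml.classification import LogisticRegression

    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    hashing_tf = HashingTF(inputCol="words", outputCol="features")
    lr = LogisticRegression(maxIter=10)
    pipeline = Pipeline(stages=[tokenizer, hashing_tf, lr])

    # model = pipeline.fit(train_df)
    # model.write().overwrite().save("/tmp/my_pipeline")  # saves declared Params only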