Re: Dynamically change parameter list

Pat Ferrel Mon, 12 Feb 2018 15:36:06 -0800

That would be fine since the model can contain anything. But the real question 
is where you want to use those params. If you need to use them the next time 
you train, you’ll have to persist them to a place read during training. That is 
usually only the metadata store (obviously input events too), which has the 
contents of engine.json. So to get them into the metadata store you may have to 
alter engine.json.


Unless someone else knows how to alter the metadata directly after `pio train`.
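To illustrate the point that the model can contain anything, here is a minimal sketch of a model type that carries the computed data mapping alongside the learned weights. The class name, field names, and the use of plain `Vector[Double]` in place of Spark's `SparseVector` are all assumptions made to keep the example dependency-free; none of them come from the thread or from PredictionIO.

```scala
// Hypothetical model type: names and field types are illustrative only.
final case class LogisticModelWithMapping(
  weights: Vector[Double],                               // learned coefficients
  columns: Vector[String],                               // feature columns seen in training
  dataMapping: Map[String, Map[String, Vector[Double]]]  // mapping computed from training data
)

object ModelBundleSketch {
  /** Encodes one categorical value using the mapping stored in the model. */
  def encode(model: LogisticModelWithMapping, column: String, value: String): Vector[Double] =
    model.dataMapping(column)(value)

  /** Dot product of an encoded feature vector with the model weights. */
  def score(model: LogisticModelWithMapping, features: Vector[Double]): Double =
    features.zip(model.weights).map { case (x, w) => x * w }.sum

  def main(args: Array[String]): Unit = {
    // Stand-in for what the training step would compute.
    val model = LogisticModelWithMapping(
      weights = Vector(0.4, -1.2),
      columns = Vector("color"),
      dataMapping = Map("color" -> Map("red" -> Vector(1.0, 0.0), "blue" -> Vector(0.0, 1.0)))
    )
    // At predict time the mapping is read from the model, not from engine.json.
    println(score(model, encode(model, "color", "red")))
  }
}
```

A mapping stored this way is available wherever the model is loaded, but, as noted above, it is not read back in by the next train.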

One problem is that you will never know what the new params are without putting 
them in a file or logging them. We keep them in a separate place and merge them 
with engine.json explicitly so we can see what is happening. They are 
calculated parameters, not hand made tunings. It seems important to me to keep 
those separate unless you are talking about some type of expected reinforcement 
learning, not really params but an evolving model.
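The explicit merge described above could look roughly like the following sketch, which overlays a separately stored set of calculated parameters onto the `params` of one named algorithm in the parsed engine.json. It assumes json4s (the thread's code already imports `org.json4s`) with the native backend on the classpath; the object name, the helper `mergeAlgoParams`, and the `regParam` value are hypothetical and not our actual code.

```scala
import org.json4s._
import org.json4s.native.JsonMethods._

object MergeCalculatedParams {
  /** Overlays calculated params onto the "params" object of one named algorithm.
    * The entries of "algorithms" are matched by name because json4s `merge`
    * unions arrays rather than merging them position by position. */
  def mergeAlgoParams(engineJson: JValue, algoName: String, calculated: JObject): JValue =
    engineJson.transformField {
      case JField("algorithms", JArray(algos)) =>
        val updated = algos.map {
          case algo: JObject if (algo \ "name") == JString(algoName) =>
            // Fields in `calculated` win over hand-written fields of the same name.
            algo merge JObject("params" -> calculated)
          case other => other
        }
        JField("algorithms", JArray(updated))
    }

  def main(args: Array[String]): Unit = {
    // Hand-written part, as it might appear in engine.json.
    val handWritten = parse(
      """{"id":"default","algorithms":[{"name":"Logistic","params":{"columns":[]}}]}""")
    // Calculated part, kept in a separate place (hypothetical parameter).
    val calculated = JObject("regParam" -> JDouble(0.01))

    // Printing the merged result keeps the change visible.
    println(pretty(render(mergeAlgoParams(handWritten, "Logistic", calculated))))
  }
}
```

Keeping the two inputs separate and printing or logging the merged result makes it easy to tell hand-made tunings from calculated ones.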
 

On Feb 12, 2018, at 2:48 PM, Tihomir Lolić <tihomir.lo...@gmail.com> wrote:

Thank you very much for the answer. I'll try with customizing workflow. There 
is a step where Seq of models is returned. My idea is to return model and model 
parameters in this step. I'll let you know if it works.

Thanks,
Tihomir

On Feb 12, 2018 23:34, "Pat Ferrel" <p...@occamsmachete.com> wrote:
This is an interesting question. As we make more mature, full-featured engines 
they will begin to employ hyperparameter search techniques or reinforcement 
params. This means that there is a new stage in the workflow or a feedback loop 
not already accounted for.

Short answer is no, unless you want to re-write your engine.json after every 
train and probably keep the old one for safety. You must re-train to get the 
new params put into the metastore and therefore available to your engine.

What we do for the Universal Recommender is have a special new workflow phase, 
call it a self-tuning phase, where we search for the right tuning of 
parameters. This is done with code that runs outside of pio and creates 
parameters that go into the engine.json. This can be done periodically to make 
sure the tuning is still optimal.
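As a rough illustration of such a self-tuning phase, the sketch below runs an exhaustive grid search over candidate parameters and picks the best-scoring one. This is not the Universal Recommender's code: the `Tuning` fields, the grid values, and the toy scoring function are all assumptions standing in for a real offline evaluation.

```scala
object SelfTuningSketch {
  // Hypothetical candidate parameter set; field names are illustrative.
  final case class Tuning(regParam: Double, maxIter: Int)

  /** Exhaustive grid search: returns the candidate with the highest score.
    * Requires a non-empty candidate list. */
  def search(candidates: Seq[Tuning], score: Tuning => Double): Tuning =
    candidates.maxBy(score)

  def main(args: Array[String]): Unit = {
    // Cartesian product of the values to try.
    val grid = for {
      reg   <- Seq(0.001, 0.01, 0.1)
      iters <- Seq(50, 100)
    } yield Tuning(reg, iters)

    // Stand-in for a real evaluation, e.g. a metric on held-out events.
    val toyScore: Tuning => Double =
      t => -math.abs(t.regParam - 0.01) - math.abs(t.maxIter - 100) / 1000.0

    val best = search(grid, toyScore)
    // The winning values would then be written into engine.json before the next train.
    println(best)
  }
}
```

Because this runs outside of pio, it can be scheduled independently of training and its output reviewed before it is written into engine.json.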

Not sure whether feedback or hyperparameter search is the best architecture 
for you.


From: Tihomir Lolić <tihomir.lo...@gmail.com>
Reply: user@predictionio.apache.org
Date: February 12, 2018 at 2:02:48 PM
To: user@predictionio.apache.org
Subject:  Dynamically change parameter list 

> Hi,
> 
> I am trying to figure out how to dynamically update the algorithm parameter
> list. After the train is finished, only the model is updated. I need this
> data to be updated because I am creating a data mapping based on the training
> data. Is there a way to update this data after the train is done?
> 
> Here is the code that I am using. The variable that should be updated 
> after the train is marked bold red.
> 
> import io.prediction.controller.{EmptyParams, EngineParams}
> import io.prediction.data.storage.EngineInstance
> import io.prediction.workflow.CreateWorkflow.WorkflowConfig
> import io.prediction.workflow._
> import org.apache.spark.ml.linalg.SparseVector
> import org.joda.time.DateTime
> import org.json4s.JsonAST._
> 
> import scala.collection.mutable
> 
> object TrainApp extends App {
> 
>   val envs = Map("FOO" -> "BAR")
> 
>   val sparkEnv = Map("spark.master" -> "local")
> 
>   val sparkConf = Map("spark.executor.extraClassPath" -> ".")
> 
>   val engineFactoryName = "LogisticRegressionEngine"
> 
>   val workflowConfig = WorkflowConfig(
>     engineId = EngineConfig.engineId,
>     engineVersion = EngineConfig.engineVersion,
>     engineVariant = EngineConfig.engineVariantId,
>     engineFactory = engineFactoryName
>   )
> 
>   val workflowParams = WorkflowParams(
>     verbose = workflowConfig.verbosity,
>     skipSanityCheck = workflowConfig.skipSanityCheck,
>     stopAfterRead = workflowConfig.stopAfterRead,
>     stopAfterPrepare = workflowConfig.stopAfterPrepare,
>     sparkEnv = WorkflowParams().sparkEnv ++ sparkEnv
>   )
> 
>   WorkflowUtils.modifyLogging(workflowConfig.verbose)
> 
>   val dataSourceParams = DataSourceParams(sys.env.get("APP_NAME").get)
>   val preparatorParams = EmptyParams()
> 
>   val algorithmParamsList = Seq("Logistic" -> LogisticParams(
>     columns = Array[String](),
>     dataMapping = Map[String, Map[String, SparseVector]]()))
>   val servingParams = EmptyParams()
> 
>   val engineInstance = EngineInstance(
>     id = "",
>     status = "INIT",
>     startTime = DateTime.now,
>     endTime = DateTime.now,
>     engineId = workflowConfig.engineId,
>     engineVersion = workflowConfig.engineVersion,
>     engineVariant = workflowConfig.engineVariant,
>     engineFactory = workflowConfig.engineFactory,
>     batch = workflowConfig.batch,
>     env = envs,
>     sparkConf = sparkConf,
>     dataSourceParams = JsonExtractor.paramToJson(
>       workflowConfig.jsonExtractor,
>       workflowConfig.engineParamsKey -> dataSourceParams),
>     preparatorParams = JsonExtractor.paramToJson(
>       workflowConfig.jsonExtractor,
>       workflowConfig.engineParamsKey -> preparatorParams),
>     algorithmsParams = JsonExtractor.paramsToJson(
>       workflowConfig.jsonExtractor, algorithmParamsList),
>     servingParams = JsonExtractor.paramToJson(
>       workflowConfig.jsonExtractor,
>       workflowConfig.engineParamsKey -> servingParams)
>   )
> 
>   val (engineLanguage, engineFactory) =
>     WorkflowUtils.getEngine(engineInstance.engineFactory, getClass.getClassLoader)
> 
>   val engine = engineFactory()
> 
>   val engineParams = EngineParams(
>     dataSourceParams = dataSourceParams,
>     preparatorParams = preparatorParams,
>     algorithmParamsList = algorithmParamsList,
>     servingParams = servingParams
>   )
> 
>   val engineInstanceId = CreateServer.engineInstances.insert(engineInstance)
> 
>   CoreWorkflow.runTrain(
>     env = envs,
>     params = workflowParams,
>     engine = engine,
>     engineParams = engineParams,
>     engineInstance = engineInstance.copy(id = engineInstanceId)
>   )
> 
>   CreateServer.actorSystem.shutdown()
> }
> 
> 
> Thank you,
> Tihomir
>
