Re: Parallel parameter tuning: distributed execution of MLlib algorithms

2015-06-17 Thread Peter Rudenko
Hi, here's how to get a parallel search pipeline:

package org.apache.spark.ml.pipeline

import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.sql._

class ParralelGridSearchPipeline extends Pipeline {
  override def fit(dataset:
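The snippet above is cut off by the archive, so here is only a rough sketch of the same idea, not Peter's actual code: override the multi-ParamMap fit of the Spark 1.4-era Estimator API so each candidate ParamMap is fitted from its own driver thread. The class name and the use of Scala parallel collections are assumptions.

package org.apache.spark.ml.pipeline

import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.sql.DataFrame

// Sketch only: assumes the Spark 1.4-era signature
// Estimator.fit(dataset: DataFrame, paramMaps: Array[ParamMap]): Seq[M].
class ParallelGridSearchPipelineSketch extends Pipeline {
  // Fit one PipelineModel per ParamMap, launching the fits from parallel
  // driver threads; each single fit still delegates to the stock Pipeline.fit.
  override def fit(dataset: DataFrame, paramMaps: Array[ParamMap]): Seq[PipelineModel] = {
    paramMaps.par.map(paramMap => fit(dataset, paramMap)).seq
  }
}

Spark's scheduler accepts jobs submitted from multiple driver threads, so the candidate fits can run concurrently as long as the cluster has capacity for them.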

Re: Parallel parameter tuning: distributed execution of MLlib algorithms

2015-06-17 Thread Xiangrui Meng
On Fri, May 22, 2015 at 6:15 AM, Hugo Ferreira h...@inesctec.pt wrote:
> Hi, I am currently experimenting with linear regression (SGD) (Spark + MLlib, ver. 1.2). At this point in time I need to fine-tune the hyper-parameters. I do this (for now) by an exhaustive grid search of the step size and
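For reference, a minimal sketch of such an exhaustive grid search against the MLlib 1.2-era RDD API; the dataset names `training`/`validation` and the candidate grids are assumptions, not values from the thread:

import org.apache.spark.SparkContext._
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}
import org.apache.spark.rdd.RDD

// Candidate grids for the two hyper-parameters (assumed values).
val stepSizes = Seq(0.001, 0.01, 0.1, 1.0)
val numIterations = Seq(50, 100, 200)
val grid = for (step <- stepSizes; iters <- numIterations) yield (step, iters)

def gridSearch(training: RDD[LabeledPoint],
               validation: RDD[LabeledPoint]): (Double, Int, Double) = {
  // `.par` launches the independent training runs from parallel driver
  // threads, so the cluster can work on several candidates at once.
  val results = grid.par.map { case (step, iters) =>
    val model = LinearRegressionWithSGD.train(training, iters, step)
    // Mean squared error on the held-out set.
    val mse = validation
      .map { p => val err = model.predict(p.features) - p.label; err * err }
      .mean()
    (step, iters, mse)
  }
  results.minBy(_._3) // (stepSize, numIterations, mse) with the smallest MSE
}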