I found some more issues related to parfor and opened a couple of JIRAs. Someone can assign them to me; I will work on them!
Felix

On 22.11.2016 17:54, dusenberr...@gmail.com wrote:
> Also for some context, we're aiming to use this for remote hyperparameter
> tuning over a large dataset. Specifically, each remote process would train a
> separate model over the full dataset using a mini-batch SGD approach. Has
> the `parfor` construct been used for this purpose before?
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
> > On Nov 22, 2016, at 2:01 PM, Matthias Boehm <mboe...@googlemail.com> wrote:
> >
> > that's a good catch - thanks Felix. It would be great if you could modify
> > rewriteSetExecutionStrategy and rewriteSetFusedDataPartitioningExecution in
> > OptimizerConstrained to handle the respective Spark execution types. Thanks.
> >
> > Regards,
> > Matthias
> >
> >> On 11/22/2016 7:54 PM, fschue...@posteo.de wrote:
> >> The constrained optimizer doesn't seem to know about a REMOTE_SPARK
> >> execution mode and either sets CP or REMOTE_MR. I can open a jira for
> >> that and provide a fix.
> >>
> >> Felix
> >>
> >> On 22.11.2016 02:07, Matthias Boehm wrote:
> >>> yes, this came up several times - initially we only supported opt=NONE
> >>> where users had to specify all other parameters. Meanwhile, there is a
> >>> so-called "constrained optimizer" that does the same as the rule-based
> >>> optimizer but respects any given parameters. Please try something like
> >>> this:
> >>>
> >>> parfor (i in 1:10, opt=CONSTRAINED, par=10, mode=REMOTE_SPARK) {
> >>>   // some code here
> >>> }
> >>>
> >>> Regards,
> >>> Matthias
> >>>
> >>>> On 11/22/2016 12:33 AM, fschue...@posteo.de wrote:
> >>>> While debugging some ParFor code it became clear that the parameters for
> >>>> parfor can be easily overwritten by the optimizer.
> >>>> One example is when I write:
> >>>>
> >>>> ```
> >>>> parfor (i in 1:10, par=10, mode=REMOTE_SPARK) {
> >>>>   // some code here
> >>>> }
> >>>> ```
> >>>>
> >>>> Depending on the data size and cluster resources, the optimizer
> >>>> (OptimizerRuleBased.java, line 844) will recognize that the work can be
> >>>> done locally and overwrite it to local execution. This might be valid
> >>>> and definitely works (in my case) but kind of contradicts what I want
> >>>> SystemML to do.
> >>>> I wonder if we should disable this optimization in case a concrete
> >>>> execution mode is given and go with the mode that is provided.
> >>>>
> >>>> Felix
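To connect the two threads above: the constrained optimizer can be combined with the hyperparameter-tuning use case Mike describes. The following DML is a minimal sketch only, assuming a hypothetical grid of learning rates `lrs` and an elided mini-batch SGD training body; it is not code from the thread.

```
# Sketch (hypothetical): tune the learning rate with one remote
# Spark worker per candidate value; opt=CONSTRAINED keeps the
# user-specified par and mode instead of letting the rule-based
# optimizer rewrite them to local execution.
lrs = matrix("0.001 0.01 0.1 1.0", rows=4, cols=1)
parfor (i in 1:nrow(lrs), opt=CONSTRAINED, par=4, mode=REMOTE_SPARK) {
  lr = as.scalar(lrs[i, 1])
  # each worker would train a separate model over the full
  # dataset using mini-batch SGD with learning rate lr
  # (training body elided)
}
```

Note that, per the earlier messages, REMOTE_SPARK handling in OptimizerConstrained still needs the fixes discussed above for this mode to take effect.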