Re: Dataset and select/split functionality

2017-03-03 Thread CPC
Hi Fabian,

Thank you for your explanation. Could you also give an example of how the
optimizer relies on the assumption that the outputs of a function are
replicated?

Thank you...

On 3 March 2017 at 13:52, Fabian Hueske wrote:

> Hi CPC,
>
> We have had several requests in the past to add this feature. However,
> adding select/split for DataSet is much more work than you would expect.
> As you pointed out, we have to go through the optimizer, which assumes that
> the outputs of a function are replicated.
> This is pretty much wired in and you would have to touch a lot of code.
>
> I'm sorry, but I am not comfortable making such a big change.
> IMO, the potential gains are not worth the effort of implementation and
> verification and the risk of breaking something.
>
> Best, Fabian
>
>
>
> 2017-03-02 16:31 GMT+01:00 CPC:
>
> > Hi all,
> >
> > We will try to implement select/split functionality for the batch API.
> > We looked at the streaming side and understand how it works, but that
> > was easier because the streaming side does not include an optimizer.
> > Since adding such a runtime operator will affect the optimizer layer as
> > well, is there a part that you want us to pay particular attention to?
> >
> > Thanks...
> >
>


Re: Dataset and select/split functionality

2017-03-03 Thread Fabian Hueske
Hi CPC,

We have had several requests in the past to add this feature. However,
adding select/split for DataSet is much more work than you would expect.
As you pointed out, we have to go through the optimizer, which assumes that
the outputs of a function are replicated.
This is pretty much wired in and you would have to touch a lot of code.
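
As a rough illustration of what "replicated" means here, the following is a
plain-Java sketch, not Flink code. The class and method names are
hypothetical and chosen only for this example. The assumption it models is
that every downstream consumer of a function receives the function's full
output, so a split has to be emulated with one filter per branch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class ReplicatedOutputSketch {

    // Hypothetical stand-in for a downstream operator: it is handed the
    // complete output of its producer and keeps only the records it wants.
    static List<Integer> consume(List<Integer> fullOutput, Predicate<Integer> keep) {
        List<Integer> result = new ArrayList<>();
        for (Integer record : fullOutput) {
            if (keep.test(record)) {
                result.add(record);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> producerOutput = Arrays.asList(1, 2, 3, 4, 5, 6);

        // Both consumers see all six records (replication); the "split"
        // is expressed as two independent filters over the same output.
        List<Integer> evens = consume(producerOutput, x -> x % 2 == 0);
        List<Integer> odds = consume(producerOutput, x -> x % 2 != 0);

        System.out.println(evens); // [2, 4, 6]
        System.out.println(odds);  // [1, 3, 5]
    }
}
```

A real select/split operator would instead route each record to only the
selected outputs, which is the case the optimizer does not model.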

I'm sorry, but I am not comfortable making such a big change.
IMO, the potential gains are not worth the effort of implementation and
verification and the risk of breaking something.

Best, Fabian



2017-03-02 16:31 GMT+01:00 CPC:

> Hi all,
>
> We will try to implement select/split functionality for the batch API. We
> looked at the streaming side and understand how it works, but that was
> easier because the streaming side does not include an optimizer. Since
> adding such a runtime operator will affect the optimizer layer as well, is
> there a part that you want us to pay particular attention to?
>
> Thanks...
>


Dataset and select/split functionality

2017-03-02 Thread CPC
Hi all,

We will try to implement select/split functionality for the batch API. We
looked at the streaming side and understand how it works, but that was
easier because the streaming side does not include an optimizer. Since
adding such a runtime operator will affect the optimizer layer as well, is
there a part that you want us to pay particular attention to?

Thanks...