Re: Flink on Tez

Sean Owen Sun, 09 Nov 2014 03:10:54 -0800

This was kind of the substance of the same conversation that happened
about Spark on Tez, which looks like it was rejected
(https://issues.apache.org/jira/browse/SPARK-3561):

- Committing to an SPI interface is hard and imposes its own design
and runtime limitations
- Fragments / forks efforts across two implementations
- If execution engine A is good at one thing and B at another, then
you force a choice that users have difficulty making, when you could
try to improve either engine to do both
- (And in particular, the YARN elasticity thing was in theory being
improved already for Spark anyway, so wasn't as compelling)

Of course you could make that argument against any abstraction, but
the execution engine is so core to the point of these projects that
the logic may be different. It's not the same as an execution engine
abstracting over different data layers.

Here, maybe the SPI interface is already there and committed-to, and
there are real benefits to Tez as an alternative. I do think it will
be helpful to further articulate when you would want to use one over
the other, as even I have the same broad question. If they're the
same, what's the point? if they're not, when are they different -- is
it really an issue of resource elasticity vs maturity of the
integration? makes sense to me.

On Sun, Nov 9, 2014 at 10:50 AM, Flavio Pompermaier
<pomperma...@okkam.it> wrote:
> Thanks Kostas fir the reply.
> I've already read your first post and what is not fully clear to me is the
> technical motivation of. "Tez follows design choices that are geared
> towards resource elasticity, whereas the design choices behind Flink's
> engine are geared more towards low latency querying and iterative
> processing".  Since in the future I could have to choose one of the 2 I'd
> like to better understand the pros and the cons of the 2 runtimes.
>
> Thanks in advance,
> Flavio
> On Nov 9, 2014 11:02 AM, "Kostas Tzoumas" <ktzou...@apache.org> wrote:
>
>> Flavio, see my first post in this thread.
>>
>> If these differences are fine print for your application/requirements, then
>> yes, both backends will do the same thing (distribute computation). In that
>> case, the suggestion would be to use Flink in the normal way, as it is
>> much more mature implementation-wise than Flink-on-Tez.

Re: Flink on Tez

Reply via email to