Re: [DISCUSS] Policy on keeping layer alternatives in sync

Robert Metzger Fri, 26 Sep 2014 11:50:27 -0700

Hi,

I'm also in favor of having a strict policy regarding the Java and Scala
API.
As I understand it, the new Scala API is a thin layer above the Java one,
so adding new methods should be straightforward (given that there are
plenty of examples as a reference).


Robert

On Fri, Sep 26, 2014 at 11:04 AM, Ufuk Celebi <u...@apache.org> wrote:

> Hey Fabian,
>
> thanks for bringing this up.
>
> I would vote to have a hard policy regarding the Scala and Java API as
> these are our main user facing APIs.
>
> If there was a fundamental problem or language feature, which could not be
> supported/ported in/to the other API, I would be OK if it was only
> available in one. But small additions to the APIs, like outer joins, which
> can be kept in sync should also be kept in sync.
>
> If someone does not want to add the corresponding feature to the other
> APIs, I would go for a pull request with a request for someone else to port
> the missing part.
>
> I think it is very important for users to be able to assume that all APIs
> have the same "power". Otherwise we might end up in a situation (and I
> think we already had it with the broadcast variables for a time), where
> users have to pick the API that matches their use case rather than the one
> they prefer.
>
> Best,
>
> Ufuk
>
> On 26 Sep 2014, at 10:43, Fabian Hueske <fhue...@apache.org> wrote:
>
> > Hi,
> >
> > as you all know, Flink has a layered architecture with multiple
> > alternatives for certain levels.
> > Examples are:
> > - Programming APIs: Java, Scala, (and Python in progress)
> > - Processing Backends: distributed runtime (former Nephele), Java
> > Collections, (and potentially Tez in the future)
> >
> > The challenge with multiple alternatives that serve the same purpose is
> > that these should be in sync.
> > A feature that is added to the Java API should also be added to the Scala
> > API (and other APIs in the future). The same applies to new runtime
> > strategies and operators, such as outer joins.
> >
> > I think we need a policy on how to keep the features of different layer
> > alternatives in sync.
> > With the recent update of the Scala API, a ScalaAPICompletenessTest was
> > added that checks whether the Scala API offers the same methods as the
> > Java API. Adding a feature to the Java API breaks the build and requires
> > either adapting the Scala API as well or excluding the added methods from
> > the APICompletenessTest.
> > While this test is a great tool to make sure that the APIs are synced,
> > this basically requires that APIs are always synced, i.e., a modification
> > of the Java API must go with an equivalent change of the Scala API.
> > If we make this a tight policy and force compatibility at all times,
> > contributors must know about several different technologies (Scala
> > Compiler Macros, Python, the implementation details of multiple runtime
> > backends, ...). This sounds like a huge entrance barrier to me.
> >
> > To make it clear, I am definitely in favor of keeping APIs and backends
> > in sync.
> > However, I propose to enforce this only for releases, i.e., allow
> > out-of-sync APIs on the master branch and fix the APIs for releases.
> > With this additional requirement, we also need to think twice about which
> > features to add, as multiple components of the system will be affected.
> >
> > What do you guys think?
>
>
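
For readers unfamiliar with the completeness check discussed in the quoted mail, the idea can be sketched roughly as follows. This is only a minimal illustration and not the actual ScalaAPICompletenessTest: the classes JavaApi and ScalaApi and the helper missingMethods are hypothetical stand-ins, and the comparison is by method name only. It uses reflection to list public methods of a reference class that a second class lacks, minus an explicit exclusion list.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

public class ApiCompletenessSketch {

    // Hypothetical stand-in for the reference (Java) API.
    static class JavaApi {
        public void map() {}
        public void join() {}
        public void outerJoin() {}
    }

    // Hypothetical stand-in for the API that must keep up (Scala).
    static class ScalaApi {
        public void map() {}
        public void join() {}
    }

    // Collects the names of all public, non-synthetic methods declared on cls.
    static Set<String> publicMethodNames(Class<?> cls) {
        Set<String> names = new TreeSet<>();
        for (Method m : cls.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    // Returns method names present on reference but absent from candidate,
    // ignoring any names listed in excluded.
    static Set<String> missingMethods(Class<?> reference, Class<?> candidate,
                                      Set<String> excluded) {
        Set<String> missing = new TreeSet<>(publicMethodNames(reference));
        missing.removeAll(publicMethodNames(candidate));
        missing.removeAll(excluded);
        return missing;
    }

    public static void main(String[] args) {
        // Without exclusions, the method added only to JavaApi is reported.
        System.out.println(missingMethods(
                JavaApi.class, ScalaApi.class, Collections.<String>emptySet()));
        // Excluding it explicitly makes the check pass again.
        System.out.println(missingMethods(
                JavaApi.class, ScalaApi.class, Collections.singleton("outerJoin")));
    }
}
```

A test built this way fails as soon as one API gains a method the other lacks, which is exactly the trade-off debated above: the exclusion set is the escape hatch, and whether it may be non-empty on master or only has to be empty for releases is the policy question.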
