We could use blocking issues on Jira to mark things that need to be resolved before a release.
On Sat, Sep 27, 2014 at 11:53 PM, Chesnay Schepler <chesnay.schep...@fu-berlin.de> wrote:

> I agree with Kostas, and believe that postponing will imo straight up not
> work since people tend to be *very* busy close to a release, even without
> having to port features to several APIs.
>
> I furthermore don't think we will get anywhere by creating one policy to
> rule them all (especially a rigid one), because there are fundamental
> differences between a) the APIs b) scope of a feature; and there is no
> point in setting up a policy when it is very likely that we won't abide by
> it.
>
> With the increasing number of APIs it's quite a tall order expecting a
> version for each of them from a single contributor. Even now that would be
> 3 (Java, Scala, Streaming(?)) with 2 more to come in the somewhat near
> future (Python, SQL (not sure if relevant)). It is a *massive* entry
> barrier, as well as a major time investment on the contributor's part. This
> should also hold for simple features (certainly at the beginning).
>
> If (and only if) Scala is as thin as I am made to believe I would be for a
> hard policy here. I would exclude other APIs from this. The overhead from
> getting to know all APIs and debugging unfamiliar code would eat up way too
> much time, which could easily break our neck. It's not just about syncing
> the APIs, but doing so in an efficient manner. For them I would much
> rather have 2-3 people per API that are somewhat responsible for porting
> these features, preferably in a more concentrated effort (aka batches).
>
>
> On 27.9.2014 21:03, Kostas Tzoumas wrote:
>
>> If we allow out-of-sync APIs (and backends) until the time of a release,
>> aren't we just postponing the syncing problem to the time of the release,
>> which is a pretty bad time to have such a problem?
>>
>>
>> On Fri, Sep 26, 2014 at 8:49 PM, Robert Metzger <rmetz...@apache.org>
>> wrote:
>>
>>> Hi,
>>>
>>> I'm also in favor of having a strict policy regarding the Java and Scala
>>> API.
>>> In my understanding, the new Scala API is a thin layer above the Java one,
>>> so adding new methods should be straightforward (given that there are
>>> plenty of examples as a reference).
>>>
>>> Robert
>>>
>>> On Fri, Sep 26, 2014 at 11:04 AM, Ufuk Celebi <u...@apache.org> wrote:
>>>
>>>> Hey Fabian,
>>>>
>>>> thanks for bringing this up.
>>>>
>>>> I would vote to have a hard policy regarding the Scala and Java API as
>>>> these are our main user facing APIs.
>>>>
>>>> If there was a fundamental problem or language feature, which could not be
>>>> supported/ported in/to the other API, I would be OK if it was only
>>>> available in one. But small additions to the APIs like outer joins, which
>>>> can be in sync should also be in sync.
>>>>
>>>> If someone does not want to add the corresponding feature to the other
>>>> APIs, I would go for a pull request with a request for someone else to port
>>>> the missing part.
>>>>
>>>> I think it is very important for users to be able to assume that all APIs
>>>> have the same "power". Otherwise we might end up in a situation (and I
>>>> think we already had it with the broadcast variables for a time), where
>>>> users have to pick the API, which matches their use case and not their
>>>> preference.
>>>>
>>>> Best,
>>>>
>>>> Ufuk
>>>>
>>>> On 26 Sep 2014, at 10:43, Fabian Hueske <fhue...@apache.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> as you all know, Flink has a layered architecture with multiple
>>>>> alternatives for certain levels.
>>>>> Examples are:
>>>>> - Programming APIs: Java, Scala, (and Python in progress)
>>>>> - Processing Backends: distributed runtime (former Nephele), Java
>>>>> Collections, (and potentially Tez in the future)
>>>>>
>>>>> The challenge with multiple alternatives that serve the same purpose is
>>>>> that these should be in sync.
>>>>> A feature that is added to the Java API should also be added to the Scala
>>>>> API (and other APIs in the future). The same applies to new runtime
>>>>> strategies and operators, such as outer joins.
>>>>>
>>>>> I think we need a policy on how to keep the features of different layer
>>>>> alternatives in sync.
>>>>> With the recent update of the Scala API, a ScalaAPICompletenessTest was
>>>>> added that checks whether the Scala API offers the same methods as the Java
>>>>> API. Adding a feature to the Java API breaks the build and requires
>>>>> either adapting the Scala API as well or excluding the added methods from the
>>>>> APICompletenessTest.
>>>>> While this test is a great tool to make sure that the APIs are synced,
>>>>> this basically requires that APIs are always synced, i.e., a modification
>>>>> of the Java API must go with an equivalent change of the Scala API.
>>>>> If we make this a tight policy and force compatibility at all times,
>>>>> contributors must know about several different technologies (Scala Compiler
>>>>> Macros, Python, the implementation details of multiple runtime backends,
>>>>> ...). This sounds like a huge entrance barrier to me.
>>>>>
>>>>> To make it clear, I am definitely in favor of keeping APIs and backends in
>>>>> sync.
>>>>> However, I propose to enforce this only for releases, i.e., allow
>>>>> out-of-sync APIs on the master branch and fix the APIs for releases.
>>>>> With this additional requirement, we also need to think twice about which
>>>>> features to add, as multiple components of the system will be affected.
>>>>>
>>>>> What do you guys think?
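
For readers unfamiliar with the kind of check discussed in the quoted thread, here is a rough sketch of how an API completeness test with an exclusion list might work. It is written in Python purely for illustration and is not the actual ScalaAPICompletenessTest. The class names, method names, and the `missing_methods` helper are hypothetical stand-ins, and the comparison by method name only is an assumption.

```python
import inspect


def public_methods(cls):
    """Return the names of all public callables available on cls."""
    return {
        name
        for name, _member in inspect.getmembers(cls, callable)
        if not name.startswith("_")
    }


def missing_methods(reference_api, other_api, excluded=frozenset()):
    """Names offered by reference_api but absent from other_api, minus exclusions."""
    gaps = public_methods(reference_api) - public_methods(other_api) - set(excluded)
    return sorted(gaps)


# Hypothetical stand-ins for two API classes that should stay in sync.
class JavaStyleDataSet:
    def map(self, fn): ...
    def join(self, other): ...
    def outer_join(self, other): ...
    def get_type(self): ...


class ScalaStyleDataSet:
    def map(self, fn): ...
    def join(self, other): ...


# get_type is deliberately excluded, mirroring the "exclude the added
# methods from the test" option mentioned in the thread.
gaps = missing_methods(JavaStyleDataSet, ScalaStyleDataSet, excluded={"get_type"})
print(gaps)
```

Under a strict policy, a non-empty result would fail the build on every commit. Under the release-only proposal, the same check could instead be run as a gate before a release, or its output could be used to file the blocking issues suggested at the top of this message.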