Yes, but I think we would pretty much have to do that. I don't think we can stop doing 2.11 releases.
> On 8. Oct 2018, at 15:37, Chesnay Schepler <ches...@apache.org> wrote: > > The infrastructure would only be required if we opt for releasing 2.11 and > 2.12 builds simultaneously, correct? > > On 08.10.2018 15:04, Aljoscha Krettek wrote: >> Breaking the API (or not breaking it but requiring explicit types when using >> Scala 2.12) and the Maven infrastructure to actually build a 2.12 release. >> >>> On 8. Oct 2018, at 13:00, Chesnay Schepler <ches...@apache.org> wrote: >>> >>> And the remaining parts would only be about breaking the API? >>> >>> On 08.10.2018 12:24, Aljoscha Krettek wrote: >>>> I have an open PR that does everything we can do for preparing the code >>>> base for Scala 2.12 without breaking the API: >>>> https://github.com/apache/flink/pull/6784 >>>> >>>>> On 8. Oct 2018, at 09:56, Chesnay Schepler <ches...@apache.org> wrote: >>>>> >>>>> I'd rather not maintain 2 master branches. Beyond the maintenance >>>>> overhead I'm >>>>> wondering about the benefit, as the API break still has to happen at some >>>>> point. >>>>> >>>>> @Aljoscha how much work for supporting scala 2.12 can be merged without >>>>> breaking the API? >>>>> If this is the only blocker I suggest to make the breaking change in 1.8. >>>>> >>>>> On 05.10.2018 10:31, Till Rohrmann wrote: >>>>>> Thanks Aljoscha for starting this discussion. The described problem >>>>>> brings >>>>>> us indeed a bit into a pickle. Even with option 1) I think it is somewhat >>>>>> API breaking because everyone who used lambdas without types needs to add >>>>>> them now. Consequently, I only see two real options out of the ones >>>>>> you've >>>>>> proposed: >>>>>> >>>>>> 1) Disambiguate the API (either by removing >>>>>> reduceGroup(GroupReduceFunction) or by renaming it to reduceGroupJ) >>>>>> 2) Maintain a 2.11 and 2.12 master branch until we phase 2.11 completely >>>>>> out >>>>>> >>>>>> Removing the reduceGroup(GroupReduceFunction) in option 1 is a bit >>>>>> problematic because then all Scala API users who have implemented a >>>>>> GroupReduceFunction need to convert it into a Scala lambda. Moreover, I >>>>>> think it will be problematic with RichGroupReduceFunction which you need >>>>>> to >>>>>> get access to the RuntimeContext. >>>>>> >>>>>> Maintaining two master branches puts a lot of burden onto the developers >>>>>> to >>>>>> always keep the two branches in sync. Ideally I would like to avoid this. >>>>>> >>>>>> I also played a little bit around with implicit conversions to add the >>>>>> lambda methods in Scala 2.11 on demand, but I was not able to get it work >>>>>> smoothly. >>>>>> >>>>>> I'm cross posting this thread to user as well to get some more user >>>>>> feedback. >>>>>> >>>>>> Cheers, >>>>>> Till >>>>>> >>>>>> On Thu, Oct 4, 2018 at 7:36 PM Elias Levy <fearsome.lucid...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> The second alternative, with the addition of methods that take functions >>>>>>> with Scala types, seems the most sensible. I wonder if there is a need >>>>>>> then to maintain the *J Java parameter methods, or whether users could >>>>>>> just >>>>>>> access the functionality by converting the Scala DataStreams to Java via >>>>>>> .javaStream and whatever the equivalent is for DataSets. >>>>>>> >>>>>>> On Thu, Oct 4, 2018 at 8:10 AM Aljoscha Krettek <aljos...@apache.org> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I'm currently working on >>>>>>> https://issues.apache.org/jira/browse/FLINK-7811, >>>>>>>> with the goal of adding support for Scala 2.12. There is a bit of a >>>>>>> hurdle >>>>>>>> and I have to explain some context first. >>>>>>>> >>>>>>>> With Scala 2.12, lambdas are implemented using the lambda mechanism of >>>>>>>> Java 8, i.e. Scala lambdas are now SAMs (Single Abstract Method). This >>>>>>>> means that the following two method definitions can both take a lambda: >>>>>>>> >>>>>>>> def map[R](mapper: MapFunction[T, R]): DataSet[R] >>>>>>>> def map[R](fun: T => R): DataSet[R] >>>>>>>> >>>>>>>> The Scala compiler gives precedence to the lambda version when you call >>>>>>>> map() with a lambda in simple cases, so it works here. You could still >>>>>>> call >>>>>>>> map() with a lambda if the lambda version of the method weren't here >>>>>>>> because they are now considered the same. For Scala 2.11 we need both >>>>>>>> signatures, though, to allow calling with a lambda and with a >>>>>>> MapFunction. >>>>>>>> The problem is with more complicated method signatures, like: >>>>>>>> >>>>>>>> def reduceGroup[R](fun: (scala.Iterator[T], Collector[R]) => Unit): >>>>>>>> DataSet[R] >>>>>>>> >>>>>>>> def reduceGroup[R](reducer: GroupReduceFunction[T, R]): DataSet[R] >>>>>>>> >>>>>>>> (for reference, GroupReduceFunction is a SAM with void >>>>>>>> reduce(java.lang.Iterable<T> values, Collector<O> out)) >>>>>>>> >>>>>>>> These two signatures are not the same but similar enough for the Scala >>>>>>>> 2.12 compiler to "get confused". In Scala 2.11, I could call >>>>>>> reduceGroup() >>>>>>>> with a lambda that doesn't have parameter type definitions and things >>>>>>> would >>>>>>>> be fine. With Scala 2.12 I can't do that because the compiler can't >>>>>>> figure >>>>>>>> out which method to call and requires explicit type definitions on the >>>>>>>> lambda parameters. >>>>>>>> >>>>>>>> I see some solutions for this: >>>>>>>> >>>>>>>> 1. Keep the methods as is, this would force people to always explicitly >>>>>>>> specify parameter types on their lambdas. >>>>>>>> >>>>>>>> 2. Rename the second method to reduceGroupJ() to signal that it takes a >>>>>>>> user function that takes Java-style interfaces (the first parameter is >>>>>>>> java.lang.Iterable while the Scala lambda takes a scala.Iterator). This >>>>>>>> disambiguates the code, users can use lambdas without specifying >>>>>>>> explicit >>>>>>>> parameter types but breaks the API. >>>>>>>> >>>>>>>> One effect of 2. would be that we can add a reduceGroup() method that >>>>>>>> takes a api.scala.GroupReduceFunction that takes proper Scala types, >>>>>>>> thus >>>>>>>> it would allow people to implement user functions without having to >>>>>>>> cast >>>>>>>> the various Iterator/Iterable parameters. >>>>>>>> >>>>>>>> Either way, people would have to adapt their code when moving to Scala >>>>>>>> 2.12 in some way, depending on what style of methods they use. >>>>>>>> >>>>>>>> There is also solution 2.5: >>>>>>>> >>>>>>>> 2.5 Rename the methods only in the Scala 2.12 build of Flink and keep >>>>>>>> the >>>>>>>> old method names for Scala 2.11. This would require some infrastructure >>>>>>> and >>>>>>>> I don't yet know how it can be done in a sane way. >>>>>>>> >>>>>>>> What do you think? I personally would be in favour of 2. but it breaks >>>>>>> the >>>>>>>> existing API. >>>>>>>> >>>>>>>> Best, >>>>>>>> Aljoscha >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >> >